AWS – Learning the basics of AWS CloudFormation

We’ll use CloudFormation extensively throughout this book, so it’s important that you have an understanding of what it is and how it fits into the AWS ecosystem. There should be enough information here to get you started, but, where necessary, we’ll refer you to the AWS documentation.

What is CloudFormation?

The CloudFormation service allows you to provision and manage a collection of AWS resources in an automated and repeatable fashion. In AWS terminology, these collections are referred to as stacks. Note, however, that a stack can be as large or as small as you like. It might consist of a single S3 bucket, or it might contain everything needed to host your three-tier web app.

In this chapter, we’ll show you how to define the resources to be included in your CloudFormation stack. We’ll talk a bit more about the composition of these stacks and why and when it’s preferable to divvy up resources between a number of stacks. Finally, we’ll share a few of the tips and tricks we’ve learned over the years building countless CloudFormation stacks.

Why is CloudFormation important?

By now, the benefits of automation should be starting to become apparent to you. But don’t fall into the trap of thinking CloudFormation will only be useful for large collections of resources. Even performing the simplest task of, say, creating an S3 bucket, can get very repetitive if you need to do it in every region.

We work with a lot of customers who have very tight controls and governance around their infrastructure, especially in the network layer (think VPCs, NACLs, and security groups). Being able to express their cloud footprint in YAML (or JSON), store it in a source code repository, and funnel it through a high-visibility pipeline gives these customers confidence that their infrastructure changes are peer-reviewed and will work as expected in production. Discipline and commitment to IaC SDLC practices are, of course, a big factor in this, but CloudFormation helps bring us out of the era of following 20-page run-sheets for manual changes, navigating untracked or unexplained configuration drift, and unexpected downtime that’s caused by fat fingers.

Infrastructure as Code (IaC)

AWS CloudFormation is an Infrastructure as Code (IaC) service. IaC has emerged as a critical strategy for companies that are making the transformation to a DevOps culture. DevOps and IaC go hand in hand. The practice of storing your infrastructure as code encourages a sharing of responsibilities that facilitates collaboration.

There are many benefits to IaC, some of which are as follows:

  • Modeling your infrastructure as code gives you a single source of truth to define the resources that are deployed in your account.
  • Once there are no manual steps to create your resources, you can fully automate deployment. You can deploy changes to an existing environment or create a brand new environment from scratch automatically by launching stacks based on your CloudFormation templates.
  • Treating your infrastructure as code allows you to apply all the best practices of modern software development to your templates. Use code editors, distributed version control, code reviews, and easy rollbacks as part of your process.

The layer cake

Now is a good time to start thinking about your AWS deployments in terms of layers. Your layers will sit on top of one another, and you will have well-defined relationships between them.

Here’s a bottom-up example of what your layer cake might look like:

  • VPC with CloudTrail
  • Subnets, routes, and NACLs
  • NAT gateways, VPN or bastion hosts, and associated security groups
  • App stack 1: Security groups and S3 buckets
  • App stack 2: Cross-zone RDS and read replica
  • App stack 3: App and web server autoscaling groups and ELBs
  • App stack 4: CloudFront and WAF config

In this example, you may have many occurrences of the app stack layers inside your VPC, assuming that you have enough IP addresses in your subnets! This is often the case with VPCs living inside development environments. So, immediately, you have the benefit of multi-tenancy capability with application isolation.

One advantage of this approach is that, while you are developing your CloudFormation template, if you mess up the configuration of your app server, you don’t have to wind back all the work CloudFormation did on your behalf. You can just scrap that particular layer (and the layers that depend on it) and restart from there. This is not the case if you have everything contained in a single template.

We commonly work with customers for whom the ownership and management of each layer in the cake reflect the structure of the technology divisions within a company. The traditional infrastructure, network, and cybersecurity folk are often really interested in creating a safe place for digital teams to deploy their apps, so they like to heavily govern the foundational layers of the cake. 

Even if you are a single-person infrastructure coder working in a small team, you will benefit from this approach. For example, you’ll find that it dramatically reduces your exposure to things such as AWS limits, timeouts, and circular dependencies.

CloudFormation templates

This is where we start to get our hands dirty. CloudFormation template files are the codified representations of your stack and are expressed in either YAML or JSON. When you wish to create a CloudFormation stack, you push a template file to CloudFormation through its API, web console, command-line tools, or some other method (such as the SDK).

Templates can be replayed over and over again by CloudFormation, thus creating many instances of your stack.

YAML versus JSON

Up until recently, JSON was your only option. We actually encourage you to adopt YAML, and we’ll be using it for all of the examples that are shown in this book. Some of the reasons for this are as follows:

  • It’s just nicer to look at. It’s less syntax-heavy, and should you choose to go down the path of generating your CloudFormation templates, pretty much every language has a YAML library of some kind.
  • The size of your templates will be much smaller. This is more practical from a developer’s point of view, but it also means that you’re less likely to run into the CloudFormation size limit on template files (50 KB).
  • The string-substitution features are easier to use and interpret.
  • Your EC2 UserData (the script that runs when your EC2 instance boots) will be much easier to implement and maintain.

A closer look at CloudFormation templates

CloudFormation templates consist of a number of parts, but these are the four we’re going to concentrate on:

  • Parameters
  • Resources
  • Outputs
  • Mappings

Here’s a short YAML example:

AWSTemplateFormatVersion: '2010-09-09' 
Parameters:
  EC2KeyName: 
    Type: String 
    Description: EC2 Key Pair to launch with 
Mappings: 
  RegionMap: 
    us-east-1: 
      AMIID: ami-9be6f38c 
    ap-southeast-2: 
      AMIID: ami-28cff44b

We declare a parameter and mappings to start the template. Mappings will be covered in Chapter 10, Advanced AWS CloudFormation. Next, we define Resources:

Resources:
  ExampleEC2Instance:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: t2.nano
      UserData:
        Fn::Base64:
          Fn::Sub: |
            #!/bin/bash -ex
            # Signal the wait condition once the instance has booted
            /opt/aws/bin/cfn-signal '${ExampleWaitHandle}'
      ImageId:
        Fn::FindInMap: [ RegionMap, { Ref: 'AWS::Region' }, AMIID ]
      KeyName:
        Ref: EC2KeyName

Then, in the final section of the template, we define WaitHandle, WaitCondition, and Outputs:

  ExampleWaitHandle:
    Type: AWS::CloudFormation::WaitConditionHandle
  ExampleWaitCondition:
    Type: AWS::CloudFormation::WaitCondition
    DependsOn: ExampleEC2Instance
    Properties:
      Handle:
        Ref: ExampleWaitHandle
      Timeout: 600
Outputs:
  ExampleOutput:
    Value:
      Fn::GetAtt: [ ExampleWaitCondition, Data ]
    Description: The data signaled with the WaitCondition

Outputs give you a way to see things such as auto-generated names, and, in this case, the data from the wait condition.

Parameters

CloudFormation parameters are the input values you define when creating or updating a stack, similar to how you provide parameters to any command-line tools you might use. They allow you to customize your stack without making changes to your template. Common examples of what parameters might be used for are the following:

  • EC2 AMI ID: You may wish to redeploy your stack with a new AMI that has the latest security patches installed.
  • Subnet IDs: You could have a list of subnets that an autoscaling group should deploy servers in. These subnet IDs will be different between your dev, test, and production environments.
  • Endpoint targets and credentials: These include things such as API hostnames, usernames, and passwords.

You’ll find that there are a number of parameter types. In brief, they are as follows:

  • String
  • Number
  • List<Number>
  • CommaDelimitedList

In addition to these, AWS provides some AWS-specific parameter types. These can be particularly handy when you are executing your template via the CloudFormation web console. For example, a parameter of the AWS::EC2::AvailabilityZone::Name type causes the web console to display a dropdown list of valid AZs for this parameter. In the ap-southeast-2 region, the list would look like this:

  • ap-southeast-2a
  • ap-southeast-2b
  • ap-southeast-2c

The list of AWS-specific parameter types is steadily growing and is so long that we can’t list them here. We’ll use many of them throughout this book, however, and they can easily be found in the AWS CloudFormation documentation.
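
As a rough illustration of AWS-specific parameter types, the fragment below declares two of them. The parameter names (ExampleAZ and ExampleSubnets) are made up for this sketch; only the Type values come from CloudFormation itself.

```yaml
Parameters:
  # Hypothetical name; the console renders this as a dropdown of valid AZs
  ExampleAZ:
    Type: AWS::EC2::AvailabilityZone::Name
    Description: Availability Zone to launch the instance in
  # Hypothetical name; a list variant that lets you pick several subnets
  ExampleSubnets:
    Type: List<AWS::EC2::Subnet::Id>
    Description: Subnets the autoscaling group may deploy servers in
```

Because the value is validated against resources that actually exist in the account and region, a typo in a subnet ID is caught before any resources are created.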

When creating or updating a stack, you will need to provide values for all the parameters you’ve defined in your template. Where it makes sense, you can define default values for a parameter. For example, you might have a parameter called debug that tells your application to run in debug mode. Typically, you don’t want this mode enabled by default, so you can set the default value for this parameter to false, disabled, or something else your application understands. Of course, this value can be overridden when you’re creating or updating a stack.

You can – and should – provide a short, meaningful description for each parameter. These are displayed in the web console, next to each parameter field. When used properly, they provide hints and context to whoever is trying to run your CloudFormation template.
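
The debug parameter described above might be declared along these lines. This is only a sketch: the name Debug and the choice of the strings true and false are assumptions, and AllowedValues is added here to show how input can be restricted.

```yaml
Parameters:
  Debug:
    Type: String
    # Used when the caller does not supply a value
    Default: 'false'
    # Anything other than these two strings is rejected up front
    AllowedValues:
      - 'true'
      - 'false'
    # Shown next to the field in the web console
    Description: Run the application in debug mode (true or false)
```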

At this point, we need to introduce the built-in  Ref  function. When you need to reference a parameter value, you use this function to do so:

KeyName:
  Ref: EC2KeyName

While  Ref  isn’t the only built-in function you’ll need to know about, it’s almost certainly going to be the one you’ll use the most. We’ll talk more about built-in functions later in this chapter.

Resources

Resources are your actual pieces of AWS infrastructure. These are your EC2 instances, S3 buckets, ELBs, and so on. Almost any resource type you can create by pointing and clicking on the AWS web console can also be created using CloudFormation.

It’s not practical to list all the AWS resource types in this chapter. However, you will get familiar with the most common types as you work your way through the recipes in this book.

AWS has a definitive list of resource types here: http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-template-resource-type-ref.html.

There are a few important things to keep in mind about CloudFormation resources.

New or bleeding-edge AWS resources are often not immediately supported. CloudFormation support typically lags a few weeks (sometimes months) behind the release of new AWS features. This used to be quite frustrating for anyone who found that infrastructure automation was key. Fast-forward to today, and this situation is somewhat mitigated by the ability to use custom resources. These are discussed later on in this chapter.

Resources have a default return value. You can use Ref to fetch these return values for use elsewhere in your template. For example, the AWS::EC2::VPC resource type has a default return value, which is the ID of the VPC. It looks something like this: 

vpc-11aa111a
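
To make the default return value concrete, here is a minimal sketch in which a subnet picks up the VPC ID through Ref. The logical names and CIDR ranges are illustrative assumptions.

```yaml
Resources:
  ExampleVPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16
  ExampleSubnet:
    Type: AWS::EC2::Subnet
    Properties:
      # Ref on a VPC resolves to its ID, for example vpc-11aa111a
      VpcId:
        Ref: ExampleVPC
      CidrBlock: 10.0.0.0/24
```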

Resources often contain additional return values. These additional values are fetched using the built-in Fn::GetAtt function. Continuing from the previous example, the AWS::EC2::VPC resource type also returns the following:

  • CidrBlock
  • DefaultNetworkAcl
  • DefaultSecurityGroup
  • Ipv6CidrBlocks

Outputs

Just like AWS resources, CloudFormation stacks can also have return values, called outputs. These values are entirely user-defined. If you don’t specify any outputs, then nothing is returned when your stack is completed.

Outputs can come in handy when you are using a CI/CD tool to create your CloudFormation stacks. For example, you might like to output the public hostname of an ELB so that your CI/CD tool can turn it into a clickable link within the job output.

You’ll also use them when you are linking pieces of your layer cake together. You may want to reference an S3 bucket or security group that was created in another stack. This is much easier to do with the new cross-stack references feature, which we’ll discuss later in this chapter. You can expect to see the Ref and Fn::GetAtt functions a lot in the Outputs section of any CloudFormation template.
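
An Outputs section that uses both functions could look something like the following. The output names are arbitrary choices for this sketch; CidrBlock is one of the additional VPC return values listed earlier.

```yaml
Resources:
  ExampleVPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16
Outputs:
  VpcId:
    Description: ID of the VPC (its default return value)
    Value:
      Ref: ExampleVPC
  VpcCidr:
    Description: CIDR block of the VPC (an additional return value)
    Value:
      Fn::GetAtt: [ ExampleVPC, CidrBlock ]
```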

Dependencies and ordering

When executing your template, CloudFormation will automatically work out which resources depend on each other and order their creation accordingly. Additionally, resource creation is parallelized as much as possible so that your stack execution finishes in the timeliest manner possible.

Let’s look at an example where an app server depends on a DB server. To connect to the database, the app server needs to know its IP address or hostname, so the DB server has to be created first. If you pass that address to the app server with Ref or Fn::GetAtt, CloudFormation infers the dependency and orders the two resources for you. If the coupling isn’t expressed in the template (for example, the app server looks up the database by a well-known hostname at boot), CloudFormation has no way of knowing about it, so it will go ahead and create the two resources in any order it pleases (or in parallel, if possible).

To fix this situation, we use the  DependsOn  attribute to tell CloudFormation that our app server depends on our DB server. In fact,  DependsOn  can actually take a list of strings if a resource happens to depend on multiple resources before it can be created. So, if our app server were to also depend on, say, a Memcached server, then we would use  DependsOn  to declare both dependencies.
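
A minimal sketch of the list form of DependsOn follows. The three logical names are hypothetical, and plain EC2 instances stand in for the real servers; the AMI ID is the one from the earlier mapping example.

```yaml
Resources:
  DBServer:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: ami-9be6f38c
      InstanceType: t2.nano
  CacheServer:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: ami-9be6f38c
      InstanceType: t2.nano
  AppServer:
    Type: AWS::EC2::Instance
    # Created only after both of the resources listed here exist
    DependsOn:
      - DBServer
      - CacheServer
    Properties:
      ImageId: ami-9be6f38c
      InstanceType: t2.nano
```

DBServer and CacheServer have no dependency on each other, so CloudFormation is still free to create those two in parallel.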

If necessary, you can take this further. Let’s say that, after your DB server boots, it will automatically start the database, set up a schema, and import a large amount of data. It may be necessary to wait for this process to complete before we create an app server that attempts to connect to a DB expecting a complete schema and dataset. In this scenario, we want a way to signal to CloudFormation that the DB server has completed its initialization so that it can go ahead and create resources that depend on it. This is where  WaitCondition  and  WaitConditionHandle  come in.

First, you create an  AWS::CloudFormation::WaitConditionHandle  type, which you can later reference via  Ref .

Next, you create an  AWS::CloudFormation::WaitCondition  type. In our case, we want the waiting period to start as soon as the DB server is created, so we specify that this  WaitCondition  resource  DependsOn  our DB server.

After the DB server has finished importing data and is ready to accept connections, it calls the callback URL provided by the WaitConditionHandle resource to signal to CloudFormation that it can stop waiting and start executing the rest of the CloudFormation stack. The URL is supplied to the DB server via UserData, again using Ref. Typically, curl, wget, or some equivalent is used to call the URL.

A WaitCondition resource can have a Timeout period too. This is a value that’s specified in seconds. In our example, we might supply a value of 900 because we know that it should never take more than 15 minutes to boot our DB and import the data.

Here’s an example of what DependsOn, WaitConditionHandle, and WaitCondition look like when combined:

ExampleWaitHandle:
  Type: AWS::CloudFormation::WaitConditionHandle
ExampleWaitCondition:
  Type: AWS::CloudFormation::WaitCondition
  DependsOn: ExampleEC2Instance
  Properties:
    Handle:
      Ref: ExampleWaitHandle
    Timeout: 600
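
For the DB server scenario, the signal can also be sent with curl instead of cfn-signal. The sketch below assumes hypothetical logical names (DBServer, DBReady, AppServer) and placeholder values in the JSON body; the body's four fields are the ones a wait condition handle expects.

```yaml
Resources:
  ExampleWaitHandle:
    Type: AWS::CloudFormation::WaitConditionHandle
  DBServer:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: ami-9be6f38c
      InstanceType: t2.nano
      UserData:
        Fn::Base64:
          Fn::Sub: |
            #!/bin/bash -ex
            # ... start the database and import the data here ...
            # Then PUT a success message to the pre-signed handle URL.
            # The empty Content-Type header keeps the URL signature valid.
            curl -X PUT -H 'Content-Type:' \
              --data-binary '{"Status": "SUCCESS", "Reason": "DB ready", "UniqueId": "db-1", "Data": "Import complete"}' \
              '${ExampleWaitHandle}'
  DBReady:
    Type: AWS::CloudFormation::WaitCondition
    # Start waiting as soon as the DB server has been created
    DependsOn: DBServer
    Properties:
      Handle:
        Ref: ExampleWaitHandle
      Timeout: 900
  AppServer:
    Type: AWS::EC2::Instance
    # Held back until the DB server has signaled success
    DependsOn: DBReady
    Properties:
      ImageId: ami-9be6f38c
      InstanceType: t2.nano
```

If no signal arrives within the 900 seconds, the wait condition fails and so does the stack operation.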

Functions

CloudFormation provides some built-in functions to make composing your templates a lot easier. We’ve already looked at  Ref  and  Fn::GetAtt . Let’s look at some others you are likely to encounter.

Fn::Join

Use  Fn::Join  to concatenate a list of strings using a specified delimiter, for example:

"Fn::Join": [ ".", [ 1, 2, 3, 4 ] ]

This would yield the following value:

"1.2.3.4"

Fn::Sub

Use  Fn::Sub  to perform string substitution. Consider the following code:

DSN:
  Fn::Sub:
    - mysql://${db_user}:${db_pass}@${db_host}:3306/wordpress
    - { db_user: lchan, db_pass: ch33s3, db_host: localhost }

This would yield the following value:

mysql://lchan:ch33s3@localhost:3306/wordpress

When you combine these functions with  Ref  and  Fn::GetAtt , you can start doing some really powerful stuff, as we’ll see in the recipes throughout this book.
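
As one possible illustration of combining these functions, the outputs below build strings from a bucket's attributes. The logical and output names are assumptions; DomainName and Arn are attributes of AWS::S3::Bucket, and inside Fn::Sub the ${Resource.Attribute} form behaves like Fn::GetAtt.

```yaml
Resources:
  ExampleBucket:
    Type: AWS::S3::Bucket
Outputs:
  IndexUrl:
    Description: URL of a hypothetical index.html object in the bucket
    Value:
      # ${ExampleBucket.DomainName} is resolved like Fn::GetAtt
      Fn::Sub: 'https://${ExampleBucket.DomainName}/index.html'
  AllObjectsArn:
    Description: ARN pattern matching every object in the bucket
    Value:
      # Join the bucket ARN and a literal suffix with an empty delimiter
      Fn::Join:
        - ''
        - - Fn::GetAtt: [ ExampleBucket, Arn ]
          - '/*'
```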

Other available built-in functions include the following:

  • Fn::Base64
  • Fn::FindInMap
  • Fn::GetAZs
  • Fn::ImportValue
  • Fn::Select

Documentation on all of these functions is available at http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/intrinsic-function-reference.html.

Conditionals

It’s reasonably common to provide a similar but distinct set of resources based on which environment your stack is running in. In your development environment, for example, you may not wish to create an entire fleet of database servers, instead opting for just a single database server. You can achieve this by using conditionals, such as the following ones:

  • Fn::And
  • Fn::Equals
  • Fn::If
  • Fn::Not
  • Fn::Or
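
A small sketch of how these condition functions fit together: a condition is declared once in a Conditions section and then used to skip a resource or to choose a property value. The parameter name, instance sizes, and logical names here are assumptions made for the example.

```yaml
Parameters:
  Environment:
    Type: String
    Default: dev
    AllowedValues:
      - dev
      - prod
Conditions:
  # True only when the stack is launched with Environment=prod
  IsProd:
    Fn::Equals:
      - Ref: Environment
      - prod
Resources:
  DBServer:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: ami-9be6f38c
      # Pick a larger instance in production, a small one elsewhere
      InstanceType:
        Fn::If: [ IsProd, t2.medium, t2.nano ]
  SecondDBServer:
    Type: AWS::EC2::Instance
    # This resource is created only when the condition is true
    Condition: IsProd
    Properties:
      ImageId: ami-9be6f38c
      InstanceType: t2.medium
```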

Permissions and service roles

One important thing to remember about CloudFormation is that it’s more or less just making API calls on your behalf. This means that CloudFormation will assume the very same permissions or role you use to execute your template. If you don’t have permission to create a new hosted zone in Route 53, for example, any template you try to run that contains a new Route 53 hosted zone will fail.

On the flip side, this has created a somewhat tricky situation where anyone developing CloudFormation typically has a very elevated level of privileges, and those privileges are somewhat unnecessarily granted to CloudFormation each time a template is executed.

If my CloudFormation template contains only one resource, which is a Route 53 hosted zone, it doesn’t make sense for that template to be executed with full admin privileges to my AWS account. It makes much more sense to give CloudFormation a very slim set of permissions to execute the template with, thus limiting the blast radius if a bad template were to be executed (for example, a bad copy-and-paste operation resulting in deleted resources).

Thankfully, you can use service roles to define an IAM role and tell CloudFormation to use that role when your stack is being executed, giving you a much safer space to play in.
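
A service role for the Route 53 example might be sketched as follows. The role and policy names are hypothetical, and the list of Route 53 actions is an assumption about what a hosted-zone-only template needs; you would adjust it to the resources in your own template.

```yaml
Resources:
  HostedZoneServiceRole:
    Type: AWS::IAM::Role
    Properties:
      # Allow the CloudFormation service to assume this role
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: cloudformation.amazonaws.com
            Action: sts:AssumeRole
      Policies:
        - PolicyName: HostedZonesOnly
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              # Assumed minimal set of actions for managing hosted zones
              - Effect: Allow
                Action:
                  - route53:CreateHostedZone
                  - route53:DeleteHostedZone
                  - route53:GetHostedZone
                  - route53:GetChange
                Resource: '*'
```

You then supply the ARN of this role when creating or updating the stack, and CloudFormation uses it in place of your own credentials for the stack's operations.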

Cross-stack references

When using the layered cake approach, it’s very common to want to use outputs from one stack as inputs in another stack. For example, you may create a VPC in one stack and require its VPC ID when creating resources in another.

For a long time, you needed to provide some glue around stack creation in order to pass the output between stacks. Cross-stack references provide a more native way of doing this.

You can now export one or more outputs from your stack. This makes those outputs available to other stacks. Note that the export name needs to be unique within a region of your account, so it’s probably a good idea to include the CloudFormation stack name in the name you’re exporting to achieve this.

Once a value has been exported, it becomes available to be imported in another stack using the  Fn::ImportValue  function – very handy!

Be aware, however, that while an exported value is being referenced, you cannot delete or modify it. Additionally, you won’t be able to delete the stack containing the exported value. Once something is referencing an exported value, it’s there to stay until there are no stacks referencing it at all.
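
Here is a rough sketch of an export and the matching import. The two YAML documents below are separate templates, shown together for brevity. It assumes the first stack is named network, so that its export ends up being called network-VpcId; all logical names are illustrative.

```yaml
# Template for the network layer (stack assumed to be named "network")
Resources:
  ExampleVPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16
Outputs:
  VpcId:
    Value:
      Ref: ExampleVPC
    Export:
      # Prefix with the stack name to keep the export name unique
      Name:
        Fn::Sub: '${AWS::StackName}-VpcId'
---
# Template for an app layer, deployed as a separate stack
Resources:
  AppSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: App servers
      # Resolves to the VPC ID exported by the network stack
      VpcId:
        Fn::ImportValue: network-VpcId
```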

Updating resources

One of the principles of IaC is that all the changes should be represented as code for review and testing. This is especially important where CloudFormation is concerned.

After creating a stack for you, the CloudFormation service is effectively hands-off. If you make a change to any of the resources created by CloudFormation (in the web console, command line, or by some other method), you’re effectively causing configuration drift; CloudFormation no longer knows the exact state of the resources in your stack.

The correct approach is to make these changes in your CloudFormation template and perform an update operation on your stack. This ensures that CloudFormation always knows the state of your stack and allows you to be confident that your infrastructure code is a complete and accurate representation of your running environments.

Changesets

When performing a stack update, it can be unclear exactly what changes are going to be made to your stack. Depending on which resource you are changing, you may find that it will need to be deleted and recreated in order to implement your change. This, of course, is completely undesired behavior if the resource in question contains data you’d like to keep. Keep in mind that RDS databases can be a particular pain point.

To mitigate this situation, CloudFormation allows you to create and review a changeset prior to executing the update. The changeset shows you which operations CloudFormation intends to perform on your resources. If the changeset looks good, you can choose to proceed. If you don’t like what you see, you can delete the changeset and choose another course of action – perhaps choosing to create and switch to an entirely new stack to avoid a service outage.

Other things to know

There are a few other things you should keep in the back of your mind as you start building out your own CloudFormation stacks. Let’s take a look.

Name collisions

Often, if you omit the name attribute from a resource, CloudFormation will generate a name for you. This can result in weird-looking resource names, but it will increase the replayability of your template. Using AWS::S3::Bucket as an example, if you specify the BucketName property but don’t ensure its uniqueness, CloudFormation will fail to execute your template the second time around because the bucket will already exist. Omitting BucketName fixes this. Alternatively, you may opt to generate your own unique name each time the template is run. There’s probably no right or wrong approach here, so just do what works for you.
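
Both approaches are sketched below. The naming scheme for the second bucket is just one possibility, and it assumes your stack names are lowercase, since S3 bucket names cannot contain uppercase characters.

```yaml
Resources:
  # No BucketName: CloudFormation generates a unique name for each stack
  GeneratedNameBucket:
    Type: AWS::S3::Bucket
  # Explicit name derived from the stack name and account ID, so a second
  # stack built from the same template gets a different bucket name
  ExplicitNameBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName:
        Fn::Sub: '${AWS::StackName}-${AWS::AccountId}-logs'
```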

Rollback

When creating a CloudFormation stack, you are given the option of disabling rollback. Before you go ahead and set this to  true , keep in mind that this setting persists beyond stack creation. We’ve ended up in precarious situations where updating an existing stack has failed (for some reason) but rollback has been disabled. This is a fun situation for no one.

Limits

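The limits that are the most likely to concern you are as follows (AWS has raised several of these over time, so check the current quotas in the CloudFormation documentation):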

  • The maximum size allowed for your CloudFormation template is 50 KB. This is quite generous, and if you hit this limit, you almost certainly need to think about breaking up your template into a series of smaller ones. If you absolutely need to exceed the 50 KB limit, then the most common approach is to upload your template to S3 and then provide an S3 URL to CloudFormation to execute.
  • The maximum number of parameters you can specify is 60. If you need more than this then, again, consider whether or not you need to add more layers to your cake. Otherwise, lists or mappings might get you out of trouble here.
  • Outputs are also limited to 60. If you’ve hit this limit, it’s probably time to resort to a series of smaller templates.
  • Resources are limited to 200. The same rules apply here as they do for the previous limit.
  • By default, you’re limited to a total of 200 CloudFormation stacks. You can have this limit increased simply by contacting AWS.

Use nested stacks to reduce the complexity of any one CloudFormation template, and to avoid hitting the limits we’ve described here.
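
A nested stack is declared as a resource of type AWS::CloudFormation::Stack in a parent template. In this sketch, the S3 URLs, the parameter names, and the child output called VpcId are all hypothetical; the child templates would need to define them.

```yaml
Resources:
  NetworkStack:
    Type: AWS::CloudFormation::Stack
    Properties:
      # Hypothetical location of the child template in S3
      TemplateURL: https://s3.amazonaws.com/example-bucket/network.yaml
      Parameters:
        VpcCidr: 10.0.0.0/16
  AppStack:
    Type: AWS::CloudFormation::Stack
    Properties:
      TemplateURL: https://s3.amazonaws.com/example-bucket/app.yaml
      Parameters:
        # Outputs of a nested stack are read with Fn::GetAtt
        VpcId:
          Fn::GetAtt: [ NetworkStack, Outputs.VpcId ]
```

Each child template has its own parameter, output, and resource counts, which is what keeps the parent small.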

Circular dependencies

Something to keep in the back of your mind is that you may run into a circular dependency scenario, where multiple resources depend on each other for creation. A common example is where two security groups reference each other in order to allow access between themselves.

A workaround for this particular scenario is to declare the rules as separate AWS::EC2::SecurityGroupEgress and AWS::EC2::SecurityGroupIngress resources instead of as inline ingress and egress rules on AWS::EC2::SecurityGroup.
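
The workaround can be sketched as follows: both groups are created without rules, and the rules that tie them together are added afterwards as separate resources. The logical names and port numbers are assumptions for this example.

```yaml
Parameters:
  VpcId:
    Type: AWS::EC2::VPC::Id
Resources:
  WebSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Web tier
      VpcId:
        Ref: VpcId
  AppSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: App tier
      VpcId:
        Ref: VpcId
  # Each rule depends on both groups, but the groups no longer
  # depend on each other, so there is no cycle
  AppFromWeb:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId:
        Fn::GetAtt: [ AppSecurityGroup, GroupId ]
      SourceSecurityGroupId:
        Fn::GetAtt: [ WebSecurityGroup, GroupId ]
      IpProtocol: tcp
      FromPort: 8080
      ToPort: 8080
  WebFromApp:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId:
        Fn::GetAtt: [ WebSecurityGroup, GroupId ]
      SourceSecurityGroupId:
        Fn::GetAtt: [ AppSecurityGroup, GroupId ]
      IpProtocol: tcp
      FromPort: 443
      ToPort: 443
```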

Credentials

Under no circumstances do you want to have credentials hardcoded in your templates or committed to your source code repository. Doing this doesn’t just increase the chance that your credentials will be stolen – it also reduces the portability of your templates. If your credentials are hardcoded and you need to change them, that obviously requires you to edit your CloudFormation template.

Instead, you should add credentials as parameters in your template. Be sure to set the NoEcho property on these parameters so that CloudFormation masks the value anywhere the parameters are displayed.
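
A credential parameter might be declared and consumed along these lines. The parameter names, the MinLength value, and the database settings are illustrative assumptions rather than recommendations.

```yaml
Parameters:
  DBUser:
    Type: String
    Description: Master username for the database
  DBPassword:
    Type: String
    # Masks the value wherever parameters are displayed
    NoEcho: true
    MinLength: 8
    Description: Master password for the database
Resources:
  ExampleDB:
    Type: AWS::RDS::DBInstance
    Properties:
      Engine: mysql
      DBInstanceClass: db.t3.micro
      AllocatedStorage: '20'
      MasterUsername:
        Ref: DBUser
      # The value comes from the caller, never from the template
      MasterUserPassword:
        Ref: DBPassword
```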

Stack policies

If there are resources in your stack you’d like to protect from accidental deletion or modification, applying a stack policy will help you achieve this. By default, all resources can be deleted or modified. When you apply a stack policy, all the resources are protected unless you explicitly allow them to be deleted or modified in the policy. Note that stack policies do not apply during stack creation – they only take effect when you attempt to update a stack.
