
AWS – Caching a website with CloudFront


In this recipe, we’ll show you how to use AWS CloudFront to cache your website.

The primary reasons you’ll want to consider doing this are as follows:

  • Copies of your content will be geographically located closer to your end users, thereby improving their experience and delivering content to them faster.
  • The burden for serving content will be removed from your fleet of servers. This could potentially result in a large cost saving if you’re able to turn off some servers or reduce your bandwidth bill.
  • You’ll be shielded from large and unexpected spikes in traffic.
  • While not the focus of this chapter, CloudFront gives you the ability to implement a Web Application Firewall (WAF) as an added layer of protection from the bad guys.
  • You can serve the contents of a private S3 bucket using TLS, which gives your site the secure HTTPS prefix.
Unlike most AWS services, which are region-specific, CloudFront is a global service.

Getting ready

First of all, you’re going to need a publicly accessible website. This could be a static website hosted in S3, or it could be a dynamically generated website hosted in EC2. In fact, your website doesn’t even need to be hosted in AWS for you to use CloudFront. As long as your website is publicly accessible, you should be good to go.

You’ll also need to have the ability to modify the DNS records for your website. Instead of pointing to your web server (or S3 bucket), we’ll eventually point them to CloudFront.

About dynamic content

If your website consists of mostly dynamic content, you can still benefit from implementing CloudFront.

First, CloudFront maintains a pool of persistent connections with your origin servers. This reduces the time it takes to serve files to your users because fewer TCP three-way handshakes with the origin need to be performed.

Second, CloudFront implements some additional optimizations around TCP connections for high performance. For example, more data can be transferred over the wire initially because CloudFront uses a wider initial TCP window.

Finally, implementing a CDN such as CloudFront gives you the opportunity to review your caching strategy and how you use cache-control headers. If your home page is dynamically generated, you’ll get some benefit straight away by serving it via CloudFront, but the benefits will be much greater if you let CloudFront cache it for a few minutes. Cost, end-user experience, and backend performance are all things you should take into consideration when deciding how long to cache.

Configuring CloudFront distributions

Distributions can be configured with a fairly wide array of options. Our recipe is going to be quite simple so that you can get up and running as quickly as possible. However, we will talk about some of the more common configuration options:

  • Origins: A distribution needs to have at least one origin. An origin, as its name indicates, is where your website content originates from: your public-facing website. The properties you’ll most likely be concerned with are as follows:
    • Origin Domain Name: This is the hostname of your public-facing website. The CloudFormation template we supply accepts this hostname as a parameter.
    • Origin Path: It’s possible to configure the distribution to fetch content from a directory or subfolder at the origin, for example, /content/images if you were using CloudFront to cache images only. In our case, we are caching our entire website, so we don’t specify an origin path at all.
    • Origin ID: This is particularly important when you are using non-default cache behavior settings, and therefore have configured multiple origins. You need to assign a unique ID to the origins so that the cache behaviors know which origin to target. There’ll be more discussion on cache behaviors later.
    • HTTP Port, HTTPS Port: If your origin is listening on nonstandard ports for HTTP or HTTPS, you would use these parameters to define those ports.
    • Origin Protocol Policy: You are able to configure the distribution to talk to your origin via the following:
      • HTTP Only
      • HTTPS Only
      • Match Viewer

The Match Viewer option forwards requests to the origin based on which protocol the user requested in their browser. Again, we are keeping things quite simple in this recipe, so we’ll be opting for HTTP Only.
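
To make the origin properties above more concrete, here is a minimal sketch of an Origins entry that sets an origin path, nonstandard ports, and the Match Viewer policy. It is not part of this recipe’s template: the hostname, origin ID, path, and port numbers are all assumptions chosen purely for illustration.

```yaml
# Illustrative fragment only; it belongs under DistributionConfig.
# The hostname, ID, path, and ports below are hypothetical values.
Origins:
- DomainName: images.example.org
  Id: ImagesOrigin
  # Fetch content from this directory at the origin
  OriginPath: /content/images
  CustomOriginConfig:
    # Only needed when the origin listens on nonstandard ports
    HTTPPort: 8080
    HTTPSPort: 8443
    # One of: http-only, https-only, match-viewer
    OriginProtocolPolicy: match-viewer
```

The recipe itself keeps the simpler http-only policy and leaves the path and ports at their defaults.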

  • Logging: Keep in mind that because less traffic will be hitting your origin, fewer access logs will be captured there. It makes sense to have CloudFront keep these logs for us in an S3 bucket. This is included in the CloudFormation template provided with this recipe.
  • Cache behaviors: In this recipe, we’ll configure a single (default) cache behavior, which will forward all the requests to our origin. CloudFront allows you to get quite fine-grained with the behaviors you configure. You might, for example, want to apply a rule to all the .js and .css files on your origin. Perhaps you want to forward query strings to the origin for these file types. Similarly, you might want to ignore the TTL the origin is trying to set for image files, instead telling CloudFront to cache them for a minimum of 24 hours.
  • Aliases: These are additional hostnames you want the distribution to serve traffic for. For example, if your Origin Domain Name is configured as loadbalancer.example.org, then you probably want aliases that look something like this:
    • example.org
    • www.example.org

The CloudFormation template provided with this recipe expects one or more aliases to be provided in the form of a comma-delimited list of strings.
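
The finer-grained cache behaviors mentioned above could be sketched roughly as follows. This fragment is not part of the recipe’s template; the path patterns and the 24-hour minimum TTL are assumptions for illustration, and OriginA is the origin ID used later in this recipe.

```yaml
# Illustrative fragment only; it belongs under DistributionConfig,
# alongside DefaultCacheBehavior. Patterns and TTLs are hypothetical.
CacheBehaviors:
- PathPattern: '*.js'
  TargetOriginId: OriginA
  ViewerProtocolPolicy: allow-all
  ForwardedValues:
    QueryString: true    # forward query strings for script files
- PathPattern: '*.css'
  TargetOriginId: OriginA
  ViewerProtocolPolicy: allow-all
  ForwardedValues:
    QueryString: true    # forward query strings for stylesheets
- PathPattern: '*.jpg'
  TargetOriginId: OriginA
  ViewerProtocolPolicy: allow-all
  MinTTL: 86400          # cache images for at least 24 hours (in seconds)
  ForwardedValues:
    QueryString: false
```

Requests that match none of these patterns fall through to the default cache behavior.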

  • Allowed HTTP methods: By default, CloudFront will only forward GET and HEAD requests to your origin. This recipe doesn’t change those defaults, so we don’t declare this parameter in the template provided. If your origin is serving dynamically generated content, then you will likely want to declare this parameter and set its values to GET, HEAD, OPTIONS, PUT, POST, PATCH, and DELETE.
  • TTLs (minimum/maximum/default): Optionally, you can define how long you’d like objects to stay in CloudFront’s caches before they expire and are fetched from the origin. Again, we’ve opted to stick to CloudFront’s default values to keep this recipe simple, so we’ve omitted this parameter from our template. The defaults are as follows:
    • Minimum TTL: 0 seconds
    • Default TTL: 1 day
    • Maximum TTL: 1 year
  • Price Class: By default, CloudFront will serve your content from all of its edge locations, giving you the maximum performance possible. We’re going to deploy our distribution using the lowest possible price class, Price Class 100. This corresponds to edge locations in the United States, Canada, and Europe. Users from Australia would not benefit too much from this price class, but you’re also paying less for it. Price Class 200 adds Asian regions, and Price Class All adds South America and Australia.
A comprehensive list and detailed explanation of the values that can be specified when creating a CloudFront distribution can be found at http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/distribution-web-values-specify.html.
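
If you did want to declare the allowed methods and TTLs explicitly, a default cache behavior might look something like the following sketch. It is not what this recipe deploys: the TTL values simply restate the defaults listed above in seconds, and the method list is the one suggested for dynamic origins.

```yaml
# Illustrative fragment only; this recipe omits these settings
# and relies on CloudFront's defaults instead.
DefaultCacheBehavior:
  TargetOriginId: OriginA
  ViewerProtocolPolicy: allow-all
  # Forward all methods, as a dynamic origin would likely need
  AllowedMethods: [GET, HEAD, OPTIONS, PUT, POST, PATCH, DELETE]
  MinTTL: 0             # 0 seconds
  DefaultTTL: 86400     # 1 day
  MaxTTL: 31536000      # 1 year
  ForwardedValues:
    QueryString: false
```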

How to do it…

The first (and only) thing we need to do is configure a CloudFront distribution, as shown in the following diagram:

Figure: CloudFront edge servers

Let’s see how we can do that:

  1. Create a new CloudFormation template and start with the following code, which can be found in this book’s GitHub repository. Name the file 03-02-Caching.yaml:
      AWSTemplateFormatVersion: '2010-09-09'
      Parameters:
        OriginDomainName:
          Description: The hostname of your origin
            (e.g. www.example.org.s3-website-ap-southeast-2.amazonaws.com)
          Type: String
        Aliases:
          Description: Comma-delimited list of aliases
            (e.g. example.org,www.example.org)
          Type: CommaDelimitedList

  2. Continue with the resources:
      Resources:
        DistributionALogBucket:
          Type: AWS::S3::Bucket
        DistributionA:
          Type: AWS::CloudFront::Distribution
          Properties:
            DistributionConfig:
              Origins:
              - DomainName:
                  Ref: OriginDomainName
                Id: OriginA
                CustomOriginConfig:
                  OriginProtocolPolicy: http-only
              Enabled: true

  3. Continue with the following code, which sits at the same level as Origins and Enabled:
              Logging:
                IncludeCookies: false
                Bucket:
                  Fn::GetAtt: [DistributionALogBucket, DomainName]
                Prefix: cf-distribution-a
              Aliases:
                Ref: Aliases
              DefaultCacheBehavior:
                TargetOriginId: OriginA
                ForwardedValues:
                  QueryString: false
                ViewerProtocolPolicy: allow-all
              PriceClass: PriceClass_100

  4. Finally, we need to define the outputs:
      Outputs:
        DistributionDomainName:
          Description: The domain name of the CloudFront Distribution
          Value:
            Fn::GetAtt: [DistributionA, DomainName]
        LogBucket:
          Description: Bucket where CloudFront logs will be stored
          Value:
            Ref: DistributionALogBucket

  5. Using the template we created, go ahead and create your CloudFront distribution. Expect to wait around 20-25 minutes for this stack to finish being created. It takes a while for your distribution configuration to be pushed out to all the AWS CloudFront locations:
      aws cloudformation create-stack \
        --stack-name cloudfront-cache-1 \
        --template-body file://03-02-Caching.yaml \
        --parameters \
        ParameterKey=OriginDomainName,ParameterValue=<your-domain-name> \
        ParameterKey=Aliases,ParameterValue='<alias-1>\,<alias-2>'
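
As noted in the Getting ready section, your DNS records eventually need to point to CloudFront. The recipe’s template does not do this for you. If your DNS happens to be hosted in Route 53 (an assumption, since the recipe doesn’t say where your DNS lives), one way to sketch it is an alias record added under Resources. The zone and record names below are hypothetical; Z2FDTNDATAQYW2 is the fixed hosted zone ID that Route 53 uses for alias records targeting CloudFront distributions.

```yaml
# Illustrative resource only; assumes a Route 53 hosted zone for
# example.org already exists. Zone and record names are hypothetical.
SiteAliasRecord:
  Type: AWS::Route53::RecordSet
  Properties:
    HostedZoneName: example.org.
    Name: www.example.org.
    Type: A
    AliasTarget:
      # Point the record at the distribution created by this template
      DNSName:
        Fn::GetAtt: [DistributionA, DomainName]
      # Fixed hosted zone ID for CloudFront alias targets
      HostedZoneId: Z2FDTNDATAQYW2
```

With another DNS provider, you would instead create a CNAME record pointing at the DistributionDomainName output of the stack.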

How it works…

Content delivery is designed to quickly and efficiently distribute content to users. The best way to do this is to leverage a Content Delivery Network (CDN). Amazon’s CDN service is Amazon CloudFront.

At the time of writing, AWS has 20 regions, and it has an additional 115 edge locations that can be used as part of CloudFront. This gives you a massive global network of resources that you can use to improve your users’ experience of your application.

CloudFront works closely with S3 to serve static assets. In addition to this, it can be configured to cache dynamic content. This gives you an easy way to improve the performance of applications that are not even aware of CloudFront.

CloudFront websites are referred to as distributions, which describes their CDN role.

Distributions can also be used to provide a common frontend for multiple, disparate sources of content.
