AWS – Autoscaling an application server

Autoscaling is a fundamental component of computing in the cloud. To understand autoscaling, you need to understand the concepts of vertical and horizontal scaling. With vertical scaling, a single machine is upgraded to a more powerful instance by adding more CPU power, more RAM, or more disk capacity. This can be effective to an extent, but eventually, the complexity and costs associated with vertical scaling make it impractical. With horizontal scaling, an application workload is spread out over several smaller machines, and adding new machines provides a nearly linear increase in the load that can be managed by the application. Adding extra machines is called scaling out (commonly referred to as scaling up), and removing machines that are no longer needed is called scaling in (scaling down).

EC2 autoscaling provides not only the ability to scale up and down in response to application load but also redundancy. It does this by ensuring that capacity is always available. Even in the unlikely event of an AZ outage, the autoscaling group will ensure that instances are available to run your application if you have configured it to provision instances in all AZs.

Autoscaling also allows you to pay for only the EC2 capacity you need because underutilized servers can be automatically deprovisioned.

Getting ready

You must supply two or more subnet IDs for this recipe to work.

The following example uses an Amazon Linux AMI in the us-east-1 region. Update the parameters as required if you are working in a different region.

How to do it…

Follow these steps to create a CloudFormation template that launches a stack with an autoscaling group:

  1. Start by defining the template version and description:
AWSTemplateFormatVersion: "2010-09-09"
Description: Create an Auto Scaling Group
  2. Add a Parameters section with the required parameters that will be used later in the template:
Parameters:
  SubnetIds:
    Description: Subnet IDs where instances can be launched
    Type: List<AWS::EC2::Subnet::Id>
  3. Still under the Parameters section, add the optional instance configuration parameters:
  AmiId: 
    Description: The application server's AMI ID 
    Type: AWS::EC2::Image::Id 
    Default: ami-9be6f38c # Amazon Linux in us-east-1 
  InstanceType: 
    Description: The type of instance to launch 
    Type: String 
    Default: t2.micro   
  4. Still under the Parameters section, add the minimum and maximum sizes:
  MinSize: 
    Description: Minimum number of instances in the group 
    Type: Number 
    Default: 1
  MaxSize: 
    Description: Maximum number of instances in the group 
    Type: Number 
    Default: 4 
  5. Then, add the settings for the CPU thresholds:
  ThresholdCPUHigh: 
    Description: Launch new instances when CPU utilization 
      is over this threshold 
    Type: Number 
    Default: 60 
  ThresholdCPULow: 
    Description: Remove instances when CPU utilization
      is under this threshold 
    Type: Number 
    Default: 40 
  ThresholdMinutes: 
    Description: Launch new instances when over the CPU 
      threshold for this many minutes 
    Type: Number 
    Default: 5
  6. Add a Resources section and define the autoscaling group resource:
Resources: 
  AutoScalingGroup: 
    Type: AWS::AutoScaling::AutoScalingGroup 
    Properties: 
      MinSize: !Ref MinSize 
      MaxSize: !Ref MaxSize 
      LaunchConfigurationName: !Ref LaunchConfiguration 
      Tags: 
        - Key: Name 
          Value: !Sub "${AWS::StackName} server" 
          PropagateAtLaunch: true 
      VPCZoneIdentifier: !Ref SubnetIds
  7. Still under the Resources section, define the launch configuration that’s used by the autoscaling group:
  LaunchConfiguration: 
    Type: AWS::AutoScaling::LaunchConfiguration 
    Properties: 
      ImageId: !Ref AmiId 
      InstanceType: !Ref InstanceType 
      UserData: 
        Fn::Base64: !Sub | 
          #!/bin/bash -xe 
          # This will be run on startup, launch your application here
  8. Next, define two scaling policy resources – one to scale up and the other to scale down:
  ScaleUpPolicy: 
    Type: AWS::AutoScaling::ScalingPolicy 
    Properties: 
      AdjustmentType: ChangeInCapacity 
      AutoScalingGroupName: !Ref AutoScalingGroup 
      Cooldown: 60 
      ScalingAdjustment: 1 
  ScaleDownPolicy: 
    Type: AWS::AutoScaling::ScalingPolicy 
    Properties: 
      AdjustmentType: ChangeInCapacity 
      AutoScalingGroupName: !Ref AutoScalingGroup 
      Cooldown: 60 
      ScalingAdjustment: -1
  9. Define an alarm that will alert you when the CPU goes over the ThresholdCPUHigh parameter:
  CPUHighAlarm: 
    Type: AWS::CloudWatch::Alarm 
    Properties: 
      ActionsEnabled: true 
      AlarmActions: 
        - !Ref ScaleUpPolicy 
      AlarmDescription: Scale up on CPU load 
      ComparisonOperator: GreaterThanThreshold 
      Dimensions: 
        - Name: AutoScalingGroupName 
          Value: !Ref AutoScalingGroup 
      EvaluationPeriods: !Ref ThresholdMinutes 
      MetricName: CPUUtilization 
      Namespace: AWS/EC2 
      Period: 60 
      Statistic: Average 
      Threshold: !Ref ThresholdCPUHigh
  10. Finally, define an alarm that will alert you when the CPU goes under the ThresholdCPULow parameter:
  CPULowAlarm: 
    Type: AWS::CloudWatch::Alarm 
    Properties: 
      ActionsEnabled: true 
      AlarmActions: 
        - !Ref ScaleDownPolicy 
      AlarmDescription: Scale down on CPU load 
      ComparisonOperator: LessThanThreshold 
      Dimensions: 
        - Name: AutoScalingGroupName 
          Value: !Ref AutoScalingGroup 
      EvaluationPeriods: !Ref ThresholdMinutes 
      MetricName: CPUUtilization 
      Namespace: AWS/EC2 
      Period: 60 
      Statistic: Average 
      Threshold: !Ref ThresholdCPULow
  11. Save the template with the filename 04-01-AutoScaling.yml.
  12. Launch the template with the following AWS CLI command, supplying your subnet IDs in place of <subnet-id-1> and <subnet-id-2>:
aws cloudformation create-stack \
  --stack-name asg \
  --template-body file://04-01-AutoScaling.yml \
  --parameters \
    ParameterKey=SubnetIds,ParameterValue='<subnet-id-1>\,<subnet-id-2>'
  13. At this point, the CFN service is provisioning all the resources in the template and will take a few minutes to complete. Once the stack has reached a CREATE_COMPLETE status, you can confirm that the autoscaling group is working correctly by checking for a new EC2 instance named asg server (from the Name tag of `${AWS::StackName} server` defined in the template).
  14. Delete the stack to prevent future charges for the resources that were created in this recipe.

How it works…

This example defines an autoscaling group and dependent resources. These include the following:

  • A launch configuration to use when launching new instances. Launch configurations are templates that describe the settings for newly launched instances.
  • Two scaling policies: one to scale the number of instances up, and an inverse policy to scale back down.
  • A CloudWatch alarm to alert when the CPU crosses a certain threshold for a certain number of minutes. CloudWatch is the AWS logging and monitoring solution and is covered in more detail in Chapter 5, Monitoring the Infrastructure.

The autoscaling group and launch configuration resource objects in this example use mostly default values. You will need to specify your own SecurityGroups and a KeyName parameter in the LaunchConfiguration resource configuration if you want to be able to connect to the instances (for example, via SSH).
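As a sketch, the extended launch configuration might look like the following; the key pair name and the security group resource are hypothetical, and you would need an existing key pair and a suitable security group in your account or template:

```yaml
  LaunchConfiguration:
    Type: AWS::AutoScaling::LaunchConfiguration
    Properties:
      ImageId: !Ref AmiId
      InstanceType: !Ref InstanceType
      KeyName: my-key-pair          # hypothetical: an existing EC2 key pair in your account
      SecurityGroups:
        - !Ref AppSecurityGroup     # hypothetical: a security group resource allowing SSH (port 22)
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash -xe
          # This will be run on startup, launch your application here
```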

AWS will automatically take care of spreading your instances evenly over the subnets you have configured, so make sure that the subnets are in different AZs! When scaling down, the default termination policy removes instances launched from the oldest launch configuration first.

Scaling policies

Each scaling policy details how many instances to add or remove when it is triggered. It also defines a Cooldown value, which helps prevent flapping servers – servers being created and deleted before they have finished starting up and become useful. Often, an EC2 instance will have startup scripts that install third-party packages and custom application software, which can take several minutes to complete. The scaling policies allow you to fine-tune this timing to avoid putting a machine into rotation before it is ready.

While the scaling policies in this example use the same values in both directions, you might want to change this so that your application scales up quickly and scales down slowly, for the best user experience.
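As a sketch, the two policies from the recipe could be made asymmetric like this – adding two instances at a time with a short cooldown on the way up, and removing one at a time with a longer cooldown on the way down (the adjustment and cooldown values here are illustrative, not prescriptive):

```yaml
  ScaleUpPolicy:
    Type: AWS::AutoScaling::ScalingPolicy
    Properties:
      AdjustmentType: ChangeInCapacity
      AutoScalingGroupName: !Ref AutoScalingGroup
      Cooldown: 60          # allow another scale-up soon if load keeps rising
      ScalingAdjustment: 2  # add two instances at a time
  ScaleDownPolicy:
    Type: AWS::AutoScaling::ScalingPolicy
    Properties:
      AdjustmentType: ChangeInCapacity
      AutoScalingGroupName: !Ref AutoScalingGroup
      Cooldown: 300         # wait longer before removing another instance
      ScalingAdjustment: -1 # remove only one instance at a time
```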

Alarms

The CPUHighAlarm resource will fire when the average CPU utilization goes over the value set in the ThresholdCPUHigh parameter. CPU utilization and other instance metrics are tracked by CloudWatch. The alarm triggers the ScaleUpPolicy resource, which provisions more instances, bringing the average CPU utilization down across the whole autoscaling group. As the name suggests, the CPULowAlarm resource does the opposite when the average CPU utilization goes under the ThresholdCPULow parameter.

This means that new instances will be launched until the CPU utilization across the autoscaling group stabilizes somewhere between 40% and 60% (based on the default parameter values), or the MaxSize limit is reached.

It is very important to leave a gap between the high and low alarm thresholds. If they are too close together, the group will not stabilize and you will see instances being created and destroyed almost continually.
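Because the thresholds are template parameters, the gap can be widened without editing the template by overriding the defaults at launch time. For example (the subnet placeholders must still be replaced with your own values):

```
aws cloudformation create-stack \
  --stack-name asg \
  --template-body file://04-01-AutoScaling.yml \
  --parameters \
    ParameterKey=SubnetIds,ParameterValue='<subnet-id-1>\,<subnet-id-2>' \
    ParameterKey=ThresholdCPUHigh,ParameterValue=70 \
    ParameterKey=ThresholdCPULow,ParameterValue=30
```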
