How to Optimize AWS Auto Scaling: Essential Tips

AWS Auto Scaling Groups

Auto Scaling Groups are a key part of both EC2 and scalability on AWS. They are closely related to Load Balancers, and the two are typically used together. An Auto Scaling Group contains a collection of Amazon EC2 instances that are treated as a logical grouping for automatic scaling and management. It lets you scale in (remove instances) and scale out (add instances). Let’s explore these concepts in the following sections:

  1. Introduction to Auto Scaling Groups

  2. Launch Templates and Launch Configurations

  3. Scaling Policies

  4. Scaling Cooldown Period

  5. Typical Exam Questions

Introduction

Imagine we have an application whose number of users increases fivefold on Fridays at 4 p.m. and drops on Monday mornings. Without an Auto Scaling Group, we would have to predict the number of users and launch extra instances ahead of time to keep the web page running smoothly; otherwise, the website might crash. With an Auto Scaling Group this is not a problem, because the group scales itself automatically, adding or removing instances according to conditions we define. For example, we could set an alarm that fires when CPU usage is higher than 40%, so that the group scales out and launches another instance (with the help of CloudWatch, a service we will see later).

The size of an Auto Scaling group depends on the number of instances you set as the desired capacity. You can adjust its size to meet demand, either manually or by using automatic scaling. An Auto Scaling group starts by launching enough instances to meet its desired capacity. It maintains this number of instances by performing periodic health checks on the instances in the group. You can use scaling policies to dynamically increase or decrease the number of instances in your group to meet changing conditions. When the scaling policy is in effect, the Auto Scaling group adjusts the desired capacity of the group, between the minimum and maximum capacity values you specify, and launches or terminates the instances as needed. You can also scale on a schedule.
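As a rough illustration of how the desired capacity stays between the minimum and maximum values, here is a minimal Python sketch. The `GroupSize` class and `apply_scaling_adjustment` function are hypothetical names invented for this example; they only model the arithmetic and do not call AWS.

```python
from dataclasses import dataclass


@dataclass
class GroupSize:
    """Hypothetical stand-in for an Auto Scaling group's size settings."""
    min_size: int
    max_size: int
    desired_capacity: int


def apply_scaling_adjustment(group: GroupSize, change: int) -> int:
    """Add `change` instances to the desired capacity and return the result.

    The new value is clamped so it never leaves [min_size, max_size].
    """
    requested = group.desired_capacity + change
    group.desired_capacity = max(group.min_size, min(group.max_size, requested))
    return group.desired_capacity


group = GroupSize(min_size=2, max_size=20, desired_capacity=4)
after_scale_out = apply_scaling_adjustment(group, +30)  # capped at max_size
after_scale_in = apply_scaling_adjustment(group, -50)   # floored at min_size
```

However large the requested change is, the group in this sketch never goes above 20 instances or below 2.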

Auto Scaling Groups Parameters to Scale

Launch Templates and Launch Configurations

To scale, we need to define launch templates or launch configurations. Both specify instance configuration information, like the AMI of the instance it will launch, Security Groups, Key Pairs, etc.

  • Launch Configuration: An instance configuration template that an Auto Scaling group uses to launch EC2 instances. It cannot be modified after it is created, so any change means creating a new one, and it has no versioning. Amazon strongly recommends not using it, as Launch Templates are its replacement.

  • Launch Template: Allows you to have multiple versions of a template (versioning). You can create an Auto Scaling group that launches both Spot and On-Demand Instances or specifies multiple instance types or launch templates. This feature is not available with Launch Configurations.
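To make the configuration information more concrete, the sketch below builds the kind of request a launch template could be created from. It assumes the boto3 EC2 client's `create_launch_template` call; every name and ID in it (`web-app-template`, `ami-EXAMPLE`, `sg-EXAMPLE`, `example-key-pair`) is a hypothetical placeholder, and the call itself is left as a comment so the block runs without AWS credentials.

```python
# All identifiers below are hypothetical placeholders, not real resources.
launch_template_request = {
    "LaunchTemplateName": "web-app-template",
    "VersionDescription": "initial version",
    "LaunchTemplateData": {
        "ImageId": "ami-EXAMPLE",            # the AMI the instances launch from
        "InstanceType": "t3.micro",
        "KeyName": "example-key-pair",       # key pair used for SSH access
        "SecurityGroupIds": ["sg-EXAMPLE"],  # security groups attached at launch
    },
}

# With boto3 installed and AWS credentials configured, the request could be sent as:
#   import boto3
#   boto3.client("ec2").create_launch_template(**launch_template_request)
```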

By default, the Auto Scaling Group attempts to balance instances evenly across the Availability Zones. If we have three instances in AZ1 and two in AZ2 and a scale-in is executed, it will try to terminate one of the instances in AZ1.
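The balancing idea can be sketched in a few lines of Python. This is a simplified model, not the actual AWS termination logic, which applies further criteria to pick the specific instance; the function name `az_to_scale_in` is made up for this example.

```python
from collections import Counter


def az_to_scale_in(instance_azs: list[str]) -> str:
    """Pick the Availability Zone that currently holds the most instances."""
    counts = Counter(instance_azs)
    # Highest count first; the zone name breaks ties so the result is deterministic.
    return min(counts, key=lambda az: (-counts[az], az))


# Three instances in AZ1 and two in AZ2, as in the example above.
instances = ["AZ1", "AZ1", "AZ1", "AZ2", "AZ2"]
chosen = az_to_scale_in(instances)
```

Here `chosen` is `"AZ1"`, so after the scale-in both zones are left with two instances.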

Scaling Policies

There are different ways to scale the number of instances automatically, and depending on what we want to do, we’ll use one or another.

  • Target Tracking Scaling: This policy lets you specify a scaling metric and a target value that your Auto Scaling group should always maintain. For example, you could say, “I want the average CPU usage of all instances to stay around 40%”. If the average CPU usage falls below that, the group will scale in (remove instances), whereas if it rises above 40%, the group will scale out (add instances).

  • Simple Scaling: This policy applies a single, fixed adjustment when an alarm on a metric is breached. You could say, “If the CPU usage goes over 80%, add a new instance”. You define the conditions.

  • Step Scaling: This policy is the evolution of Simple Scaling. Step scaling applies “step adjustments”, meaning you can set multiple actions to vary the scaling depending on the size of the alarm breach. You could say, “If the CPU usage goes over 60%, add a new instance, and if it goes over 70%, add two instances”.

  • Scheduled Actions: Set up your scaling schedule according to predictable load changes. If, for example, we know that our website gets more traffic on Fridays at 4 p.m., this would be the right choice, since we could make the group scale out directly according to a schedule.
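The examples in the list above can be sketched in Python. The function models the step-scaling rule ("over 60% add one, over 70% add two") locally, and the two dictionaries show what the target tracking and scheduled examples might look like as parameters for boto3's `put_scaling_policy` and `put_scheduled_update_group_action`. The group name, policy names and the desired capacity of 10 are assumptions made for illustration, and nothing here calls AWS.

```python
def step_scaling_adjustment(cpu_percent: float) -> int:
    """Return how many instances to add for a given average CPU reading."""
    # Checked from the largest breach down, so the highest matching step wins.
    steps = [(70.0, 2), (60.0, 1)]
    for threshold, instances_to_add in steps:
        if cpu_percent > threshold:
            return instances_to_add
    return 0


# Target tracking: keep the average CPU of the group around 40%.
# "web-app-asg" and the policy name are hypothetical.
target_tracking_policy = {
    "AutoScalingGroupName": "web-app-asg",
    "PolicyName": "keep-cpu-near-40",
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingConfiguration": {
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 40.0,
    },
}

# Scheduled action: scale out every Friday at 16:00.
# Recurrence uses cron fields: minute, hour, day of month, month, day of week.
friday_scale_out = {
    "AutoScalingGroupName": "web-app-asg",
    "ScheduledActionName": "friday-4pm-scale-out",
    "Recurrence": "0 16 * * 5",
    "DesiredCapacity": 10,  # assumed value, pick one that fits the expected load
}

# With boto3 and AWS credentials available, these could be sent as:
#   client = boto3.client("autoscaling")
#   client.put_scaling_policy(**target_tracking_policy)
#   client.put_scheduled_update_group_action(**friday_scale_out)
```

Note that the recurrence is interpreted in UTC unless a time zone is specified, so "4 p.m." may need adjusting for local time.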

Scaling Cooldown Period

The cooldown period helps to ensure that your Auto Scaling group doesn’t launch or terminate additional instances before the previous scaling activity takes effect. In other words, the Auto Scaling Group does not start another scaling activity (e.g., launching a new instance) until the cooldown that follows the previous one has elapsed. The default value is 300 seconds.
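A minimal sketch of the cooldown idea, assuming a simplified model in which every scaling request made during the cooldown window is ignored. The `CooldownGate` class is hypothetical and exists only for this illustration; times are plain seconds rather than real clock readings.

```python
from typing import Optional

DEFAULT_COOLDOWN_SECONDS = 300


class CooldownGate:
    """Allows a scaling activity only when the cooldown period has elapsed."""

    def __init__(self, cooldown_seconds: int = DEFAULT_COOLDOWN_SECONDS) -> None:
        self.cooldown_seconds = cooldown_seconds
        self.last_activity_at: Optional[float] = None

    def try_scale(self, now: float) -> bool:
        """Return True and record the activity if scaling is allowed at `now`."""
        in_cooldown = (
            self.last_activity_at is not None
            and now - self.last_activity_at < self.cooldown_seconds
        )
        if in_cooldown:
            return False
        self.last_activity_at = now
        return True


gate = CooldownGate()
# Scaling requests arriving at these times (in seconds).
results = [gate.try_scale(t) for t in (0, 120, 299, 300, 450, 600)]
```

In this run, only the requests at 0, 300 and 600 seconds trigger a scaling activity; the others fall inside a cooldown window and are skipped.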

Typical Exam Questions

  1. Question: We want to design the infrastructure to run an application on Amazon EC2 instances, which requires high availability and must dynamically scale based on demand to be cost-efficient. How can we best meet these requirements?

    • Options:

      1. Configure an Application Load Balancer in front of an Auto Scaling group to deploy instances to multiple Regions.

      2. Configure an Application Load Balancer in front of an Auto Scaling group to deploy instances to multiple Availability Zones.

      3. Configure an Amazon CloudFront distribution in front of an Auto Scaling group to deploy instances to multiple Regions.

      4. Configure an Amazon CloudFront distribution in front of an Auto Scaling group to deploy instances to multiple Availability Zones.

    • Solution: 2. An Auto Scaling Group cannot deploy instances across multiple Regions, but it can deploy them across multiple Availability Zones, which makes the application highly available while it scales based on demand.

  2. Question: An application runs on Amazon EC2 instances behind an Application Load Balancer, and its instances run in an Amazon EC2 Auto Scaling Group across multiple Availability Zones. The Auto Scaling group scales up to 20 instances during work hours but down to 2 instances at night. The application is very slow when the day begins, although it runs well by mid-morning. What should we implement to solve this problem and keep the cost to a minimum?

    • Options:

      1. Implement a scheduled action that sets the desired capacity to 20 shortly before the office opens.

      2. Implement a step scaling action triggered at a lower CPU threshold, decreasing the cooldown period.

      3. Implement a scheduled action that sets the minimum and maximum capacity to 20 before the office opens.

      4. Implement a target tracking action triggered at a lower CPU threshold, and decrease the cooldown period.

    • Solution: 4. This looks like a good use case for scheduled actions, but a scheduled action would be more expensive than a target tracking action, because it keeps 20 instances running regardless of actual demand. You don’t need 20 instances in the morning’s very first hour. With a target tracking action, you can start with fewer servers and scale out as the CPU usage increases. This is a tricky question because you would typically go for a scheduled action, such as the third option, but the key is keeping the cost to a minimum. Using a reduced cooldown period will also terminate unneeded instances more quickly, lowering costs.