Auto Scaling Group for EC2

This blog is part of my journey “Embarking on the AWS Solutions Architect Associate SAA-C03 Certification Journey”

Introduction

In today’s cloud-centric world, infrastructure needs can change rapidly. The ability to seamlessly scale your application resources up or down to meet demand is crucial for maintaining optimal performance and cost efficiency. This is where AWS Auto Scaling Groups come into play. In this blog, we’ll explore what Auto Scaling Groups are, how they work, and why they are a fundamental component of AWS infrastructure management.

Introductory Points

  • In practical scenarios, websites and applications frequently experience fluctuations in their load.
  • Cloud environments offer the advantage of swiftly provisioning and decommissioning servers.
  • The primary objectives of an Auto Scaling Group (ASG) include:
    • Scaling out by adding EC2 Instances to accommodate increased loads.
    • Scaling in by removing EC2 Instances to match decreased loads.
    • Ensuring a specified range of EC2 Instances is maintained, with minimum and maximum counts.
    • Automatically registering new instances with the load balancer.
    • Replacing terminated EC2 instances with new ones as necessary.
  • ASGs themselves are free; you pay only for the underlying resources (EC2 instances, EBS volumes, and so on).
  • Creating a launch template is a prerequisite for an ASG (sketched after this list) and involves:
    • Specifying the AMI and instance type.
    • Defining EC2 user data.
    • Configuring EBS volumes.
    • Assigning security groups.
    • Setting up SSH key pairs.
    • Configuring IAM roles for your EC2 Instances.
    • Providing VPC and subnet details.
    • Incorporating load balancer information.
  • It is essential to establish initial values for Min Size, Max Size, and Desired Capacity.
  • Defining scaling policies is a crucial step in the process.
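
Assuming you are working with the AWS SDK for Python (boto3), a minimal sketch of the launch template and ASG creation steps above might look like the following. The template name, AMI ID, key pair, security group, IAM role, and subnet IDs are placeholder values, not real resources.

```python
import base64
import boto3

ec2 = boto3.client("ec2")
autoscaling = boto3.client("autoscaling")

# User data script that runs on first boot of each instance.
user_data = """#!/bin/bash
yum install -y httpd
systemctl enable --now httpd
"""

# Launch template: AMI, instance type, user data, EBS volume,
# security group, SSH key pair, and IAM role (all placeholders).
ec2.create_launch_template(
    LaunchTemplateName="my-web-template",
    LaunchTemplateData={
        "ImageId": "ami-0123456789abcdef0",
        "InstanceType": "t3.micro",
        "KeyName": "my-key-pair",
        "SecurityGroupIds": ["sg-0123456789abcdef0"],
        "IamInstanceProfile": {"Name": "my-ec2-role"},
        "UserData": base64.b64encode(user_data.encode()).decode(),
        "BlockDeviceMappings": [
            {"DeviceName": "/dev/xvda",
             "Ebs": {"VolumeSize": 20, "VolumeType": "gp3"}}
        ],
    },
)

# ASG: Min Size, Max Size, Desired Capacity, and the VPC subnets to launch into.
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="my-web-asg",
    LaunchTemplate={"LaunchTemplateName": "my-web-template", "Version": "$Latest"},
    MinSize=2,
    MaxSize=10,
    DesiredCapacity=2,
    VPCZoneIdentifier="subnet-aaaa1111,subnet-bbbb2222",
)
```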

How Do Auto Scaling Groups Work?

Auto Scaling Groups work based on a set of policies and triggers defined by you, the AWS user. Here’s a high-level overview of how they operate:

  • Initial Configuration: You start by defining a launch template (or legacy launch configuration) for your ASG. This includes specifying the Amazon Machine Image (AMI) for your instances, instance types, security groups, and other instance details.
  • Scaling Policies: You create scaling policies that define how and when your ASG should scale. These policies can be triggered by conditions such as CPU utilization, network traffic, or custom metrics.
  • Scaling Triggers: You attach these policies to scaling triggers, which determine when the policies should be executed. For instance, you can set up a trigger to add more instances when CPU utilization exceeds 70%.
  • Scaling Actions: When a trigger condition is met, the ASG performs a scaling action. This could involve launching new instances, terminating existing ones, or keeping the fleet size unchanged.
  • Maintaining Desired Capacity: ASGs continuously monitor the health of instances and work to maintain the desired capacity. If an instance fails, the ASG can automatically replace it.
  • Load Balancing: Auto Scaling Groups can work seamlessly with Elastic Load Balancers (ELBs) to distribute incoming traffic evenly across your instances.
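
A rough boto3 sketch of the load balancing and health check pieces, reusing the hypothetical my-web-asg group from the earlier example; the target group ARN is a placeholder (it could equally be passed as TargetGroupARNs at creation time).

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Attach an ALB/NLB target group so new instances register automatically.
autoscaling.attach_load_balancer_target_groups(
    AutoScalingGroupName="my-web-asg",
    TargetGroupARNs=["arn:aws:elasticloadbalancing:us-east-1:123456789012:"
                     "targetgroup/my-tg/0123456789abcdef"],
)

# Use the load balancer's health checks: instances that fail them are
# terminated and replaced so the group stays at its desired capacity.
autoscaling.update_auto_scaling_group(
    AutoScalingGroupName="my-web-asg",
    HealthCheckType="ELB",
    HealthCheckGracePeriod=120,
)

# Inspect current capacity and per-instance health.
group = autoscaling.describe_auto_scaling_groups(
    AutoScalingGroupNames=["my-web-asg"]
)["AutoScalingGroups"][0]
print(group["DesiredCapacity"],
      [(i["InstanceId"], i["HealthStatus"]) for i in group["Instances"]])
```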

Auto Scaling with CloudWatch Alarms

  • ASG scaling can be orchestrated in response to CloudWatch Alarms.
  • An alarm is designed to track a specific metric, be it average CPU utilization or a custom metric.
  • Metrics like average CPU utilization are calculated considering all instances within the ASG.
  • Scale-out and scale-in policies can then be triggered by these alarms (see the sketch below).
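
For example, a step scaling policy wired to a CloudWatch alarm on average CPU could be sketched with boto3 as follows; the ASG name and thresholds are illustrative.

```python
import boto3

autoscaling = boto3.client("autoscaling")
cloudwatch = boto3.client("cloudwatch")

# Step scaling policy: add 2 instances whenever the attached alarm fires.
policy = autoscaling.put_scaling_policy(
    AutoScalingGroupName="my-web-asg",
    PolicyName="scale-out-on-high-cpu",
    PolicyType="StepScaling",
    AdjustmentType="ChangeInCapacity",
    StepAdjustments=[
        {"MetricIntervalLowerBound": 0.0, "ScalingAdjustment": 2},
    ],
)

# Alarm on the ASG-wide average CPU; when it breaches 70%, it triggers the policy.
cloudwatch.put_metric_alarm(
    AlarmName="my-web-asg-high-cpu",
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "AutoScalingGroupName", "Value": "my-web-asg"}],
    Statistic="Average",
    Period=300,
    EvaluationPeriods=2,
    Threshold=70.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=[policy["PolicyARN"]],
)
```

A matching scale-in policy would be defined the same way, with a negative ScalingAdjustment and an alarm on low CPU.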

Scaling Policies

  • Dynamic Scaling Policies (sketched after this list)
    • Target Tracking Scaling
      • This is the simplest and most straightforward method to configure.
      • For instance, you can set a target of maintaining an average ASG CPU utilization around 40%.
    • Simple / Step Scaling
      • When a specific CloudWatch alarm triggers, such as CPU exceeding 70%, you can define a step like adding 2 units.
    • Scheduled Actions
      • These policies are useful when scaling needs are predictable. For example, if you expect a surge in traffic for a Black Friday sale tomorrow, you can schedule the addition of a few more instances in advance.
  • Predictive Scaling
    • Predictive scaling continuously forecasts load patterns and schedules scaling actions proactively.
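
Here is a hedged boto3 sketch of target tracking, a scheduled action, and predictive scaling (step scaling was shown in the previous section's sketch). The group name and target values are placeholders.

```python
import datetime
import boto3

autoscaling = boto3.client("autoscaling")
ASG = "my-web-asg"  # hypothetical ASG name

# Target tracking: keep average ASG CPU utilization around 40%.
autoscaling.put_scaling_policy(
    AutoScalingGroupName=ASG,
    PolicyName="target-40-percent-cpu",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 40.0,
    },
)

# Scheduled action: raise capacity ahead of an expected surge (e.g. tomorrow's sale).
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName=ASG,
    ScheduledActionName="black-friday-scale-out",
    StartTime=datetime.datetime.now(datetime.timezone.utc) + datetime.timedelta(days=1),
    MinSize=4,
    MaxSize=20,
    DesiredCapacity=10,
)

# Predictive scaling: forecast load and schedule capacity proactively.
autoscaling.put_scaling_policy(
    AutoScalingGroupName=ASG,
    PolicyName="predictive-cpu",
    PolicyType="PredictiveScaling",
    PredictiveScalingConfiguration={
        "MetricSpecifications": [
            {
                "TargetValue": 40.0,
                "PredefinedMetricPairSpecification": {
                    "PredefinedMetricType": "ASGCPUUtilization"
                },
            }
        ],
        "Mode": "ForecastAndScale",
    },
)
```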

Effective Scaling Metrics

  • CPU Utilization
  • Request Count per Target: Ensuring a stable number of requests per EC2 instance.
  • Average Network In/Out
  • Custom Metrics: Any metric you push to CloudWatch and monitor with CloudWatch Alarms (see the sketch below).
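
As an illustration, a custom metric can be published with put_metric_data and then used in a target tracking policy through a customized metric specification; the namespace, metric name, and target value below are made up for the example.

```python
import boto3

cloudwatch = boto3.client("cloudwatch")
autoscaling = boto3.client("autoscaling")

# Push a custom application metric (e.g. pending jobs per instance) to CloudWatch.
cloudwatch.put_metric_data(
    Namespace="MyApp",
    MetricData=[{
        "MetricName": "PendingJobsPerInstance",
        "Value": 42.0,
        "Unit": "Count",
    }],
)

# Scale on that custom metric with a target tracking policy.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="my-web-asg",  # hypothetical ASG name
    PolicyName="target-pending-jobs",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "CustomizedMetricSpecification": {
            "Namespace": "MyApp",
            "MetricName": "PendingJobsPerInstance",
            "Statistic": "Average",
        },
        "TargetValue": 10.0,
    },
)
```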

Scaling Cooldown

  • Following a scaling activity, a cooldown period is initiated.
  • During this cooldown, the ASG refrains from launching or terminating additional instances to allow metrics to stabilize.
  • Tip: Use a pre-configured, ready-to-use AMI so instances finish setup and start serving requests sooner; this lets you safely configure a shorter cooldown period (see the sketch below).
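
The cooldown is configurable. A small boto3 sketch, again using the assumed my-web-asg group, showing the group-level default cooldown and a per-policy override for a simple scaling policy:

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Default cooldown (seconds) applied after simple scaling activities.
autoscaling.update_auto_scaling_group(
    AutoScalingGroupName="my-web-asg",
    DefaultCooldown=180,  # a shorter cooldown is viable with a fast-booting AMI
)

# Per-policy cooldown override on a simple scaling policy.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="my-web-asg",
    PolicyName="add-one-instance",
    PolicyType="SimpleScaling",
    AdjustmentType="ChangeInCapacity",
    ScalingAdjustment=1,
    Cooldown=120,
)
```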
