Auto Scaling overview

Auto Scaling allows you to scale your Amazon EC2 capacity up or down automatically, according to conditions you define. With Auto Scaling, you can ensure that the number of Amazon EC2 Instances you are using increases seamlessly during demand spikes to maintain performance, and decreases automatically during demand lulls to minimize costs. Auto Scaling is particularly well suited for applications that experience hourly, daily, or weekly variability in usage.

With Auto Scaling, you configure your Auto Scaling group with a scaling plan that either maintains or automatically scales the capacity of your application. You can configure three types of plans:

Maintain a Fixed Number of Running EC2 Instances

Use this scaling plan if you would like Auto Scaling to maintain the minimum number of Instances in your Auto Scaling group at all times. You can manually change the number of running Instances in your Auto Scaling group at any time.

Scale Based on Demand

Use this scaling plan to scale dynamically in response to changes in the demand for your application. When you scale based on demand, you must specify when and how to scale using CloudWatch metrics, such as CPU or Network usage, or metrics related to Simple Queue Service.

Scale Based on a Schedule

Use this scaling plan if you want to scale your application on a pre-defined schedule. You can specify the schedule for scaling one time only, or provide details for scaling on a recurring schedule.

Using the Elastic Cloud Gate Auto Scaling Wizard, you can configure any of these options.

This section walks you through each tab of our wizard, explains each option, and describes the dependencies between options.

Launch Configuration

Configuration Name [required]: The name of the launch configuration

Image ID [required]: Unique ID of the Amazon Machine Image (AMI) you want to use to launch your EC2 Instances inside Auto Scaling

Instance Type [required]: The type of Instance launched inside Auto Scaling

Instance Name [optional]: When specified, each new Instance launched by Auto Scaling is automatically tagged with this name

Scaling Group

Group Name [required]: The name of the scaling group

VPC Subnetwork [required/optional]: List of the VPC subnetworks in which you want Auto Scaling to launch Instances. You must specify either subnetwork(s) or availability zone(s).

Availability Zones [required/optional]: List of the availability zones in which you want Auto Scaling to launch Instances.

Security Group [optional]: Security group associated with Instances in Auto Scaling group.

Health Check Type [required]: The type of the service used to check the health of the Instances. The allowed values can be either EC2 or ELB. When ELB is selected, you must specify the name of the Elastic Load Balancer which checks the health of Instances. Default value is EC2.

Elastic Load Balancer [optional]: The name of the Elastic Load Balancer. This property is required when Health Check Type is set to ELB.

Minimum Size [required]: The minimum count of Instances to run inside the Auto Scaling group. This value can be overridden by Desired Capacity (see below).

Maximum Size [required]: The maximum count of Instances created inside the Auto Scaling group. This is primarily applicable when you scale based on demand.

Desired Capacity [optional]: The number of Amazon EC2 Instances running in the group. The desired capacity must be greater than or equal to the minimum size and less than or equal to the maximum size specified for the Auto Scaling group. When specified, it is the total number of Instances launched right after the Auto Scaling group is created.

Health Check Grace Period [optional]: Length of time (in seconds) after a new Amazon EC2 Instance comes into service before Auto Scaling starts checking its health. During this time, any health check failure for that Instance is ignored. Default value is 300 seconds.

Cooldown [optional]: The amount of time (in seconds) between a successful scaling activity and the succeeding scaling activity. Default value is 300 seconds.
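The dependencies between the Scaling Group options can be summarized in a short Python sketch. The class and function names below are hypothetical and are not part of the wizard; the checks only restate the rules described above, plus the implied assumption that the minimum size cannot exceed the maximum size.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ScalingGroupSettings:
    """Hypothetical container mirroring the Scaling Group tab fields."""
    group_name: str
    min_size: int
    max_size: int
    desired_capacity: Optional[int] = None
    subnets: List[str] = field(default_factory=list)
    availability_zones: List[str] = field(default_factory=list)
    health_check_type: str = "EC2"       # default stated in the text
    load_balancer: Optional[str] = None
    grace_period_s: int = 300            # default stated in the text
    cooldown_s: int = 300                # default stated in the text


def validate_group(s: ScalingGroupSettings) -> List[str]:
    """Return a list of problems; an empty list means the settings are consistent."""
    problems = []
    if not s.subnets and not s.availability_zones:
        problems.append("specify at least one subnetwork or availability zone")
    if s.health_check_type not in ("EC2", "ELB"):
        problems.append("health check type must be EC2 or ELB")
    if s.health_check_type == "ELB" and not s.load_balancer:
        problems.append("a load balancer is required when health check type is ELB")
    if s.min_size > s.max_size:
        # Assumption: implied by the Desired Capacity rule, not stated directly.
        problems.append("minimum size must not exceed maximum size")
    if s.desired_capacity is not None and not (
        s.min_size <= s.desired_capacity <= s.max_size
    ):
        problems.append("desired capacity must be between minimum and maximum size")
    return problems
```

For example, a group with a minimum of 2, a maximum of 6, a desired capacity of 4, and one availability zone passes every check, while a desired capacity of 8 is reported as out of range.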

Scaling Policy

When Auto Scaling is used to scale on demand, you must define how to scale in response to changing conditions. For example, you have a web application that currently runs on two Instances: you want to launch two additional Instances when the load on the running Instances reaches 70 percent, and you want to terminate the additional Instances when the load goes down to 40 percent. You can configure your Auto Scaling group to automatically scale up and then scale down by specifying these conditions.

An Auto Scaling group uses a combination of policies and alarms to determine when the specified conditions for launching and terminating Instances are met. An alarm is an object that watches over a single metric (for example, the average CPU utilization of your EC2 Instances in an Auto Scaling group, or the length of a queue in SQS) over a time period you specify. When the value of the metric breaches the thresholds you define, over a number of time periods you specify, the alarm performs one or more actions. One such action is sending a message to Auto Scaling. A policy is a set of instructions for Auto Scaling that tells the service how to respond to alarm messages.

Along with creating a launch configuration and Auto Scaling group, you need to create the alarms and the scaling policies and associate them with your Auto Scaling group. When the alarm sends the message, Auto Scaling executes the associated policy on your Auto Scaling group to scale the group in (that is, to terminate Instances) or scale the group out (that is, to launch Instances).
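The relationship between alarms and policies can be illustrated with a small Python model of the web application example above (two Instances, two more at 70 percent load, back down at 40 percent). This is only a sketch: the `Alarm`, `Policy`, and `handle_load` names are hypothetical, and clamping the result to the group's minimum and maximum size is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Policy:
    name: str
    change: int  # Instances to add (positive) or remove (negative)


@dataclass
class Alarm:
    name: str
    breached: Callable[[float], bool]  # True when the metric crosses the threshold
    policy: Policy                     # policy to execute when the alarm fires


def handle_load(alarms: List[Alarm], load: float, capacity: int,
                min_size: int, max_size: int) -> int:
    """Execute the policy of every alarm that fires and return the new capacity."""
    for alarm in alarms:
        if alarm.breached(load):
            capacity += alarm.policy.change
    # Assumption: the group never leaves its minimum/maximum bounds.
    return max(min_size, min(max_size, capacity))


# Alarms for the example: scale out at 70 percent, scale in at 40 percent.
alarms = [
    Alarm("high-load", lambda load: load >= 70, Policy("scale-out", +2)),
    Alarm("low-load", lambda load: load <= 40, Policy("scale-in", -2)),
]
```

With a minimum of 2 and a maximum of 4, `handle_load(alarms, 75, 2, 2, 4)` scales the group out to 4 Instances, and `handle_load(alarms, 35, 4, 2, 4)` scales it back in to 2.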

Adjustment Type [required]: Indicates whether the ScalingAdjustment is an absolute value, a constant increment, or a percentage of the current capacity.

ChangeInCapacity:

Use this to increase or decrease existing capacity. For example, the current capacity of your Auto Scaling group is set to three Instances: you then create a scaling policy on your Auto Scaling group, specify the type as ChangeInCapacity, and the adjustment as five. When the policy is executed, Auto Scaling adds five more Instances to your Auto Scaling group. You'll then have eight running Instances in your Auto Scaling group: current capacity (3) plus ChangeInCapacity (5) equals (8).

ExactCapacity:

Use this to change the current capacity of your Auto Scaling group to the exact value specified. For example, the capacity of your Auto Scaling group is set to five Instances. You then create a scaling policy on your Auto Scaling group, specify the type as ExactCapacity, and the adjustment as three. When the policy is executed, your Auto Scaling group has three running Instances. You'll get an error if you specify a negative adjustment value for the ExactCapacity adjustment type.

PercentChangeInCapacity:

Use this to increase or decrease the desired capacity by a percentage of the desired capacity. For example, the desired capacity of your Auto Scaling group is set to ten Instances. You then create a scaling policy on your Auto Scaling group, specify the type as PercentChangeInCapacity, and the adjustment as ten. When the policy is executed, your Auto Scaling group has eleven running Instances, because 10 percent of 10 Instances is 1 Instance, and 1 Instance plus 10 Instances equals 11 Instances.

Scaling Adjustment [required]: The number of Instances by which to scale. AdjustmentType determines the interpretation of this number (e.g. as an absolute number or as a percentage of the existing Auto Scaling group size). A positive increment adds to the current capacity and a negative value removes from the current capacity.

Cooldown [optional]: The amount of time, in seconds, after a scaling activity completes before the next scaling activity begins. Default value is 300 seconds.

Minimum Adjustment Step [optional]: Used when AdjustmentType is set to PercentChangeInCapacity. The scaling policy changes the DesiredCapacity of the Auto Scaling group by at least the number of Instances specified in this value.
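The three adjustment types and the Minimum Adjustment Step can be compared side by side in a short Python sketch. The function name `apply_adjustment` is hypothetical, and the rounding rule for percentages (truncate toward zero, but never round a non-zero change down to nothing) is an assumption about the service's behavior rather than something stated in this guide. The group's minimum and maximum size limits are deliberately left out.

```python
import math
from typing import Optional


def apply_adjustment(current: int, adjustment_type: str, adjustment: int,
                     min_adjustment_step: Optional[int] = None) -> int:
    """Return the capacity after a scaling policy with the given settings runs."""
    if adjustment_type == "ChangeInCapacity":
        # Add (or, for a negative value, remove) a fixed number of Instances.
        return current + adjustment

    if adjustment_type == "ExactCapacity":
        if adjustment < 0:
            raise ValueError("ExactCapacity does not accept a negative adjustment")
        return adjustment

    if adjustment_type == "PercentChangeInCapacity":
        delta = current * adjustment / 100
        # Assumption: truncate toward zero, but keep at least one Instance of change.
        step = math.trunc(delta)
        if step == 0 and delta != 0:
            step = 1 if delta > 0 else -1
        # Enforce the Minimum Adjustment Step, preserving the direction of the change.
        if delta != 0 and min_adjustment_step is not None and abs(step) < min_adjustment_step:
            step = min_adjustment_step if delta > 0 else -min_adjustment_step
        return current + step

    raise ValueError("unknown adjustment type: " + adjustment_type)
```

The examples from the text map onto this directly: `apply_adjustment(3, "ChangeInCapacity", 5)` gives 8, `apply_adjustment(5, "ExactCapacity", 3)` gives 3, and `apply_adjustment(10, "PercentChangeInCapacity", 10)` gives 11. With a Minimum Adjustment Step of 2, the last call would give 12 instead.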

Metrics

Metric Namespace [required]:

Select EC2 to scale based on EC2 metrics, like CPU or Network usage.

Select SQS to scale based on Simple Queue Service metrics, like the number of sent messages or the size of sent messages.

Metric [required]: Metric used for auto scaling.

Dimension [optional/required]: Active only when the SQS namespace is selected. It allows you to select a queue name to which the metric applies.

Alarm Name [required]: The name of the alarm.

Statistic [required]: The statistic to apply to the alarm's associated metric.

Period [required]: The period in seconds over which the specified statistic is applied. The value must be a multiple of 60. The total (Period * Evaluation Period) cannot be greater than 86400 seconds. Our wizard uses minutes instead of seconds, so the combined value cannot be greater than 1440 minutes.

Threshold [required]: The value against which the specified statistic is compared.

Evaluation Period [required]: The number of periods over which data is compared to the specified threshold. The total (Period * Evaluation Period) cannot be greater than 86400.

Comparison Operator [required]: The arithmetic operation to use when comparing the specified Statistic and Threshold. The specified Statistic value is used as the first operand.
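The limits on Period and Evaluation Period, and the way the Comparison Operator is applied, can be sketched in Python as follows. The function names and the operator symbols are hypothetical (the wizard's own labels may differ), and treating the alarm as breached only when every one of the most recent periods crosses the threshold is an assumption.

```python
import operator
from typing import Sequence

MAX_WINDOW_SECONDS = 86400  # Period * Evaluation Period may not exceed this

# Hypothetical operator symbols; the statistic is always the first operand.
OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def window_is_valid(period_minutes: int, evaluation_periods: int) -> bool:
    """Check the combined window, with the period entered in minutes as in the wizard."""
    period_seconds = period_minutes * 60  # whole minutes are always a multiple of 60
    return (period_minutes > 0 and evaluation_periods > 0
            and period_seconds * evaluation_periods <= MAX_WINDOW_SECONDS)


def alarm_breached(statistic_values: Sequence[float], comparison: str,
                   threshold: float, evaluation_periods: int) -> bool:
    """Return True if each of the last `evaluation_periods` values breaches the threshold."""
    compare = OPERATORS[comparison]
    recent = list(statistic_values)[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        return False  # not enough data collected yet
    return all(compare(value, threshold) for value in recent)
```

A 60-minute period with 24 evaluation periods is the largest valid combination at that period length (1440 minutes), while 25 evaluation periods would be rejected.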

Scaling Schedule

Scaling based on a schedule allows you to scale your application in response to predictable load changes. For example, every week the traffic to your web application starts to increase on Wednesday, remains high on Thursday, and starts to decrease on Friday. You can plan your scaling activities based on the predictable traffic patterns of your web application.

To configure your Auto Scaling group to scale based on a schedule, you need to create scheduled actions. A scheduled action tells Auto Scaling to perform a scaling action at a certain time in the future. To create a scheduled scaling action, you specify the start time for the scaling action to take effect, and you specify the new minimum, maximum, and desired size you want for that group at that time. At the specified time, Auto Scaling updates the group to set the new values for minimum, maximum, and desired sizes, as specified by your scaling action. Instead of a start time, you can also use a recurring schedule that changes the Auto Scaling options on a regular basis. For example, assuming that the traffic to your website decreases on weekends, you can schedule two recurring actions: the first increases capacity every Monday, and the second decreases capacity every Friday.

Start Time [required/optional]: The time when the scaling action occurs.

End Time [required/optional]: The time when the scaling action ends. This value applies only when Recurrence is also set, otherwise it is skipped.

Recurrence [required/optional]: The time when recurring future actions start. When Start Time and End Time are specified with Recurrence, they form the boundaries of when the recurring action starts and stops.

Minimum Size [required]: The minimum count of Instances running inside the Auto Scaling group. This value can be overridden by Desired Capacity (see below).

Maximum Size [required]: The maximum count of Instances that might be created inside the Auto Scaling group.

Desired Capacity [optional]: The number of Amazon EC2 Instances running in the group. The desired capacity must be greater than or equal to the minimum size and less than or equal to the maximum size specified for the Auto Scaling group. When specified, it is the total number of Instances launched right after the Auto Scaling group is created.
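The Monday/Friday example can be simulated with a small Python sketch. It models recurrence simply as a day of the week, which is an illustrative simplification and not the wizard's actual recurrence format; the `RecurringAction` class, the `simulate` function, and the size values are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional, Tuple

Sizes = Tuple[int, int, int]  # (minimum, maximum, desired)


@dataclass
class RecurringAction:
    weekday: int                  # 0 = Monday ... 6 = Sunday (simplified recurrence)
    sizes: Sizes                  # new sizes applied when the action fires
    start: Optional[date] = None  # optional boundary: no firing before this day
    end: Optional[date] = None    # optional boundary: no firing after this day

    def fires_on(self, day: date) -> bool:
        if self.start is not None and day < self.start:
            return False
        if self.end is not None and day > self.end:
            return False
        return day.weekday() == self.weekday


def simulate(first_day: date, num_days: int, initial: Sizes,
             actions: List[RecurringAction]) -> List[Tuple[date, Sizes]]:
    """Return the group sizes in effect on each day of the simulated range."""
    sizes = initial
    timeline = []
    for offset in range(num_days):
        day = first_day + timedelta(days=offset)
        for action in actions:
            if action.fires_on(day):
                sizes = action.sizes  # the group keeps these sizes until the next action
        timeline.append((day, sizes))
    return timeline


# Increase capacity every Monday, decrease it every Friday.
weekly_actions = [
    RecurringAction(weekday=0, sizes=(2, 6, 4)),
    RecurringAction(weekday=4, sizes=(1, 3, 1)),
]
```

Simulating one week starting on a Monday shows the larger sizes in effect from Monday through Thursday and the smaller sizes from Friday through Sunday.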

Notification

Topic Name [required]: Name of the topic.

Notification Type [required]: A list of Auto Scaling notification types, which are events that cause the notification to be sent. The following table lists the available notification types:

Notification Type: Event

EC2_INSTANCE_LAUNCH: Successful Instance launch by Auto Scaling

EC2_INSTANCE_LAUNCH_ERROR: Failed Instance launch by Auto Scaling

EC2_INSTANCE_TERMINATE: Successful Instance termination by Auto Scaling

EC2_INSTANCE_TERMINATE_ERROR: Failed Instance termination by Auto Scaling

Delivery Type [required]:

EMAIL – Notification is delivered via email

SMS – Notification is delivered via SMS (SMS delivery is available only in the US East Region)

Subscriber [required]: Either email address or phone number (for SMS delivery) where the message is sent; the SMS can only be delivered to US phone numbers
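The dependency between Delivery Type and Subscriber can be expressed as a simple check. The Python sketch below uses a hypothetical `check_notification` function; the email test is deliberately minimal, and the `+1` followed by ten digits format for US phone numbers is an assumption about how numbers are entered. The regional restriction on SMS delivery is not checked here.

```python
from typing import List

NOTIFICATION_TYPES = {
    "EC2_INSTANCE_LAUNCH",
    "EC2_INSTANCE_LAUNCH_ERROR",
    "EC2_INSTANCE_TERMINATE",
    "EC2_INSTANCE_TERMINATE_ERROR",
}


def check_notification(topic_name: str, notification_types: List[str],
                       delivery_type: str, subscriber: str) -> List[str]:
    """Return a list of problems; an empty list means the settings are consistent."""
    problems = []
    if not topic_name:
        problems.append("topic name is required")
    if not notification_types:
        problems.append("select at least one notification type")
    unknown = sorted(set(notification_types) - NOTIFICATION_TYPES)
    if unknown:
        problems.append("unknown notification types: " + ", ".join(unknown))

    if delivery_type == "EMAIL":
        if "@" not in subscriber:
            problems.append("subscriber must be an email address")
    elif delivery_type == "SMS":
        # Assumption: US numbers are written as +1 followed by ten digits.
        digits = subscriber[2:] if subscriber.startswith("+1") else ""
        if not (len(digits) == 10 and digits.isdigit()):
            problems.append("subscriber must be a US phone number")
    else:
        problems.append("delivery type must be EMAIL or SMS")
    return problems
```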

Finish

The last tab of our wizard shows a summary of your Auto Scaling configuration.

From this point on, you can either create the Auto Scaling group or generate a CloudFormation script.

Keep in mind that CloudFormation has a couple of limitations:

CloudFormation does not support scaling schedules.

In scaling policies, CloudFormation does not support Minimum Adjustment Step.
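Before generating a script, it can help to list which configured options would be left out. The following Python sketch assumes a hypothetical dictionary layout for the wizard's configuration (the `scheduled_actions`, `scaling_policies`, and `min_adjustment_step` keys are invented for illustration) and reports the two limitations listed above.

```python
from typing import Dict, List


def cloudformation_omissions(config: Dict) -> List[str]:
    """Return the names of configured options that the generated script would omit."""
    omitted = []
    # Scaling schedules are not supported in the generated script.
    if config.get("scheduled_actions"):
        omitted.append("Scaling Schedule")
    # Minimum Adjustment Step is not supported in scaling policies.
    for policy in config.get("scaling_policies", []):
        if policy.get("min_adjustment_step") is not None:
            omitted.append("Minimum Adjustment Step")
            break
    return omitted
```

A configuration that uses neither option returns an empty list, which indicates that the generated script would cover everything configured in the wizard.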