AWS Cloud Operations Blog
Use Systems Manager Automation documents to manage instances and cut costs off-hours
AWS customers are continuously trying to curtail any unnecessary spend on their cloud infrastructure. An easy way to accomplish this is by minimizing infrastructure when it’s not under heavy use, for example by turning off Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Relational Database Service (Amazon RDS) instances for workloads outside of business hours. For workloads that cannot be turned off (due to dependencies on other systems), downsizing instance types is a good alternative.
Applying these measures has the potential to save companies up to 70% in infrastructure costs. In this post, I show you how to use AWS Systems Manager Automation documents to turn off or downsize your Amazon EC2 and Amazon RDS instances. These Automation documents can then be scheduled for known low usage periods such as nights, holidays, and weekends.
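To put the savings figure in context, here is a rough back-of-the-envelope sketch in Python. It only counts instance running hours and assumes, purely for illustration, that compute charges scale with running hours. Storage and other charges continue while an instance is stopped, so actual savings will differ.

```python
HOURS_PER_WEEK = 7 * 24


def running_hours_saved(hours_per_day, days_per_week):
    """Return the fraction of weekly instance-hours avoided by a schedule."""
    running = hours_per_day * days_per_week
    return 1 - running / HOURS_PER_WEEK


# Instances run 8 hours a day, every day (no weekend check).
every_day = running_hours_saved(hours_per_day=8, days_per_week=7)
# Instances run 8 hours a day on weekdays only.
weekdays_only = running_hours_saved(hours_per_day=8, days_per_week=5)

print(f"Every day: {every_day:.1%} fewer running hours")
print(f"Weekdays only: {weekdays_only:.1%} fewer running hours")
```

With these inputs, an 8-hour daily schedule avoids about two thirds of the weekly running hours, and a weekdays-only schedule avoids roughly three quarters.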
Overview of solution
I cover two solutions that use Systems Manager Automation documents to start, stop, or resize your instances. The first solution uses State Manager to schedule the executions, and the second uses Amazon CloudWatch Events.
First, let’s discuss Automation documents, since both solutions have them in common. Automation documents are playbooks that define an ordered set of actions to be executed on a resource, for example copying an Elastic Block Store snapshot, creating an Amazon Machine Image, creating a tag, or checking an instance’s status. You can use pre-defined Automation documents (prefixed with AWS) or define your own. These can be invoked on a schedule or via a trigger. For more information, check Systems Manager automation.
For this post, we focus on five pre-defined Automation documents:
- AWS-StartEC2Instance: Start an Amazon EC2 instance.
- AWS-StopEC2Instance: Stop an Amazon EC2 instance. If this fails, issue a forced-stop.
- AWS-ResizeInstance: Check the instance type against the target instance type. If they don’t match, it stops and modifies the instance to be of the new instance type, waits five seconds, and restarts the instance.
- AWS-StartRdsInstance: Start the Amazon RDS instance, unless the instance is already available or in starting status. If so, wait until the instance has reached available status.
- AWS-StopRdsInstance: Stop the Amazon RDS instance, unless the instance is already in stopping or stopped status. If so, wait until the instance has reached stopped status.
By using one or more of these Automation documents, a team could choose to run an m5.2xlarge EC2 instance during business hours to accommodate higher traffic, and downsize to an m5.large instance in the evenings and weekends. Another team might choose to turn off their QA environment EC2 and Amazon RDS instances on the weekends, and have them turn back on Mondays at 9am.
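As a minimal sketch of how the five documents differ in their inputs, the following Python builds the parameter mapping for each one. The parameter names (InstanceId, InstanceType) are assumptions inferred from the descriptions above, and the instance ID is hypothetical. Check each document's definition in the Systems Manager console before relying on them.

```python
# Assumed required parameters per document; verify against the document
# definitions in your own account.
REQUIRED_PARAMETERS = {
    "AWS-StartEC2Instance": ("InstanceId",),
    "AWS-StopEC2Instance": ("InstanceId",),
    "AWS-ResizeInstance": ("InstanceId", "InstanceType"),
    "AWS-StartRdsInstance": ("InstanceId",),
    "AWS-StopRdsInstance": ("InstanceId",),
}


def build_automation_parameters(document_name, **values):
    """Return the parameter mapping for one Automation document.

    Every value is wrapped in a list of strings, which is the shape
    Automation parameters are passed in.
    """
    required = REQUIRED_PARAMETERS[document_name]
    missing = [name for name in required if name not in values]
    if missing:
        raise ValueError(f"{document_name} is missing parameters: {missing}")
    return {name: [str(value)] for name, value in values.items()}


# Hypothetical plan from the example above: downsize in the evening and
# scale back up in the morning.
evening = build_automation_parameters(
    "AWS-ResizeInstance", InstanceId="i-0123456789abcdef0", InstanceType="m5.large"
)
morning = build_automation_parameters(
    "AWS-ResizeInstance", InstanceId="i-0123456789abcdef0", InstanceType="m5.2xlarge"
)
print(evening)
print(morning)
```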
Here is how the process works for solution 1:
1. When triggered on a schedule, State Manager invokes an Automation document on a set of EC2 instances or Amazon RDS instances.
a) The Automation document runs its commands to:
- Start, stop, or resize one or more EC2 instances
- Start or stop one or more Amazon RDS instances
Here is how the process works for solution 2:
1. When triggered on a set schedule, the CloudWatch Events rule invokes the Automation document on its targets, which in this case can be an EC2 or an Amazon RDS instance.
a) The Automation document runs its commands, which can:
- Start, stop, or resize one or more EC2 instances
- Start or stop one or more Amazon RDS instances
Prerequisites
Create an IAM role that gives Systems Manager Automation permission to manage your EC2 instances:
- Open the IAM console at https://console.aws.amazon.com/iam/
- In the navigation pane, choose Roles, and then choose Create role.
- Under Select type of trusted entity, choose AWS service.
- Select Systems Manager.
- Under section Select your use case, choose SSM Automation service role.
- Choose Next: Permissions.
- In the Create role page, select AmazonSSMAutomationRole from the list of policies.
- Choose Next: Tags, Next: Review.
- Assign a name for your role (for example, AllowSSMToManageEC2andRDS), then choose Create role.
If you want to allow Systems Manager to manage your Amazon RDS instances as well, you have to add an inline policy to the IAM role. Choose the role that you just created. In the Permissions tab, choose Add inline policy, select the JSON tab, and replace the JSON content with the following code. Make sure you replace both resource parameters with one or more ARNs of your databases to comply with security best practices of granting the least privilege:
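One way to produce such a policy is sketched below in Python. The exact statements are an assumption on my part: the sketch grants the RDS start, stop, and describe actions, scoped to the database ARNs you pass in. The ARN shown is hypothetical. Verify the generated JSON against your own requirements before pasting it into the console.

```python
import json


def build_rds_inline_policy(db_instance_arns):
    """Build an inline policy document limited to the given DB instance ARNs."""
    resources = list(db_instance_arns)
    if not resources:
        raise ValueError("Provide at least one DB instance ARN")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Assumed actions needed to start and stop the instances.
                "Effect": "Allow",
                "Action": ["rds:StartDBInstance", "rds:StopDBInstance"],
                "Resource": resources,
            },
            {
                # Assumed action needed to check the instance status.
                "Effect": "Allow",
                "Action": "rds:DescribeDBInstances",
                "Resource": resources,
            },
        ],
    }


# Hypothetical database ARN; replace it with the ARNs of your own databases.
policy = build_rds_inline_policy(
    ["arn:aws:rds:us-east-1:123456789012:db:demo-app-db"]
)
print(json.dumps(policy, indent=2))
```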
Tutorial
In this section, we cover the scenario of turning off your EC2 instances outside of business hours, where regular business hours constitute a 9 AM – 5 PM day. I am in the US Eastern time zone, where daylight saving time (EDT) puts UTC 4 hours ahead of me, so the schedule I use is 13:00 – 21:00 UTC. During standard time (EST), the offset is 5 hours. For the sake of simplicity, we do not check for weekends in this post.
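The conversion from local business hours to UTC can be sketched as follows. This is a simplified illustration that assumes a fixed UTC-4 offset, so it does not adjust for daylight saving changes. The helper names and the date are hypothetical, and the cron layout is the six-field format used later in this post.

```python
from datetime import date, datetime, time, timedelta, timezone

# Assumption: a fixed UTC-4 offset for US Eastern daylight time (EDT).
# A fixed offset does not follow daylight saving changes.
EDT = timezone(timedelta(hours=-4))


def to_utc(local_time, tz, on_date):
    """Convert a local wall-clock time on a given date to a UTC time."""
    local = datetime.combine(on_date, local_time, tzinfo=tz)
    return local.astimezone(timezone.utc).time()


def daily_cron(utc_time):
    """Format a daily schedule as Minutes Hours DayOfMonth Month DayOfWeek Year."""
    return f"{utc_time.minute} {utc_time.hour} * * ? *"


some_day = date(2021, 6, 1)  # hypothetical date; irrelevant for a fixed offset
start_cron = daily_cron(to_utc(time(9, 0), EDT, some_day))
stop_cron = daily_cron(to_utc(time(17, 0), EDT, some_day))
print(start_cron)
print(stop_cron)
```

With these inputs, start_cron is 0 13 * * ? * and stop_cron is 0 21 * * ? *, which match the 13:00 and 21:00 UTC schedules used in both solutions.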
Let’s discuss both solutions to accomplish this in detail!
Solution 1: Scheduling through State Manager
- Navigate to the Systems Manager console and choose State Manager on the sidebar menu, then choose Create an Association.
- Enter a name for the Association, for example, DemoAppScheduleOn.
- In the Document section, select the SSM Automation document named AWS-StartEC2Instance.
- For a single EC2 instance, choose Simple execution. For multiple EC2 instances, choose Rate control.
- If you chose Simple execution, in the Input parameters section enter the EC2 instance ID or use the interactive instance picker to choose the instance to schedule.
- If instead you chose Rate control, select InstanceId in the Parameter section and use the Targets section to select multiple instances:
  - Choose Parameter Values to enter a comma-separated list of EC2 instance IDs, or use the interactive instance picker to choose instances in the Input parameters section.
  - Choose Resource Group to select a resource group in the Resource Group box. For example, you can define a resource group called StopAfterHoursRG where a list of instances to schedule is maintained. More info at AWS Resource Groups.
  - Choose Tags to fill out the Tag key and Tag value (optional) boxes, then choose Add. For example, you could tag instances with a tag called StopAfterHours to define which instances to schedule. More info at AWS tagging.
- In the AutomationAssumeRole box, pick the role you created on the prerequisites step.
- In the Specify schedule section, choose On Schedule and CRON schedule builder. Under Association runs, choose the last option, then choose Day, enter 13 and 00. You should now have the following: “Every Day at 13:00”. To ensure that the association doesn’t run upon creation, choose the Apply association only at the next specified cron interval check box.
- If you chose to manage multiple EC2 instances, in the Rate control section you can optionally define:
  - Concurrency: how many instances, or what percentage of instances, to manage at once.
  - Error threshold: how many instances, or what percentage of instances, must fail for the task to stop running.
- Choose Create Association.
The page should refresh and you should now see your association in the list. After a few minutes, it should have a status of success.
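The console steps above could also be scripted. The following is a minimal sketch, not a definitive implementation, that builds the request for the Systems Manager `create_association` API. The helper name, role ARN, instance ID, and tag value are hypothetical, and the request is only built here, not sent.

```python
def build_association_request(name, document, role_arn, schedule,
                              instance_id=None, tag_key=None, tag_value=None,
                              max_concurrency="1", max_errors="1"):
    """Build keyword arguments for the SSM create_association call."""
    request = {
        "AssociationName": name,
        "Name": document,  # the Automation document to run
        "ScheduleExpression": schedule,
        # Do not run the association immediately upon creation.
        "ApplyOnlyAtCronInterval": True,
        "Parameters": {"AutomationAssumeRole": [role_arn]},
    }
    if instance_id is not None:
        # Simple execution: a single instance passed as an input parameter.
        request["Parameters"]["InstanceId"] = [instance_id]
    elif tag_key is not None and tag_value is not None:
        # Rate control: resolve InstanceId from every instance with the tag.
        request["AutomationTargetParameterName"] = "InstanceId"
        request["Targets"] = [{"Key": f"tag:{tag_key}", "Values": [tag_value]}]
        request["MaxConcurrency"] = max_concurrency
        request["MaxErrors"] = max_errors
    else:
        raise ValueError("Provide instance_id, or both tag_key and tag_value")
    return request


# Hypothetical role ARN and tag value.
start_request = build_association_request(
    name="DemoAppScheduleOn",
    document="AWS-StartEC2Instance",
    role_arn="arn:aws:iam::123456789012:role/AllowSSMToManageEC2andRDS",
    schedule="cron(0 13 * * ? *)",
    tag_key="StopAfterHours",
    tag_value="true",
)
print(start_request)

# To submit the request (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("ssm").create_association(**start_request)
```

Keeping the request builder separate from the API call makes it easy to review the parameters before anything is created in your account.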
Now create another Association for turning off your EC2 instance by following the preceding steps. Select the SSM Automation document AWS-StopEC2Instance and specify the schedule “Every Day at 21:00”.
If you want to manage Amazon RDS instances as well, create another set of associations for turning the Amazon RDS instances on and off. Use the AWS-StartRdsInstance and AWS-StopRdsInstance Automation documents and specify the appropriate schedules.
Solution 2: Scheduling through CloudWatch Events
Navigate to the CloudWatch console and under Events on the sidebar menu, choose Rules and then Create rule.
- Under Event Source, select Schedule.
- Select Cron expression and, to select a schedule of “Every day at 13:00”, enter 0 13 * * ? *. The format of the cron expression is Minutes Hours DayOfMonth Month DayOfWeek Year, and the time is in UTC. A section unfolds below it with details on when the next scheduled runs occur. If you’d like to see example cron expressions, follow the documentation link below the text box.
- In the Targets section on the right side, choose Add target.
- On the dropdown, select SSM Automation.
- In the Document box, choose AWS-StartEC2Instance.
- Select Constant, and then enter the EC2 instance ID in the InstanceId box.
- In the AutomationAssumeRole box, enter the ARN of the IAM role you created in the prerequisites step.
- In the box under Create a new role for this specific resource, enter a role name to use for the executions, for example DemoAppScheduleOnRule_EC2_Execute.
- If you’d also like to manage an Amazon RDS database within this rule, choose Add target.
- For the Document box, choose AWS-StartRdsInstance.
- Select Constant, and then enter the Amazon RDS database ID.
- In the AutomationAssumeRole box, enter the ARN of the IAM role you created in the prerequisites step.
- In the box under Create a new role for this specific resource, enter a role name to use for the executions, for example, DemoAppScheduleOnRule_RDS_Execute.
- Choose Configure details.
On the next page, enter a name for your rule, for example DemoAppScheduleOnRule, and a description. Choose Create rule.
Now create another rule for turning off your EC2 or Amazon RDS instances by following the preceding steps. Select the SSM Automation document AWS-StopEC2Instance or AWS-StopRdsInstance accordingly, and specify the cron schedule 0 21 * * ? * for an equivalent schedule of “Every day at 21:00”.
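The two rules in this solution could also be created from code. The sketch below builds the requests for the CloudWatch Events `put_rule` and `put_targets` APIs without sending them. The helper names, account ID, role ARNs, and instance ID are hypothetical. The ARN format used for the Automation document target is an assumption that you should verify against a rule created in the console.

```python
import json


def build_rule_request(name, cron_fields, description=None):
    """Build keyword arguments for the Events put_rule call."""
    request = {
        "Name": name,
        "ScheduleExpression": f"cron({cron_fields})",
        "State": "ENABLED",
    }
    if description:
        request["Description"] = description
    return request


def build_automation_target(target_id, automation_arn, events_role_arn,
                            instance_id, assume_role_arn):
    """Build one SSM Automation target entry for the put_targets call."""
    return {
        "Id": target_id,
        "Arn": automation_arn,
        # Role that the rule uses to start the automation.
        "RoleArn": events_role_arn,
        # Constant input passed to the Automation document.
        "Input": json.dumps({
            "InstanceId": [instance_id],
            "AutomationAssumeRole": [assume_role_arn],
        }),
    }


rule = build_rule_request("DemoAppScheduleOnRule", "0 13 * * ? *",
                          "Start the demo app instances every day at 13:00 UTC")
target = build_automation_target(
    target_id="StartEC2",
    # Assumed ARN format for an Automation document target.
    automation_arn=("arn:aws:ssm:us-east-1:123456789012:"
                    "automation-definition/AWS-StartEC2Instance:$DEFAULT"),
    events_role_arn="arn:aws:iam::123456789012:role/DemoAppScheduleOnRule_EC2_Execute",
    instance_id="i-0123456789abcdef0",
    assume_role_arn="arn:aws:iam::123456789012:role/AllowSSMToManageEC2andRDS",
)
print(rule)
print(target)

# To submit the requests (requires boto3 and AWS credentials):
#   import boto3
#   events = boto3.client("events")
#   events.put_rule(**rule)
#   events.put_targets(Rule=rule["Name"], Targets=[target])
```

For the stop rule, the same helpers would be called with 0 21 * * ? * and the matching stop document.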
Checking Automation executions
After your automations have had a chance to run, you should start seeing the results of the executions within Systems Manager. Navigate to the Systems Manager console and choose Automation on the sidebar menu. On this page, you find a list of all the Automation documents that ran, including those associated with State Manager associations and CloudWatch Events.
You can identify your executions in the results by using the Document name box together with the Executed by box. State Manager executions populate the Executed by box with the ARN of the role with StateManagerService appended at the end. The CloudWatch Events solution populates the Executed by box with the name of the IAM role you created for the rule’s executions in Solution 2 (for example, DemoAppScheduleOnRule_EC2_Execute).
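The same check can be sketched in code with the Systems Manager `describe_automation_executions` API. The helper name is hypothetical, the client is passed in so the function can be tried with a stub, and only the first page of results is read.

```python
def summarize_executions(ssm_client, document_name):
    """Return (execution ID, status, executed by) for one document's runs.

    Only the first page of results is read; a complete listing would
    follow the NextToken value in the response.
    """
    response = ssm_client.describe_automation_executions(
        Filters=[{"Key": "DocumentNamePrefix", "Values": [document_name]}]
    )
    return [
        (
            item["AutomationExecutionId"],
            item["AutomationExecutionStatus"],
            item.get("ExecutedBy", ""),
        )
        for item in response["AutomationExecutionMetadataList"]
    ]


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   for row in summarize_executions(boto3.client("ssm"), "AWS-StartEC2Instance"):
#       print(row)
```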
Alternative solutions
There is an alternative solution that is also capable of achieving the same results. The AWS Instance Scheduler is an AWS solution which combines AWS Lambda, Amazon DynamoDB, and CloudWatch to manage schedules. An administrator sets up infrastructure in a designated master account to manage the schedules, and development teams assign tags to control which resources follow which schedules. It is a powerful, centralized solution capable of doing cross-account scheduling and overrides of the schedule among other features. However, it requires deploying and maintaining infrastructure in a master account. As a result, troubleshooting any issues might be complex and time consuming.
Comparison of each solution
This table compares various attributes among the different solutions to help you choose the one best suited for your requirements:
Attribute | Systems Manager State Manager | Amazon CloudWatch | AWS Instance Scheduler |
Infrastructure | Localized to AWS account | Localized to AWS account | Resides in Organization master account, set up by administrator |
Access requirements | Systems Manager | CloudWatch, Systems Manager | Administrator requires Lambda, DynamoDB, CloudWatch access. |
EC2 options | Start/stop/resize | Start/stop/resize | Start/stop/hibernate/resize |
Amazon RDS options | Start/stop | Start/stop | Start/stop (snapshot as pre-step). |
Aurora options | Create own Automation document | Create own Automation document | Start/stop |
Scheduling | Development team managed | Development team managed | Development teams choose from list of schedules created by administrator |
Time zone | Schedule in UTC time zone | Schedule in UTC time zone | Schedule in any time zone |
Executions logging | Success or failure history in State Manager’s execution history. Systems Manager automations last execution’s date/status visible in State Manager | CloudWatch metrics for event invocations. Success or failure history in Systems Manager automations | Success or failure messages embedded in CloudWatch (Lambda logs) |
Handling multiple resources | An Association can handle multiple EC2 instances or Amazon RDS instances. Can use AWS Resource Groups, or tags to select instances. Rate control available. | An Amazon CloudWatch Events rule can handle multiple targets and each target can handle a single EC2 or Amazon RDS instance. No rate control. | Dev teams apply a schedule-tag to all resources to be managed. No rate control. |
Cleanup process
To delete the resources associated with solution 1, go to State Manager, choose the Association you created, and then select Delete.
To delete the resources associated with solution 2, go to Amazon CloudWatch, choose Rules on the left-hand menu. Select the rules you created, choose the Actions dropdown, and then select Delete.
To delete the IAM roles, go to IAM, choose Roles on the left navigation menu, select the roles you created, and then select Delete role.
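If you created the associations and rules from code, the cleanup of solutions 1 and 2 can be scripted in the same way. This is a minimal sketch with a hypothetical helper name. It takes the clients as arguments, removes each rule's targets before deleting the rule, and leaves the IAM roles for you to delete in the console.

```python
def delete_schedules(ssm_client, events_client, association_ids, rule_names):
    """Delete State Manager associations and CloudWatch Events rules."""
    for association_id in association_ids:
        ssm_client.delete_association(AssociationId=association_id)
    for rule_name in rule_names:
        # Targets are removed first, then the rule itself is deleted.
        targets = events_client.list_targets_by_rule(Rule=rule_name)["Targets"]
        target_ids = [target["Id"] for target in targets]
        if target_ids:
            events_client.remove_targets(Rule=rule_name, Ids=target_ids)
        events_client.delete_rule(Name=rule_name)


# Example usage (requires boto3 and AWS credentials; the association ID
# placeholder must be replaced with your own):
#   import boto3
#   delete_schedules(
#       boto3.client("ssm"),
#       boto3.client("events"),
#       association_ids=["<your-association-id>"],
#       rule_names=["DemoAppScheduleOnRule"],
#   )
```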
Conclusion
In this post, I showed you how to minimize your cloud spend by presenting two solutions for starting, stopping, and resizing EC2 and Amazon RDS instances based on a schedule. I also covered an alternative AWS solution and compared all three solutions to help you decide between them.
Systems Manager and CloudWatch are available in all major AWS Regions. For more information, check Systems Manager Automation, State Manager Associations, and CloudWatch event rules by schedule.
About the Author
Melina Schweizer is a Senior Technical Account Manager at AWS. She works with Enterprise Support customers to help them design solutions and optimize their usage of AWS resources. In her spare time, Melina enjoys playing the piano, gardening, and vacationing in Europe with her family.