AWS Cloud Operations Blog
Using AWS Systems Manager OpsCenter and AWS Config for compliance monitoring
In this post, I show how AWS Systems Manager OpsCenter can be used to centrally record and mitigate alerts from AWS Config. When AWS Config detects a resource that is out of compliance, an OpsItem is created. This OpsItem is used to track details of the noncompliant resource, record investigative actions, and provide access to consistent remediation actions. It also provides an immutable record of all actions that can be used for audit purposes.
With OpsCenter, you have a central location to view current issues, a historical record of issues, and a list of actions taken to remedy the issue. This is useful when you have policies and guardrails that you want multiple teams to follow and you need automation to scale the process.
What is AWS Config?
AWS Config is an AWS service that is used to assess, audit, and evaluate the configurations of your AWS resources. One of its popular features is the ability to define rules that continuously scan your AWS resources for compliance with your internal guidelines. When a noncompliant resource is detected, an alert is sent, typically to an Amazon Simple Notification Service (Amazon SNS) topic.
AWS Config allows you to remediate noncompliant resources using AWS Systems Manager Automation documents. These documents define the actions to be performed on noncompliant resources evaluated by AWS Config rules. AWS Config provides a set of managed automation documents or you can create your own document to meet operational requirements. To apply remediation on noncompliant resources, you choose the remediation action from a prepopulated list or select your own document. You can do this using the AWS Config console or AWS Config API operations.
Using an Automation document allows for consistent remediation of events. Take, for instance, the case of public S3 buckets. You can configure an AWS Config rule to scan for buckets that allow public reads. When it finds a bucket, it can invoke an Automation document to disable public read and raise an alert through Amazon SNS.
However, there are some scenarios where you might want to manually investigate a noncompliant resource. Imagine a scenario where you build and maintain a collection of Amazon Machine Images (AMIs). Because these AMIs have been validated, you want only these AMIs to be used for Amazon Elastic Compute Cloud (Amazon EC2) instances. If someone creates an instance using an unauthorized AMI, you want to stop or shut down the instance and have the owner correct the issue.
In this scenario, you might want to investigate before remediating. For example, the instance might be part of an Auto Scaling group whose launch configuration someone has configured incorrectly. Automatically terminating the instance would prompt the Auto Scaling group to launch a replacement that uses the same unauthorized AMI. Auto-remediation would shut down that instance too, the group would provision another, and the cycle would repeat indefinitely.
Manual intervention would identify the situation and prevent the loop from occurring. You might also have a production server outside an Auto Scaling group, where automatically terminating the instance would cause problems for end users.
This post describes how to create an AWS Config rule to identify EC2 instances that are using unauthorized AMIs. I also show how OpsCenter can be used to track, investigate, and remediate the issue.
Creating an AWS Config rule
To get started, open the AWS Config console. If this is your first time using AWS Config in your Region, choose the Get Started button. If you have already used AWS Config, from the left pane, choose Rules, and then choose Add rule.
Figure 1: Rules page of the AWS Config console
AWS Config creates and maintains a set of rules that meet common scenarios and risks that customers experience on a regular basis. If you have a specific scenario that is not covered by an AWS managed rule, you can create your own custom rule.
Select rule type
In the AWS Config console, on Specify rule type, select Add AWS managed rule. In the search box, enter approved, select approved-amis-by-id, and then choose Next.
Figure 2: Add rule type page of the AWS Config console.
Customize your rule
On the next page, you can enter a name and description for the rule.
Figure 3: The Customize rule section with Name and Description boxes
In the Trigger section, specify details for the rule, including trigger type, scope of changes, and resources. Because this rule applies to EC2 instances only, I’ve specified that the rule only checks EC2 instances. In addition, the rule runs only when the EC2 configuration changes.
Figure 4: Trigger section of the AWS Config console.
Next, enter the list of authorized AMI IDs. You can enter multiple AMIs by using a comma-separated list.
Figure 5: Parameters section of the AWS Config console
Finally, leave Auto remediation set to No, because remediation is done through the OpsItem.
Figure 6: Remediation action for AWS Config rule
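The same rule can also be defined programmatically. The following is a minimal sketch, not the method used in this post: it only builds a request body shaped for the AWS Config PutConfigRule API, using the managed rule identifier APPROVED_AMIS_BY_ID and its amiIds parameter. The function name and the AMI IDs are hypothetical placeholders.

```python
import json


def build_approved_amis_rule(rule_name, ami_ids, description):
    """Return a ConfigRule dict shaped for the AWS Config PutConfigRule API."""
    if not ami_ids:
        raise ValueError("at least one AMI ID is required")
    return {
        "ConfigRuleName": rule_name,
        "Description": description,
        # Owner "AWS" selects a managed rule; APPROVED_AMIS_BY_ID identifies it.
        "Source": {"Owner": "AWS", "SourceIdentifier": "APPROVED_AMIS_BY_ID"},
        # Evaluate EC2 instances only, as in the console walkthrough.
        "Scope": {"ComplianceResourceTypes": ["AWS::EC2::Instance"]},
        # The managed rule takes its AMI list as one comma-separated string.
        "InputParameters": json.dumps({"amiIds": ",".join(ami_ids)}),
    }


# Placeholder AMI IDs, for illustration only.
rule = build_approved_amis_rule(
    "approved-amis-by-id",
    ["ami-0123456789abcdef0", "ami-0fedcba9876543210"],
    "Checks that running instances use approved AMIs.",
)
print(json.dumps(rule, indent=2))
```

Assuming boto3 is installed and credentials are configured, the resulting dictionary could be passed as `boto3.client("config").put_config_rule(ConfigRule=rule)`.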
Viewing a list of rules
The Rules page now lists the new rule. After a few minutes, the console displays its compliance status.
Figure 7: Rules page of the AWS Config console.
Create an Amazon EventBridge rule
The next step is to use Amazon EventBridge to monitor AWS Config for noncompliant resources. Amazon EventBridge is the successor to Amazon CloudWatch Events and provides a near-real-time stream of system events from many AWS services and SaaS applications. You create an Amazon EventBridge rule that connects to a specified source system and receives an event. The event is transformed before delivery to the target system.
Amazon EventBridge provisions all required resources for communication, filters and transforms the event, and provides at-least-once delivery to the target system. It typically takes half a second for Amazon EventBridge to receive an event and transmit it to the target system.
I show you how to create an Amazon EventBridge rule that receives events from AWS Config, transforms them to OpsItems, and transmits the events to OpsCenter as the destination target. This sounds complex, but is simple to configure. The first step is to open the Amazon EventBridge console. Choose Create Rule, and then enter a rule name and description.
Figure 8: Create rule page in the Amazon EventBridge console.
Specify event pattern
In Define Pattern, select Pre-defined pattern by service. Under Service Provider, choose AWS.
Figure 9: Define pattern page in the Amazon EventBridge console.
Filter incoming events
AWS Config generates a large number of events, and without filters you can quickly be overwhelmed by OpsItems. The console allows you to filter by message type, rule name, resource type, and specific resource. As you choose these options, the console builds an event pattern that is used to filter the incoming events.
Figure 10 shows the options for processing a rule. Under Event type, Config Rules Compliance Change is selected for a rule named approved-amis-by-id. Any resource type and Any resource ID are selected. (The resource, in this case, is an EC2 instance.)
Figure 10: Detailed filtering for an Amazon EventBridge rule
Build a custom filter
If you manually build the event pattern, then you can do interesting filtering that isn’t possible with the default options available in the console. The following example only accepts events from the AWS Config rule named approved-amis-by-id where the EC2 instance is NON_COMPLIANT. This creates OpsItems for noncompliant resources and filters out events for compliant resources.
{
  "source": [
    "aws.config"
  ],
  "detail-type": [
    "Config Rules Compliance Change"
  ],
  "detail": {
    "configRuleName": [
      "approved-amis-by-id"
    ],
    "newEvaluationResult": {
      "complianceType": [
        "NON_COMPLIANT"
      ]
    }
  }
}
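To see how this pattern filters events, here is a simplified sketch of the matching logic in Python. It models only exact-value matching on nested fields, which is all this pattern uses; it is not EventBridge's implementation and ignores advanced operators and array-valued event fields. The `matches` and `make_event` helpers and the instance ID are hypothetical.

```python
PATTERN = {
    "source": ["aws.config"],
    "detail-type": ["Config Rules Compliance Change"],
    "detail": {
        "configRuleName": ["approved-amis-by-id"],
        "newEvaluationResult": {"complianceType": ["NON_COMPLIANT"]},
    },
}


def matches(pattern, event):
    """Return True if every field named in the pattern matches the event.

    A list in the pattern means "the event value must equal one of these";
    a dict means "apply the same check to the nested object".
    """
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


def make_event(rule_name, compliance_type):
    """Build a trimmed, illustrative AWS Config compliance-change event."""
    return {
        "source": "aws.config",
        "detail-type": "Config Rules Compliance Change",
        "detail": {
            "configRuleName": rule_name,
            "resourceType": "AWS::EC2::Instance",
            "resourceId": "i-0123456789abcdef0",  # placeholder instance ID
            "newEvaluationResult": {"complianceType": compliance_type},
        },
    }


print(matches(PATTERN, make_event("approved-amis-by-id", "NON_COMPLIANT")))
print(matches(PATTERN, make_event("approved-amis-by-id", "COMPLIANT")))
print(matches(PATTERN, make_event("some-other-rule", "NON_COMPLIANT")))
```

Only the first event passes the filter. The second is dropped because the instance is compliant, and the third because it comes from a different rule.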
Create an input transformer for the OpsItem target
Amazon EventBridge transforms the incoming event into an outgoing event using default mappings. However, in some situations you may want to create your own mapping using an input transformer.
By using the input transformer, you can take advantage of the deduplication logic of the CreateOpsItem API. If you specify a deduplication string, then the built-in logic creates and stores a hash based on the deduplication string and the resource that triggered the OpsItem. If a matching hash is found, then a new OpsItem is not created. With this feature enabled, you have only a single open OpsItem for each noncompliant instance.
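The deduplication idea can be illustrated with a small in-memory sketch. This is a toy model under stated assumptions, not OpsCenter's actual implementation: the hash function (SHA-256 here), the key layout, and the `OpsItemStore` class with its generated IDs are all hypothetical. It only shows why repeated events for the same instance collapse into one OpsItem.

```python
import hashlib


class OpsItemStore:
    """Toy model of open OpsItems keyed by a deduplication hash."""

    def __init__(self):
        self._open_items = {}

    def create(self, dedup_string, resource_id):
        """Return (ops_item_id, created); created is False for a duplicate."""
        # Assumption: the key combines the dedup string and the resource ID.
        key = hashlib.sha256(f"{dedup_string}|{resource_id}".encode()).hexdigest()
        if key in self._open_items:
            return self._open_items[key], False
        ops_item_id = f"oi-{len(self._open_items) + 1:04d}"  # made-up ID format
        self._open_items[key] = ops_item_id
        return ops_item_id, True


store = OpsItemStore()
dedup = "Config-Compliance-EC2-Authorized-AMI"
first = store.create(dedup, "i-0123456789abcdef0")
repeat = store.create(dedup, "i-0123456789abcdef0")
other = store.create(dedup, "i-0fedcba9876543210")
print(first, repeat, other)
```

The repeated event for the first instance returns the existing item instead of creating a new one, while the second instance gets its own OpsItem.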
Under Select event bus, choose AWS default event bus. Under Select target, choose SSM OpsItem. Select the Create a new role for this specific resource option.
Figure 11: Selecting targets for the Amazon EventBridge rule
If you expand the area under Target, you can use the boxes under Input transformer to specify the input path and the input template.
Figure 12: Select targets page where you enter the input path and input template
How to transform the event
The input path is where you reference the elements from the original event. The input template is where you specify the elements of the new event. In this case, the input path references elements of the AWS Config event and the input template includes elements for the OpsItem. If you are interested in extending these examples, you can read more about how to transform target input.
The following text is used for the input path:
{
  "detail": "$.detail",
  "resources": "$.resources",
  "resourceType": "$.detail.resourceType",
  "resourceId": "$.detail.resourceId",
  "configRuleName": "$.detail.configRuleName",
  "complianceType": "$.detail.newEvaluationResult.complianceType"
}
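Each entry in the input path names a variable and the JSON path used to pull its value out of the incoming event. The following sketch mimics that extraction for the simple dotted paths used here. The `resolve` and `extract_variables` helpers are hypothetical, the sample event is trimmed and illustrative, and full JSONPath features such as array indexing are not handled.

```python
INPUT_PATHS = {
    "detail": "$.detail",
    "resources": "$.resources",
    "resourceType": "$.detail.resourceType",
    "resourceId": "$.detail.resourceId",
    "configRuleName": "$.detail.configRuleName",
    "complianceType": "$.detail.newEvaluationResult.complianceType",
}


def resolve(path, event):
    """Resolve a simple dotted JSON path such as "$.detail.resourceId"."""
    if not path.startswith("$."):
        raise ValueError(f"unsupported path: {path}")
    value = event
    for part in path[2:].split("."):
        value = value[part]
    return value


def extract_variables(input_paths, event):
    """Map each variable name to the value found at its path in the event."""
    return {name: resolve(path, event) for name, path in input_paths.items()}


# Trimmed sample event; the instance ID is a placeholder.
sample_event = {
    "source": "aws.config",
    "resources": [],
    "detail": {
        "resourceType": "AWS::EC2::Instance",
        "resourceId": "i-0123456789abcdef0",
        "configRuleName": "approved-amis-by-id",
        "newEvaluationResult": {"complianceType": "NON_COMPLIANT"},
    },
}

variables = extract_variables(INPUT_PATHS, sample_event)
print(variables["resourceId"], variables["complianceType"])
```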
The following text is used for the input template. It creates an OpsItem with a priority of 2 and severity of 1. Here are some other elements to note:
- The OperationalData element adds some interesting elements to the OpsItem.
- The /aws/automations element allows you to associate pre-existing runbooks with this OpsItem.
- The dedupe process creates a hash using the /aws/dedup string and the resourceId, which is the EC2 instance ID.
- The other elements of OperationalData are used to provide context information for this rule.
{
  "title": "EC2 Instance is running an unauthorized AMI",
  "description": "An AWS Config rule detected that an EC2 instance is running an unauthorized AMI.",
  "source": "Config Compliance",
  "category": "Availability",
  "priority": "2",
  "severity": "1",
  "resources": <resources>,
  "detail": <detail>,
  "operationalData": {
    "/aws/automations": {
      "value": "[ { \"automationType\": \"AWS:SSM:Automation\", \"automationId\": \"AWS-TerminateEC2Instance\" }, { \"automationType\": \"AWS:SSM:Automation\", \"automationId\": \"AWS-StopEC2Instance\" } ]"
    },
    "/aws/dedup": {
      "type": "SearchableString",
      "value": "{\"dedupString\":\"Config-Compliance-EC2-Authorized-AMI\"}"
    },
    "complianceType": {"type": "SearchableString", "value": <complianceType>},
    "configRuleName": {"type": "SearchableString", "value": <configRuleName>},
    "resourceType": {"type": "SearchableString", "value": <resourceType>},
    "resourceId": {"type": "SearchableString", "value": <resourceId>}
  }
}
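The angle-bracket placeholders in the template are replaced with the variables captured by the input path. The sketch below is a simplified model of that substitution, assuming each unquoted placeholder is replaced by the JSON encoding of its value. It uses a shortened version of the template, and the `render` helper and the sample values are hypothetical.

```python
import json
import re

# Shortened version of the input template, for illustration only.
TEMPLATE = """{
  "title": "EC2 Instance is running an unauthorized AMI",
  "priority": "2",
  "severity": "1",
  "operationalData": {
    "complianceType": {"type": "SearchableString", "value": <complianceType>},
    "resourceId": {"type": "SearchableString", "value": <resourceId>}
  }
}"""


def render(template, variables):
    """Replace each <name> placeholder with the JSON-encoded variable value."""

    def substitute(match):
        return json.dumps(variables[match.group(1)])

    return re.sub(r"<(\w+)>", substitute, template)


# Placeholder values standing in for what the input path would extract.
variables = {
    "complianceType": "NON_COMPLIANT",
    "resourceId": "i-0123456789abcdef0",
}

ops_item = json.loads(render(TEMPLATE, variables))
print(ops_item["operationalData"]["resourceId"]["value"])
```

Because the rendered text parses as JSON, a quick local check like this can catch a missing comma or an unbalanced brace before the template is pasted into the console.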
Create an Amazon EventBridge rule
Under Input transformer, paste the input path and the input template into their respective boxes, and then create the rule.
Figure 13: Detailed transformation for the Amazon EventBridge rule
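For readers who prefer scripting to the console, the following sketch assembles request bodies shaped for the EventBridge PutRule and PutTargets APIs. It is an illustration under assumptions rather than the procedure used in this post: the helper name, the target Id, the account number, the Region, and both ARNs are placeholders, and the input template is abbreviated.

```python
import json


def build_rule_requests(rule_name, event_pattern, input_paths, input_template,
                        target_arn, role_arn):
    """Return (put_rule_kwargs, put_targets_kwargs) for an EventBridge rule."""
    put_rule = {
        "Name": rule_name,
        # EventBridge expects the pattern as a JSON string.
        "EventPattern": json.dumps(event_pattern),
        "State": "ENABLED",
    }
    put_targets = {
        "Rule": rule_name,
        "Targets": [
            {
                "Id": "opsitem-target",  # arbitrary identifier for this target
                "Arn": target_arn,
                "RoleArn": role_arn,
                "InputTransformer": {
                    "InputPathsMap": input_paths,
                    "InputTemplate": input_template,
                },
            }
        ],
    }
    return put_rule, put_targets


pattern = {
    "source": ["aws.config"],
    "detail-type": ["Config Rules Compliance Change"],
    "detail": {
        "configRuleName": ["approved-amis-by-id"],
        "newEvaluationResult": {"complianceType": ["NON_COMPLIANT"]},
    },
}
paths = {"resourceId": "$.detail.resourceId"}
template = '{"title": "EC2 Instance is running an unauthorized AMI", "source": "Config Compliance"}'

put_rule, put_targets = build_rule_requests(
    "config-unauthorized-ami",
    pattern,
    paths,
    template,
    "arn:aws:ssm:us-east-1:111122223333:opsitem",  # placeholder target ARN
    "arn:aws:iam::111122223333:role/ExampleEventBridgeRole",  # placeholder role
)
print(json.dumps(put_targets, indent=2))
```

Assuming boto3 is available, these dictionaries could be passed as `events.put_rule(**put_rule)` and `events.put_targets(**put_targets)` on a `boto3.client("events")` client.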
Congratulations!
You have set up AWS Config to scan for EC2 instances that are using unauthorized AMIs. When AWS Config detects a noncompliant instance, it creates an event that is received by Amazon EventBridge. An Amazon EventBridge rule creates and sends an OpsItem to OpsCenter. The system dedupes the events, which results in a single open OpsItem for each noncompliant instance.
Using OpsCenter in AWS Systems Manager
OpsCenter in AWS Systems Manager helps you to view, investigate, and resolve operational issues related to your AWS and hybrid cloud deployments. OpsCenter uses OpsItems to present operational issues in a standardized view. The OpsItems provide operational details and contextual information to help you quickly diagnose and remediate the source issue.
Figure 14: OpsItems by source and age section of the AWS Systems Manager console
Viewing OpsItem details
Figure 15 shows the OpsItem that is created for a noncompliant EC2 instance. The Overview tab displays the OpsItem details, including the elements from the input template. Note the values displayed for Deduplication string, Priority, and Severity. Although it is not shown on the console, OpsCenter uses the Deduplication string plus the EC2 instance ID to create the dedupe hash.
Figure 15: View details of OpsItem for noncompliant EC2 instance
The Related resources section displays details about the AWS resource that triggered the AWS Config rule. As expected, there is an EC2 instance. The second entry under Resource ARN is the AWS Config rule. You can choose the resource ARN to view details such as Amazon CloudWatch metrics. You can also navigate to the Amazon EC2 console to view instance details.
Figure 16: Related resources section of the console
Scrolling down, you can find a list of Automation runbooks that allow you to take consistent actions on AWS resources. You can find the runbooks that were specified in the input template. You can use runbooks created by AWS or create your own to meet your operational requirements.
Figure 17: Runbooks section of the console
The Operational data section provides context information on the item and includes elements that were specified in the input transformer. Under complianceType, you can find that the EC2 instance is NON_COMPLIANT.
Figure 18: Operational detail added to OpsItem by Amazon EventBridge transformer
Investigate an issue
To investigate, choose the Related resource link for the instance and open Resource description. You can find that an unauthorized AMI is being used and that the instance is running.
Figure 19: Details for the noncompliant EC2 instance
Remediating an issue
Now that you’ve confirmed the noncompliance of the instance and viewed other relevant details, you are ready to act. The action is to use a predefined runbook to turn off the instance. Using automation helps avoid manual errors and inconsistencies in remediation efforts.
Return to the Overview section where you can find a predefined runbook named AWS-StopEC2Instance. Select the runbook, and then choose Execute.
Figure 20: Runbooks page of the console
To check the progress of the runbook, choose the latest result. In addition to execution details, there is also a Save to operational data button. This provides an audit trail of remediation activity.
Figure 21: Latest automation results for AWS-StopEC2Instance
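The runbook can also be started through the Systems Manager StartAutomationExecution API. The sketch below wraps that call in a small function. To keep it runnable without AWS credentials, it is exercised against a stub client: `FakeSsmClient`, the execution ID it returns, and the instance ID are hypothetical. With boto3, `boto3.client("ssm")` would take the stub's place.

```python
def stop_instance_with_runbook(ssm_client, instance_id):
    """Start the AWS-StopEC2Instance runbook and return its execution ID."""
    response = ssm_client.start_automation_execution(
        DocumentName="AWS-StopEC2Instance",
        # The runbook's InstanceId parameter takes a list of instance IDs.
        Parameters={"InstanceId": [instance_id]},
    )
    return response["AutomationExecutionId"]


class FakeSsmClient:
    """Stub that records calls instead of contacting AWS."""

    def __init__(self):
        self.calls = []

    def start_automation_execution(self, **kwargs):
        self.calls.append(kwargs)
        return {"AutomationExecutionId": "example-execution-id"}


client = FakeSsmClient()
execution_id = stop_instance_with_runbook(client, "i-0123456789abcdef0")
print(execution_id, client.calls[0]["DocumentName"])
```

Note that starting the runbook this way bypasses the OpsItem, so the execution is not recorded in the OpsItem's audit trail the way a console run from the OpsItem is.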
Confirming the issue has been remediated
After the execution of the runbook is complete, open Related Resource. Under State, you can find that stopped is displayed.
Figure 22: Resource description details
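The same confirmation can be scripted with the Amazon EC2 DescribeInstances API, which reports each instance's state name. As a minimal sketch, the function below reads that field from the response. It runs against a stub client so that no credentials are needed: `FakeEc2Client` and the instance ID are hypothetical stand-ins for `boto3.client("ec2")` and a real instance.

```python
def instance_state(ec2_client, instance_id):
    """Return the state name (for example "stopped") of one EC2 instance."""
    response = ec2_client.describe_instances(InstanceIds=[instance_id])
    return response["Reservations"][0]["Instances"][0]["State"]["Name"]


class FakeEc2Client:
    """Stub that returns a canned DescribeInstances-style response."""

    def __init__(self, state_name):
        self._state_name = state_name

    def describe_instances(self, InstanceIds):
        return {
            "Reservations": [
                {
                    "Instances": [
                        {
                            "InstanceId": InstanceIds[0],
                            "State": {"Name": self._state_name},
                        }
                    ]
                }
            ]
        }


state = instance_state(FakeEc2Client("stopped"), "i-0123456789abcdef0")
print("remediated" if state == "stopped" else f"still {state}")
```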
Conclusion
You have learned how to detect noncompliant resources in your AWS environment and now have a process for consistently investigating and remediating risks. I showed you how to create an AWS Config rule to identify EC2 instances that are using unauthorized AMIs. I also showed you how to create an Amazon EventBridge rule that receives events from AWS Config, transforms them into OpsItems, and sends them to OpsCenter as the target.
For more information about AWS Config, see AWS Config best practices. For more information about using Automation documents for operational tasks, see creating Automation documents that run scripts in the AWS Systems Manager documentation.
About the author
Michael Heyd is a Solutions Architect with Amazon Web Services and is based in Vancouver, Canada. Michael works with enterprise AWS customers to transform their business through innovative use of cloud technologies. Outside work he enjoys board games and biking.