Automated VPC prefix list population for cross-Region and in-Region security group referencing
AWS customers regularly use the ability to reference another security group in the same Amazon Virtual Private Cloud (VPC), or a peered VPC in the same Region, as a dynamic reference. This ability allows customers who have highly ephemeral workloads to adopt the practice of least privilege more easily. We do not currently support security group referencing with cross-Region VPC peering, or when traversing an AWS Transit Gateway (TGW). If you use more specific routing in a VPC to inspect traffic between subnets using AWS Gateway Load Balancer or AWS Network Firewall, you cannot use security group referencing.
In 2020, AWS released VPC prefix lists, which allow customers to create a list of CIDR blocks that can be referenced in security groups, VPC route tables, and TGW route tables. In this post, we show how to use AWS Lambda to automatically synchronize the private IP addresses of the Elastic Network Interfaces (ENIs) associated with a security group to a prefix list. This helps customers use prefix lists to reference the IP addresses of ENIs in a security group that cannot be referenced directly for the reasons described above. We released this code on GitHub with an open source license. (If you would like to skip ahead, you can deploy the solution using the AWS CloudFormation template found in our GitHub repository.)
Alternatively, you could use subnet CIDRs or subnet CIDR reservation ranges in your prefix list. With subnet CIDR reservations, you set aside a range of IPv4 or IPv6 addresses that AWS will not automatically assign when creating new ENIs. When you set aside the range as an explicit reservation, you must manually assign IP addresses to ENIs to consume addresses from this range. You can then add these CIDRs to a prefix list manually and reference the prefix list in any security group that cannot reference the originating security group directly. Should you need to add or remove CIDRs in the future, you can do so in the single prefix list instead of updating each security group that references it.
Overview of solution
Our solution allows you to specify a security group and an AWS Region. It creates a prefix list in the specified Region, then creates a mapping that automatically synchronizes the private IPs on the ENIs associated with the specified security group to that prefix list. The solution automatically scales the prefix list's max entries value using the recently released prefix list resize feature. The prefix list can be created in the same Region as the security group or in a different one, but it must be created in the same account. If you wish to use it across accounts, you can use AWS Resource Access Manager (RAM) to share the prefix list(s) with other accounts.
Our solution starts by creating an S3 bucket and copying the Lambda deployment packages to it. Then, it creates the three Lambda functions that the solution uses. It also creates an SNS topic and subscribes the email address specified in the CloudFormation parameters, so that a notification is sent if any critical errors or warnings occur during synchronization. Finally, it creates an Amazon EventBridge scheduled rule that initiates batch synchronizations at a regular interval. The scheduled rule is created in a disabled state to prevent charges for Lambda invocations before any security groups are configured for sync.
Walkthrough
Let’s start off by walking through automated deployment. After that, we will get into the nitty-gritty of how the solution works as shown in figure 1.
Launching the Stack
- Start by navigating to CloudFormation in the AWS Console in the account and Region where your security group resides
- Click “Create Stack” and paste the S3 URL for the AutoSG2PL.yaml template
- Give your stack a name in the “Stack Name” Field
- Fill in or change the parameters as desired
  - PercentSafeThresholdSGQuota: a whole number representing the percentage difference to maintain between the prefix list size and the max rules per security group quota. If the prefix list exceeds this threshold, we send a notification using the SNS topic that was created. We also use this value when calculating the new size during a prefix list resize.
  - BaseSafeThresholdSGQuota: a base number for the difference between the prefix list size and the max rules per security group quota. For large environments, you can leave this number low, but for smaller environments, where a percentage may not be a significant enough buffer, it provides additional headroom.
  - SNSTopicName: the display name for the SNS topic created. Any emails sent with warnings are shown as coming from this name.
  - SNSRecipientEmail (required): the email address that is subscribed to the SNS topic to receive warning messages when a synchronization job fails. If you wish to subscribe more than one address, add the others later in the SNS Console or via the AWS CLI.
  - LogLevel: the initial log level set on each of the Lambda functions. You can update this later in the function environment variables. 1 logs all messages (INFO, WARN, CRITICAL); 2 (default) logs only warning and critical messages; 3 logs only critical messages. We store CloudWatch Logs in the default streams created by the Lambda functions.
  - AutoInvokeInterval: the interval at which the Bulk-Batch-Initiator Lambda function is invoked.
- Click Next
- Add any tags you wish to add to the stack and its resources
- Click Next
- Read and acknowledge any notices at the bottom
- Click Create Stack
Using the solution
- To onboard a new security group to prefix list mapping for a Region, invoke the AutoSG2PL-OnBoard Lambda function with the following payload, replacing the information in brackets:
{ "sg": "<security group ID>", "region": "<prefix list Region>" }
- To synchronize all configured security group and prefix list pairs, invoke the AutoSG2PL-Bulk-Batch-Initiator Lambda function with an empty payload. This will invoke the Batch-Sync function for each mapping.
- To synchronize a single security group to prefix list mapping, invoke the AutoSG2PL-Batch-Sync Lambda function with the payload that follows, replacing the information in brackets:
{ "sg": "<security group ID>", "pl": "<prefix list ID>", "region": "<prefix list Region>" }
- To begin automatic synchronization of all configured security group to prefix list mappings, navigate to Amazon EventBridge in the console (you can find a link in the CloudFormation Stack Outputs tab). Select the rule associated with the stack, which is named <CloudFormation Stack Name>-ScheduledRule-<random string>, and click Enable.
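The invocations above can also be scripted. The following is a minimal sketch, assuming the functions are deployed under the names used in this post and that a Boto3 Lambda client is passed in. The helper names (onboard_payload, batch_sync_payload, invoke) and the example IDs are hypothetical and not part of the published solution.

```python
import json


def onboard_payload(sg_id: str, region: str) -> bytes:
    """Build the JSON payload for the OnBoard function."""
    return json.dumps({"sg": sg_id, "region": region}).encode("utf-8")


def batch_sync_payload(sg_id: str, pl_id: str, region: str) -> bytes:
    """Build the JSON payload for the Batch-Sync function."""
    return json.dumps({"sg": sg_id, "pl": pl_id, "region": region}).encode("utf-8")


def invoke(lambda_client, function_name: str, payload: bytes = b"{}") -> int:
    """Invoke a Lambda function synchronously and return the HTTP status code.

    lambda_client is expected to behave like boto3.client("lambda").
    The default payload is an empty JSON object, which is what the
    Bulk-Batch-Initiator function is invoked with.
    """
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",
        Payload=payload,
    )
    return response["StatusCode"]


# Example wiring (requires the third-party boto3 package and AWS credentials;
# the IDs and Regions below are placeholders):
#   import boto3
#   client = boto3.client("lambda", region_name="us-east-1")
#   invoke(client, "AutoSG2PL-OnBoard",
#          onboard_payload("sg-0123456789abcdef0", "us-west-2"))
#   invoke(client, "AutoSG2PL-Bulk-Batch-Initiator")
```

Passing the client in as an argument keeps the helpers easy to test without AWS access. Check the deployed function names in your stack before relying on the literals shown here.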
How it Works
The CloudFormation Stack and Resources
Using the CloudFormation template, we create:
- An SNS topic used to send messages about critical errors that occur during execution of the Lambda Functions.
- An S3 bucket to store the Lambda deployment package ZIPs. This allows the solution to be deployed in any AWS Region.
- An IAM role for the Lambda Function that will copy the ZIPs to the bucket created above.
- The Lambda Function that copies the ZIP files to the bucket created. It uses Boto3 to list the objects in the source bucket, then uses the Boto3 S3 resource object to copy each object to the newly created bucket.
- A CloudFormation Custom Resource which executes the Lambda Function above to populate the bucket with the ZIPs we will need in the following steps.
- The Batch Sync Lambda with all the Environment Variables needed, and the associated IAM role with an inline policy.
- The OnBoarding function with its Environment Variables and associated IAM role. This function references the Batch Sync function, so it must be created after it.
- The Bulk Batch Sync Initiator Function with its Environment Variables and associated IAM role. The Bulk Batch Initiator also has references to the Batch Sync function.
- The EventBridge rule that starts the Bulk Batch Initiator function at the desired interval, and the Lambda permission necessary for the EventBridge rule to invoke it. We create the rule in a disabled state to prevent undesired charges from the Bulk Batch Initiator function running before any mappings have been configured for it to synchronize. The user must enable the EventBridge rule manually.
- The CloudWatch Alarms that monitor for failed invocations of the Batch Sync and Bulk Batch Initiator functions and alert via the SNS topic created earlier.
OnBoarding
We designed the OnBoarding Lambda function to create the prefix list in the Region that the specified security group will synchronize to, and then to create a parameter in Parameter Store that saves the mapping of the security group, prefix list, and Region combination. Finally, the OnBoarding function conducts an initial sync, either as part of creating the prefix list if there are fewer than 100 IP addresses, or by triggering the Batch Sync function if there are more than 100 IP addresses. The function checks that the specified security group exists in the same Region and account as the solution, and that there is not already a mapping for the security group / Region pair.
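The onboarding decision described above can be sketched as a small planning function. This is an illustration only, not the code from the repository: the function name, the shape of existing_mappings, and the choice to treat exactly 100 addresses as an inline sync are all assumptions.

```python
from typing import Dict, Iterable, List, Set, Tuple, Union

# Assumption: up to this many entries are supplied when the prefix list is
# created; anything larger is handed to the Batch Sync function instead.
INLINE_ENTRY_LIMIT = 100


def plan_onboarding(
    sg_id: str,
    region: str,
    private_ips: Iterable[str],
    existing_mappings: Set[Tuple[str, str]],
) -> Dict[str, Union[List[str], bool]]:
    """Decide how the initial sync for a new mapping would be performed.

    existing_mappings holds (security group ID, Region) pairs that are
    already onboarded; a duplicate pair is rejected.
    """
    if (sg_id, region) in existing_mappings:
        raise ValueError(f"mapping already exists for {sg_id} in {region}")

    # Each private IP becomes a /32 entry; the set removes duplicates.
    cidrs = sorted({f"{ip}/32" for ip in private_ips})

    if len(cidrs) <= INLINE_ENTRY_LIMIT:
        # Small enough to include when the prefix list is created.
        return {"initial_entries": cidrs, "invoke_batch_sync": False}

    # Too many for one call: create the list empty and let Batch Sync fill it.
    return {"initial_entries": [], "invoke_batch_sync": True}
```

For a security group with two ENI addresses, the plan carries both /32 entries inline. With 150 addresses, it returns an empty entry list and flags that Batch Sync should run.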
Bulk-Batch-Initiator
The Bulk Batch Initiator is a simple function that runs on a schedule. It searches Parameter Store by path for any configured mappings, loops through and parses the mappings to form the necessary JSON, and then invokes the Batch Sync function for each mapping with the associated JSON payload. You will see there is a 1-second sleep timer between invocations. We added this to reduce the number of requests per second to the EC2 API. For larger environments, with many mappings per Region or other workloads that are heavy users of the EC2 API, you may want to increase the sleep time to avoid throttling. As with all solutions, test this in non-production environments to make sure there are no unintended effects.
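The loop described above can be sketched as follows. This is a simplified illustration, assuming the mappings have already been read from Parameter Store and parsed into dictionaries. The function name run_bulk_sync and the injected invoke_batch_sync callable are hypothetical stand-ins for the real Lambda invocation.

```python
import json
import time
from typing import Callable, Dict, Iterable


def run_bulk_sync(
    mappings: Iterable[Dict[str, str]],
    invoke_batch_sync: Callable[[str], None],
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Invoke Batch Sync once per mapping, pausing between invocations.

    Returns the number of invocations made. The pause spreads out the
    EC2 API calls that each Batch Sync run makes.
    """
    invoked = 0
    for mapping in mappings:
        if invoked:
            # Pause only between invocations, not before the first one.
            sleep(delay_seconds)
        payload = json.dumps(
            {"sg": mapping["sg"], "pl": mapping["pl"], "region": mapping["region"]}
        )
        invoke_batch_sync(payload)
        invoked += 1
    return invoked
```

Raising delay_seconds is the equivalent of lengthening the sleep timer for environments with many mappings.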
Batch-Sync
The Batch Sync function is the main function of this solution. It adds any IP addresses that are associated with the security group but missing from the prefix list, and removes any IP addresses in the prefix list that are no longer associated with the security group. It takes the maximum number of prefixes into account when updating a prefix list and splits the update into multiple calls if it must add or remove more than 100 prefixes. Otherwise, it consolidates the changes to reduce the number of API calls. It automatically resizes the max entries value of the prefix list to accommodate the number of IPs required, and it handles errors that occur during prefix list resize events. It also sends a warning when the max entries value of the prefix list approaches the safe threshold configured for the max rules per security group quota.
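The core of the synchronization is a set difference followed by batching. Here is a minimal sketch of that logic, assuming a limit of 100 changes per call as described above. The function names are illustrative and the code is not taken from the repository.

```python
from typing import Iterable, Iterator, List, Sequence, Tuple

MAX_CHANGES_PER_CALL = 100  # assumption: batch size per update call


def diff_entries(
    sg_private_ips: Iterable[str], prefix_list_cidrs: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Return (to_add, to_remove) so the prefix list matches the security group."""
    desired = {f"{ip}/32" for ip in sg_private_ips}
    current = set(prefix_list_cidrs)
    to_add = sorted(desired - current)      # on an ENI but not in the list
    to_remove = sorted(current - desired)   # in the list but no longer on an ENI
    return to_add, to_remove


def batches(items: Sequence[str], size: int = MAX_CHANGES_PER_CALL) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])
```

Each batch could then be passed to an EC2 call such as Boto3's modify_managed_prefix_list, which accepts AddEntries and RemoveEntries. When both lists are small, they can go into the same call, which is the consolidation the paragraph above refers to.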
Prerequisites
For this walkthrough, you need the following:
- An AWS account
- The security group ID(s) you wish to synchronize with prefix lists
- Permissions to deploy the resources in the CloudFormation template
- A quota increase for “Number of Rules per Security Group” that matches the maximum number of IP addresses in the prefix list multiplied by the number of times you plan to specify the prefix list in any security group. For example, if the prefix list has 30 IP addresses and the security group allows TCP ports 80 and 443 from that prefix list, the quota should be greater than 60 to allow some room to grow.
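The quota arithmetic in the last prerequisite can be written out as a short helper. The function names are hypothetical, and the numbers reproduce the example above.

```python
def rules_consumed(prefix_list_entries: int, rule_references: int) -> int:
    """Rules counted against the quota: entries times rules that reference the list."""
    return prefix_list_entries * rule_references


def quota_has_headroom(quota: int, prefix_list_entries: int, rule_references: int) -> bool:
    """True when the quota is strictly greater than the rules consumed."""
    return quota > rules_consumed(prefix_list_entries, rule_references)


# 30 addresses referenced by two rules (TCP 80 and TCP 443) consume 60 rules.
consumed = rules_consumed(30, 2)
```

A quota of exactly 60 would leave no room to grow in this example, which is why the check is strict.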
Considerations
- When you launch the CloudFormation stack, the EventBridge rule that starts the Bulk Batch Initiator function is created in a disabled state to avoid unwanted charges. Make sure to enable the rule after creating your first mapping with the OnBoarding function, so that the prefix lists stay in sync with security group membership.
- We do not support IPv6 in this solution today.
- IP addresses are added to the prefix list as /32 CIDRs. Our solution does not summarize adjacent addresses into larger CIDR blocks, which is suboptimal for a large set of IP addresses.
- Inserting CIDRs with a prefix length other than /32 into the prefix list can cause unintended effects with this solution. We highly recommend that you do not update the prefix lists manually.
- This solution currently runs as a batch update on a schedule, which means there is a delay between an update to an ENI regarding IP association to a security group and the prefix list being updated.
- When using a prefix list in a security group, we evaluate the packet to ensure that the source IP address matches a CIDR within the prefix list. Care should be taken with access to create or modify routes that could cause unintended access based on source IP filtering.
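To make the /32 considerations above concrete, the sketch below uses Python's standard ipaddress module. It shows how host entries are formed, how entries that are not /32 could be detected, and what summarization would look like if it were applied. It illustrates the concepts only. The solution itself does not summarize, and the function names are hypothetical.

```python
import ipaddress
from typing import Iterable, List


def host_cidrs(private_ips: Iterable[str]) -> List[str]:
    """Format each IPv4 address as a /32 CIDR, validating it on the way."""
    return [str(ipaddress.IPv4Network(f"{ip}/32")) for ip in private_ips]


def non_host_entries(cidrs: Iterable[str]) -> List[str]:
    """Return entries whose prefix length is not /32 (for example, manual edits)."""
    return [
        cidr
        for cidr in cidrs
        if ipaddress.IPv4Network(cidr, strict=False).prefixlen != 32
    ]


def summarized(cidrs: Iterable[str]) -> List[str]:
    """Collapse adjacent networks into the fewest covering CIDR blocks."""
    networks = (ipaddress.IPv4Network(cidr) for cidr in cidrs)
    return [str(net) for net in ipaddress.collapse_addresses(networks)]
```

Four consecutive addresses starting at 10.0.0.0 take four /32 entries in the prefix list. Summarized, they would collapse to the single block 10.0.0.0/30.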
Cleaning up
To remove a single security group to prefix list mapping:
- Go to the parameter store console and delete the parameter for the security group and Region pair. This will stop future synchronizations from happening for that pair
- Go to the VPC Console in the Region where the prefix list is and delete the prefix list
- If you have deleted your last mapping, deactivate any EventBridge automation to invoke the BulkBatchInitiator Lambda function or you will be charged for the invocations
To remove the entire solution and avoid incurring future charges:
- Follow the steps above for each mapping that exists
- Delete the CloudFormation Stack that was launched for this solution
- Make sure all Lambda Functions, CloudWatch Alarms, parameter store parameters, the S3 Bucket and the SNS Topic that were associated were successfully deleted
Conclusion
With the code and instructions we provided, you can now use prefix lists to reference IP addresses associated with a security group when crossing Regions, Transit Gateways, Gateway Load Balancers, or Network Firewalls. Deploy the solution in your test environment today to see how it works for you, and continue adopting the principle of least privilege.