Containers

Optimize your container workloads for sustainability

This blog was authored by Karthik Rajendran, Senior Solutions Architect (AWS) and Isha Dua, Senior Solutions Architect (AWS). 

The software architect’s job is mostly one of trade-offs, weighing the considerations of different approaches and then choosing the one that strikes the best balance. Some architects are surprised to find that, in the AWS Cloud at least, architecting for environmental sustainability doesn’t have to mean sacrificing functionality or paying higher bills.

In this blog post, we provide guidance on how to build and run your container images using less storage and fewer compute resources. You will find that you don’t need to remove features or strip functionality to gain these sustainability benefits.

This post assumes a moderate knowledge of containers. If you are new to containers, check out What is Containerization? before continuing. Also, check out the Well-Architected Framework for architecture best practices to follow during the container image build.

Now let’s take a look at how to minimize your environmental impact, while working with containers on AWS.

Build an Optimized Image

In this section, you will learn how to pick your parent image wisely and remove unneeded images from registries, so that you are minimizing the required storage. Then you will see how to minimize the compute resources used when building your images. These steps will lower your overall environmental footprint and improve cost-effectiveness.

Pick your parent image carefully

The energy it takes to power workloads and store data in the cloud impacts your environmental footprint. Large images also take longer to start due to network transfer and decompression time. Minimize your environmental impact by keeping the image size as small as possible.

Consider a Python image built from a standard python:3.11.5 parent image. The Docker Hub definition shows that this image is based on Debian 12. The overall compressed size for the linux/arm64/v8 image is 352MB.

Does your application need all of the software that comes in that Debian image? Does it need a C compiler, for example? If not, you can base your image on an alternative, slimmer parent to decrease its size.

A common choice for a smaller base is the slim variant (in our example, python:3.11.5-slim). Slim excludes many common packages that running containers normally don’t need, like man pages and documentation. In our example, slim has a size of only 48MB. Alpine variants (e.g., python:3.11.5-alpine) are based on the much smaller Alpine Linux distro and have a size of less than 19MB. You can go even further with Distroless images, which eliminate everything except what’s strictly necessary at runtime. A Distroless image is less than 2% of the size of the full Debian image. These differences may seem small by themselves, but they matter at scale.
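
If you want to check these numbers yourself, you can pull a few candidate tags and compare them locally. Here is a minimal sketch using the Docker CLI; locally reported sizes are uncompressed, so they will be larger than the compressed registry sizes quoted above, but the relative ranking is the same.

```bash
# Pull candidate parent images and compare their local (uncompressed) sizes.
docker pull python:3.11.5
docker pull python:3.11.5-slim
docker pull python:3.11.5-alpine

# The slim and alpine variants should be noticeably smaller than the full image.
docker images python --format 'table {{.Repository}}:{{.Tag}}\t{{.Size}}'
```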

Purge unneeded versions from image registries

After images are built, they are stored in registries like Amazon Elastic Container Registry (Amazon ECR). Customers typically store many versions of an image so they can easily roll back if needed. Large projects can produce thousands of container images daily, increasing repository size despite layer optimization.

To address the storage of unneeded or obsolete images, Amazon ECR lifecycle policies can automatically purge container images based on their age or overall count. For example, you could decide to remove images after 60 days, or keep a maximum of 100. Note that Amazon ECR does not know whether an image is still in use, so check an image’s usage with the AWS Command Line Interface (AWS CLI) before enacting lifecycle policies.
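
As a sketch, the AWS CLI commands below first list the images referenced by running tasks in an Amazon ECS cluster, then attach a lifecycle policy that expires untagged images older than 60 days. The cluster and repository names are placeholders.

```bash
# Check which images are referenced by running tasks before enabling expiration.
TASKS=$(aws ecs list-tasks --cluster my-cluster --query 'taskArns[]' --output text)
aws ecs describe-tasks --cluster my-cluster --tasks $TASKS \
  --query 'tasks[].containers[].image'

# Expire untagged images 60 days after they were pushed.
aws ecr put-lifecycle-policy \
  --repository-name my-app \
  --lifecycle-policy-text '{
    "rules": [{
      "rulePriority": 1,
      "description": "Expire untagged images older than 60 days",
      "selection": {
        "tagStatus": "untagged",
        "countType": "sinceImagePushed",
        "countUnit": "days",
        "countNumber": 60
      },
      "action": { "type": "expire" }
    }]
  }'
```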

Minimize compute time while building

The compute resources used to build your images have an environmental impact. By minimizing the time it takes for your images to be built, you optimize for cost and decrease your environmental footprint too.

Continuous integration and delivery (CI/CD) systems are used to automatically build software when commits are made to a source code control system. Some of these systems run on servers that operate continually, even when there is nothing to build. To decrease the environmental footprint of building your images, use a CI/CD system that provisions compute resources, such as Amazon Elastic Compute Cloud (Amazon EC2) instances, only when they are needed.

AWS CodeBuild only runs after it is triggered by a code commit. It spins up an Amazon EC2 instance for the build environment, downloads source code, builds the image, pushes it to Amazon ECR and, finally, terminates the instance. You only pay for those minutes the build is running. Because compute is only provisioned when it’s needed, this approach improves the sustainability posture of your build process.

AWS CodeBuild can also take advantage of container image layer caching. Changing a layer invalidates the cache for that layer and every layer that follows it. In everyday development, the most frequent changes are usually to the final layers, such as your application code, so this caching strategy can significantly decrease build times. Many engineers structure their Dockerfile carefully, placing rarely changing steps first to avoid frequent cache invalidations. Enable Docker layer caching in AWS CodeBuild to minimize the resources used to build your containers.
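
As an example of both ideas, the sketch below orders a Dockerfile so that the dependency layer only rebuilds when requirements.txt changes, and then enables CodeBuild’s local Docker layer cache on an existing project. The file contents and project name are illustrative.

```bash
# Dockerfile ordered so that rarely changing steps come first and stay cached.
cat > Dockerfile <<'EOF'
FROM python:3.11.5-slim
WORKDIR /app
# Dependencies change rarely; copying requirements.txt alone keeps this layer cached.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Application code changes most often, so copy it last.
COPY . .
CMD ["python", "app.py"]
EOF

# Enable the local Docker layer cache on a CodeBuild project (name is a placeholder).
aws codebuild update-project \
  --name my-image-build \
  --cache '{"type": "LOCAL", "modes": ["LOCAL_DOCKER_LAYER_CACHE"]}'
```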

Run your container sustainably

In this section, you will learn how to run your containers sustainably in the AWS Cloud by:

  • Selecting the optimal image type for your clusters
  • Maximizing your compute utilization
  • Identifying best cases for Amazon EC2 Spot Instances
  • Enabling auto scaling
  • Leveraging AWS Lambda in event-driven applications

Pick your Amazon EC2 instance type carefully

There are many ways to run containers in AWS, based on your workload requirements and operational preferences. From a container orchestration perspective, you have options such as Amazon Elastic Kubernetes Service (Amazon EKS) or Amazon Elastic Container Service (Amazon ECS). Both of these services allow you to manage a cluster of Amazon EC2 instances upon which you deploy your containers. Select an instance family and size that is appropriate for your workload, so as to minimize waste.

You could also consider running your containers on AWS Graviton instances. AWS Graviton-based Amazon EC2 instances use up to 60% less energy than comparable EC2 instances for the same performance. They also provide the best price-performance for cloud workloads running on Amazon EC2.

AWS Graviton processors are custom-built by AWS to deliver the best price performance for cloud workloads. They use the Arm (or arm64) instruction set. Most of the container ecosystem supports both x86 and Arm architectures. However, you may have your own application code that needs to be built for Arm. Running the open-source Porting Advisor for Graviton over your source code can help you understand the scope of changes you may need to make.

Many companies have both x86 and Arm-based Amazon EC2 instances in production, and engineering teams still develop on x86-based processors. If this is the case, you can build images for both targets. The easiest way to do this is to build a multi-architecture Docker image, where you have separate architecture-specific images, and a manifest that points to these different images.
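
A minimal sketch with Docker Buildx follows; the registry URI and tag are placeholders, and it assumes your builder can target both platforms (natively or through emulation).

```bash
# One-time setup: create and select a builder that can target multiple platforms.
docker buildx create --use

# Build for x86 and Arm, and push a manifest list that points to the
# architecture-specific images.
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  --tag 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest \
  --push .
```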

Maximize your utilization with AWS Fargate

When you manage your own nodes with Amazon ECS and Amazon EKS, you need to maintain a collection of the running instances that comprise the cluster, and also distribute running containers. For clusters supporting multiple tasks of differing sizes or for workloads with a lot of churn, this can be a tricky bin-packing problem to solve.

AWS Fargate is a serverless compute engine for your containers. When you launch a container with AWS Fargate, you specify the needs of the task, such as the number of virtual CPUs (vCPUs), the amount of memory, and the amount of on-container storage, and Fargate handles the compute capacity for your container. Fargate also enables higher AWS server utilization by allowing precise allocation of compute resources to each container’s specific needs.

It’s important to set both hard and soft memory requirements, and to carefully specify the vCPUs your workload needs. Fargate reserves capacity for your container based on these settings, so the more tightly you scope them, the more efficiently and cost-effectively your workload will run. If you are unsure about how much CPU, memory, and storage you need, use AWS Compute Optimizer.
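
As a sketch, the task definition below requests right-sized CPU and memory at the task level and sets a hard limit (memory) and a soft limit (memoryReservation) on the container. The family, image, and sizes are illustrative; adjust them to your workload.

```bash
# Register a Fargate task definition with tightly scoped CPU and memory.
aws ecs register-task-definition \
  --family my-api \
  --requires-compatibilities FARGATE \
  --network-mode awsvpc \
  --cpu 512 \
  --memory 1024 \
  --container-definitions '[{
    "name": "api",
    "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest",
    "essential": true,
    "memory": 1024,
    "memoryReservation": 512
  }]'
```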

Use Spot Instances when possible

If you can be flexible about when your containers run, or if they are fault-tolerant and interruptible, then consider using Amazon EC2 Spot Instances. Spot Instances let you take advantage of unused EC2 capacity, which helps AWS improve data center utilization. They provide discounts of up to 90% compared to On-Demand prices, so stateless, fault-tolerant, or flexible workloads can run with significant cost savings.

The tradeoff to using EC2 Spot Instances is that, in exchange for these deep discounts, your instances may be reclaimed by the AWS Cloud whenever it needs the capacity back. Each instance gets a two-minute warning prior to interruption. After receiving this signal, your container can stop accepting new requests, close any open resources, and do anything else needed to responsibly exit.
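
One way to detect the warning from inside the instance is to poll the instance metadata service for a Spot interruption notice. A minimal sketch using IMDSv2:

```bash
# Request an IMDSv2 session token, then check for a Spot interruption notice.
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")

curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/spot/instance-action
# This returns 404 until the two-minute warning is issued; afterwards it returns
# a JSON document with the action and time, which is your signal to drain and exit.
```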

If you are processing data in chunks, and you have some flexibility as to when the overall job needs to be completed, then EC2 Spot Instances and the AWS Batch service can make an excellent match.

Auto scale to minimize waste

For the lowest possible environmental footprint, consider minimizing the number of running containers that are not doing useful work. AWS Auto Scaling can automatically add compute resources to your workload or remove them as demand changes, minimizing your resource consumption and cost.

For AWS Fargate-based workloads, auto scaling is especially easy. You set a metric that you want to watch, the minimum and maximum number of tasks you want, and some parameters that describe how you want the scaling accomplished, and Service Auto Scaling takes care of the rest.

Let’s say your containers provide the backend API for a shopping site. For Amazon ECS, you use ECS Cluster Auto Scaling for the instances and ECS Service Auto Scaling for the tasks, and you must configure both properly. When scaling out, ECS Service Auto Scaling adds tasks as needed until the existing capacity is filled. ECS Cluster Auto Scaling then notices that the cluster is fully subscribed and adds instances. Once these instances are ready and registered with the cluster, ECS Service Auto Scaling can place the new tasks. When demand drops, ECS Service Auto Scaling terminates tasks until the average utilization is back within the range you have set. Because you are only consuming compute to meet your actual need, you are running your containers sustainably.
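
As a sketch, here is a target-tracking policy for the API service described above, configured through Application Auto Scaling (which is what ECS Service Auto Scaling uses under the hood). The cluster, service, and threshold values are illustrative.

```bash
# Allow the service to run between 2 and 20 tasks.
aws application-autoscaling register-scalable-target \
  --service-namespace ecs \
  --scalable-dimension ecs:service:DesiredCount \
  --resource-id service/my-cluster/my-api \
  --min-capacity 2 \
  --max-capacity 20

# Keep average CPU utilization of the service around 60 percent.
aws application-autoscaling put-scaling-policy \
  --service-namespace ecs \
  --scalable-dimension ecs:service:DesiredCount \
  --resource-id service/my-cluster/my-api \
  --policy-name my-api-cpu-target-tracking \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration '{
    "TargetValue": 60.0,
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
    }
  }'
```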

If you are using Amazon EKS, you have several choices: Karpenter, Cluster Autoscaler, the Horizontal Pod Autoscaler (HPA), or the Vertical Pod Autoscaler (VPA). Consider using Karpenter if you are looking to launch right-sized Amazon EC2 instances in response to unschedulable pods. This keeps your compute usage tracking your workload needs very closely, limiting your environmental footprint and minimizing cost at the same time. You will still need to manage and set the right vCPU and memory requests, among other responsibilities described under the AWS shared responsibility model.
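
As one small example of the pod-level piece, the Horizontal Pod Autoscaler can be configured with a single kubectl command; Karpenter or Cluster Autoscaler then handles the node-level scaling. The deployment name and thresholds are illustrative, and the cluster needs metrics-server installed and CPU requests set on the pods.

```bash
# Scale the deployment between 2 and 20 replicas, targeting 60% average CPU utilization.
kubectl autoscale deployment my-api --cpu-percent=60 --min=2 --max=20

# Watch the autoscaler's current and target metrics.
kubectl get hpa my-api --watch
```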

Consider Lambda for event-driven workloads

If your workload is event-driven, consider running your containers on AWS Lambda. This serverless technology executes containers in response to triggers like HTTP requests, Amazon Simple Storage Service (Amazon S3) uploads, or messages arriving in Amazon Simple Queue Service (Amazon SQS) queues. You are charged based on the memory you allocate to the function (which also determines the vCPU available to it) and the duration your workload runs.

As no resources are consumed when idle, Lambda eliminates over-provisioning, enabling sustainable computing.

To use AWS Lambda for your workloads, you will have to make some changes to the Dockerfile for your container and modify the way it runs. Depending on what your container executes and how it is built, you can choose a specific AWS Lambda base image and provide a different container entry point. You will also need to connect your AWS Lambda to your event sources to run at the right time.
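
A minimal sketch of such a Dockerfile, built on an AWS Lambda Python base image, is shown below. The handler module and function name (app.handler) and the dependency file are illustrative.

```bash
# The Lambda base image provides the runtime interface client; the CMD names
# the handler as <module>.<function>.
cat > Dockerfile <<'EOF'
FROM public.ecr.aws/lambda/python:3.11
COPY requirements.txt ${LAMBDA_TASK_ROOT}/
RUN pip install --no-cache-dir -r ${LAMBDA_TASK_ROOT}/requirements.txt --target ${LAMBDA_TASK_ROOT}
COPY app.py ${LAMBDA_TASK_ROOT}/
CMD ["app.handler"]
EOF
```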

Running your containers in Lambda can be a good choice for workloads that are already part of an event-driven architecture, such as processing work that shows up in queues. But for steady-state applications, or for workloads where a single event takes longer to complete than AWS Lambda’s fifteen-minute limit, it may not be a good fit. Consider AWS Batch for these long-running workloads.

On Amazon EKS, you can also use KEDA (Kubernetes Event-driven Autoscaling) to scale applications based on events.

Conclusion

Running a workload sustainably does not mean sacrificing business value – it means producing business value with the smallest possible impact on the environment. Architecting your solutions to reduce their environmental impact can often lead to cost optimization as well.

In this article, we shared several ways that engineers can build efficient containers and run them sustainably on AWS by minimizing image size, leveraging serverless computing, and right-sizing infrastructure:

  • Pick your parent image carefully.
  • Remove unneeded versions from image registries.
  • Minimize compute time spent building.
  • Pick your instance type carefully.
  • Maximize your utilization with AWS Fargate.
  • Use EC2 Spot Instances when possible.
  • Use AWS Auto Scaling to minimize waste.
  • Consider AWS Lambda for event-driven workloads.

Call to Action

Ready to make your container workloads more sustainable and cost-effective? Take the first step towards optimizing your AWS infrastructure today. Here are some actionable ways to get started: