Getting started with Amazon EKS Auto Mode
This post is co-authored by Alex Kestner (Sr. Product Manager, Amazon EKS), Ashley Ansari (Sr. Product Marketing Manager), Robert Northard (Principal GTM SSA Containers), and Sheetal Joshi (Principal Solution Architect, Containers).
Introduction
We announced the general availability of Amazon Elastic Kubernetes Service (Amazon EKS) Auto Mode, a new capability that streamlines Kubernetes cluster management for compute, storage, and networking. You can now get started quickly, improve performance, and reduce overhead by offloading cluster management to AWS, freeing you to focus on building applications that drive innovation.
Amazon EKS Auto Mode streamlines Kubernetes cluster management by automatically provisioning infrastructure, selecting optimal compute instances, dynamically scaling resources, continually optimizing compute costs, patching operating systems (OS), and integrating with AWS security services. When enabled, EKS Auto Mode configures cluster capabilities with AWS best practices built in, making sure that clusters are ready for application deployment.
In this post, we cover the high-level architecture of EKS Auto Mode and provide a walkthrough for deploying a highly available, auto-scaled, sample application with EKS Auto Mode.
What’s new?
Amazon EKS has long been trusted as a secure way to run Kubernetes. Before EKS Auto Mode, despite the managed control plane, users still needed dedicated expertise and ongoing time investment to manage the infrastructure required to run production-grade Kubernetes applications. They had to perform ongoing maintenance activities, from selecting and provisioning the right Amazon Elastic Compute Cloud (Amazon EC2) instances to optimize resource usage and cost, to installing and maintaining plug-ins, all while staying on top of cluster upgrades and OS patching to keep infrastructure secure and up-to-date, as shown in the following figure.
Fully automated cluster operations mean that EKS Auto Mode reduces the need for specialized knowledge to manage production-grade Kubernetes infrastructure, saving users significant time and effort. Users no longer need to spend time and resources on selecting and provisioning EC2 instances, optimizing resources and costs, and maintaining plugins.
When you create a new EKS cluster with EKS Auto Mode enabled, or enable it on an existing cluster, Amazon EKS automatically deploys essential controllers for compute, networking, and storage capabilities inside Amazon EKS-owned AWS accounts and Amazon EKS-managed VPCs, along with the managed Kubernetes control plane infrastructure.
When you deploy your applications, EKS Auto Mode automatically launches EC2 instances based on Bottlerocket OS, provisions AWS Elastic Load Balancing (ELB) load balancers, and creates Amazon Elastic Block Store (Amazon EBS) volumes inside your AWS account and your VPC. EKS Auto Mode launches and manages the lifecycle of these EC2 instances, scaling and optimizing the data plane as application requirements change at runtime, and automatically replacing any unhealthy nodes. It provides managed infrastructure without abstracting away the depth and breadth of Amazon EC2 capabilities, as shown in the following figure.
With EKS Auto Mode, node capabilities that once ran as Kubernetes DaemonSets run as system processes managed by AWS. These include components such as service discovery, service load balancing, pod networking, block storage, and credential vending. AWS takes on the lifecycle management of these components, such as applying security fixes, and publishes new versions of the EKS Auto Mode Amazon Machine Image (AMI) that include updated components for each supported Kubernetes version.
Furthermore, EKS Auto Mode handles cluster upgrades and OS updates automatically by gracefully replacing nodes, while respecting the Kubernetes scheduling constraints you define, so your infrastructure remains secure and up-to-date. This significant reduction in operational overhead allows teams to focus on application development rather than infrastructure management.
Getting started
EKS Auto Mode is now available for new and existing EKS clusters running version 1.29 and above. To get started, you can use the new console ‘Quick Configuration’ experience, which provides a one-click getting-started flow that quickly launches a cluster with sensible defaults pre-configured. Alternatively, you can use the Amazon EKS API, AWS Management Console, eksctl, or your preferred infrastructure as code (IaC) tooling.
In this section, we demonstrate how EKS Auto Mode streamlines deploying applications on Amazon EKS. We begin by creating an EKS cluster with EKS Auto Mode enabled, then deploy a sample retail store application. You can see how EKS Auto Mode automatically launches new nodes, sets up AWS Load Balancers, manages persistent storage requirements, and handles the application’s auto-scaling needs.
Prerequisites
The following prerequisites are necessary to complete the steps mentioned in this post:
- An AWS account: For this post, we assume you already have an AWS account with admin privileges.
- Install the following tools: Helm 3.9+, kubectl, eksctl, and AWS Command Line Interface (AWS CLI).
Creating cluster
For the purpose of this post, we use eksctl, a command-line utility tool that helps with the quick creation of an EKS cluster. The following example configuration uses eksctl to automatically generate cluster subnets for cluster infrastructure and application deployment. If you aren’t using the sample configuration, then refer to the Amazon EKS user guide for a complete list of prerequisites. These prerequisites include changes to cluster IAM roles and node IAM roles, which provide new permissions for EKS Auto Mode to manage EC2 instances in your account.
We’re enabling EKS Auto Mode with built-in managed NodePools for general-purpose and system workloads. The general-purpose NodePool supports launching general-purpose workloads, while the system NodePool handles add-ons. Both use On-Demand EC2 instances (generation 5 or newer) from the C, M, and R families with amd64 architecture. For more details on built-in NodePools, refer to the EKS Auto Mode user guide.
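The following is a minimal sketch of an eksctl cluster configuration with EKS Auto Mode enabled; the cluster name and Region are illustrative placeholders, so adjust them for your environment:

```bash
# Minimal eksctl config enabling EKS Auto Mode with the built-in NodePools.
cat <<'EOF' > cluster.yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: automode-cluster   # placeholder name
  region: us-west-2        # placeholder Region

autoModeConfig:
  enabled: true
  nodePools: ["general-purpose", "system"]
EOF

eksctl create cluster -f cluster.yaml
```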
Wait for the cluster state to become Active.
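For example, you can poll the cluster status with the AWS CLI (the cluster name matches the sketch above):

```bash
# The API reports the status as ACTIVE when the cluster is ready
aws eks describe-cluster --name automode-cluster --query "cluster.status" --output text
```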
The cluster is now ready for applications to be deployed. In the next section we demonstrate how EKS Auto Mode streamlines application deployment.
Deploying application
We use a sample retail store application where users can browse a catalog, add items to their cart, and complete an order through the checkout process. The application has several components, such as UI, catalog, orders, carts, and checkout services, along with a backend database that needs persistent block storage, modeled as Kubernetes Deployments and StatefulSets. We use a Kubernetes Ingress to access the application from outside the cluster and configure the catalog application to use Amazon EBS persistent storage. To demonstrate how EKS Auto Mode improves performance, provides scalability, and enhances availability, we configure the UI application to support Horizontal Pod Autoscaling, Pod Topology Spread Constraints, and Pod Disruption Budgets (PDBs).
Before deploying the application, check the cluster state.
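For example, assuming kubectl is already pointing at the new cluster:

```bash
# List nodes and all pods across namespaces
kubectl get nodes
kubectl get pods -A
```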
The node and pod lists are empty, because EKS Auto Mode runs core add-ons as AWS-managed system processes rather than as cluster pods.
Before proceeding with application deployment, create a StorageClass and an IngressClass. This setup makes sure that the necessary infrastructure configurations are in place to support storage and ingress requirements for applications that are deployed later. Typically, this is a step performed by the platform team once after the cluster is created.
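The following sketch shows one way to create these resources against the EKS Auto Mode storage and load balancing capabilities; the eks-auto-alb IngressClass name is illustrative, while eks-auto-ebs-csi-sc matches the StorageClass referenced later in this post:

```bash
kubectl apply -f - <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: eks-auto-ebs-csi-sc
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.eks.amazonaws.com   # EKS Auto Mode block storage capability
volumeBindingMode: WaitForFirstConsumer
parameters:
  type: gp3
  encrypted: "true"
---
apiVersion: eks.amazonaws.com/v1
kind: IngressClassParams
metadata:
  name: eks-auto-alb        # illustrative name
spec:
  scheme: internet-facing   # create a public-facing ALB
---
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: eks-auto-alb        # illustrative name
spec:
  controller: eks.amazonaws.com/alb   # EKS Auto Mode load balancing capability
  parameters:
    apiGroup: eks.amazonaws.com
    kind: IngressClassParams
    name: eks-auto-alb
EOF
```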
Use Helm to deploy the application. Run the following command to create a values.yaml file that captures the application requirements described previously:
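A sketch of such a values.yaml follows; the exact keys depend on the retail store sample chart's values schema, so treat these as illustrative and check the chart documentation:

```bash
# Illustrative values.yaml; key names depend on the chart version you use.
# (The chart also exposes topology spread and PDB settings; see its schema.)
cat <<'EOF' > values.yaml
ui:
  autoscaling:
    enabled: true
    maxReplicas: 10
    targetCPUUtilizationPercentage: 80
  ingress:
    enabled: true
    className: eks-auto-alb                   # IngressClass created earlier
catalog:
  mysql:
    persistence:
      enabled: true
      size: 30Gi
      storageClassName: eks-auto-ebs-csi-sc   # StorageClass created earlier
EOF
```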
Deploy the retail store application. As you proceed with the deployment, consider the configuration in the values.yaml file, particularly the endpoints for the UI. If you’ve chosen to use a chart name that is different from the default retail-store-app, then you must update these endpoints accordingly.
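A sketch of the install command follows; the chart source below is a placeholder, so substitute the location published in the sample application's documentation:

```bash
# <chart-source> is a placeholder for the retail store sample chart location
helm install retail-store-app <chart-source> -f values.yaml
```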
EKS Auto Mode evaluates the resource requirements of these pods and determines the optimal compute to launch for your applications, taking into account the scheduling constraints you configured, including topology spread constraints. It uses the built-in general-purpose NodePool to launch nodes. Wait for the nodes to become Ready.
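For example:

```bash
# Watch nodes provisioned by the general-purpose NodePool become Ready
kubectl get nodes -w
```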
In a separate terminal, watch for the application deployments to become available. The components of the retail store application should reach the Running state.
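One way to check, sketched below:

```bash
# Block until every deployment reports Available, then list the pods
kubectl wait --for=condition=Available deployments --all --timeout=10m
kubectl get pods
```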
Inspecting the catalog-mysql-ebs StatefulSet, you can see that EKS Auto Mode has created a 30 GiB PersistentVolumeClaim attached to it, with a storageClassName of eks-auto-ebs-csi-sc.
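For example:

```bash
# Inspect the StatefulSet and its PersistentVolumeClaim
kubectl get statefulset catalog-mysql-ebs
kubectl get pvc
```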
EKS Auto Mode automatically created the Application Load Balancer (ALB) for the UI application. You can find the ALB endpoint with the following command. When the ALB is ready, open the link in your web browser; you should see the homepage of the retail store displayed.
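A sketch follows, assuming the chart creates an Ingress named ui:

```bash
# Print the ALB hostname from the Ingress status (Ingress name assumed: ui)
kubectl get ingress ui -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
```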
Scaling application
Now that you have deployed the application, you can see how EKS Auto Mode scales cluster infrastructure resources to meet the needs of the application using HorizontalPodAutoscaler (HPA) and metrics-server. In Kubernetes, an HPA automatically adjusts the number of replicas in a deployment based on observed metrics. Metrics server collects CPU and memory usage data from kubelets and exposes them to HPA through the Kubernetes API server. HPA continuously monitors these metrics and adjusts the number of replicas to match the specified target.
First, deploy the Kubernetes metrics-server.
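For example, you can install it from its upstream release manifest:

```bash
# Install metrics-server so the HPA can read CPU and memory metrics
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
```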
In this example, you use the ui service and scale it based on CPU usage (80% target) with a maxReplicas of 10. You already applied the HPA as part of the application install step; see the Auto Scaling section of values.yaml. You can confirm the autoscaling policy using the following command.
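```bash
# Confirm the HPA created by the chart (HPA name assumed: ui)
kubectl get hpa ui
```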
Generate some load to observe how EKS Auto Mode scales out cluster infrastructure in response to the configured autoscaling policy.
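A minimal load generator sketch, assuming the UI service is named ui in the default namespace:

```bash
# Run a temporary pod that continuously requests the UI service
kubectl run load-generator --image=busybox:1.36 --restart=Never -- /bin/sh -c \
  "while true; do wget -q -O /dev/null http://ui.default.svc.cluster.local; done"
```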
With requests hitting your application, you can watch new nodes launch and additional UI pods start running.
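```bash
# In one terminal, watch nodes being launched
kubectl get nodes -w
# In another, watch UI pods scale out (label selector assumed from the chart)
kubectl get pods -l app.kubernetes.io/name=ui -w
```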
You can watch the HPA resource to follow its progress.
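```bash
# Follow the HPA's observed CPU utilization and replica count
kubectl get hpa ui -w
```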
As you can see, EKS Auto Mode fully manages and dynamically scales cluster infrastructure based on application demands. You can stop the load-generator by terminating the pod. As the load generator terminates, HPA slowly brings the replica count to the minimum number based on its configuration.
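```bash
# Stop generating load; the HPA then scales the UI back down
kubectl delete pod load-generator
```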
Key considerations
The following are some practices to consider when deploying workloads to Amazon EKS Auto Mode:
- Configure Pod Disruption Budgets to protect workloads against voluntary disruptions: During voluntary disruptions, such as when EKS Auto Mode disrupts an underused node, disruption budgets help control the rate at which replicas of a deployment are interrupted, preserving enough workload capacity to continue serving traffic or processing work.
- Schedule replicas across nodes and Availability Zones for high availability: Use Pod Topology Spread Constraints to spread workloads across nodes and Availability Zones and to minimize the chance of running multiple replicas of a deployment on the same node.
- Configure appropriate resource requests and limits: EKS Auto Mode launches EC2 instances based on the vCPU and memory requests of your workloads. Configure resource requests carefully; otherwise, resources could be over-provisioned. EKS Auto Mode doesn't consider resource limits or actual usage.
- Applications must handle graceful shutdowns: Your application must be able to shut down gracefully by handling a SIGTERM signal, to prevent loss of work or an interrupted end-user experience during voluntary disruptions. When Kubernetes decides to evict a pod, a SIGTERM signal is sent to the main process of each container in that pod. After the SIGTERM signal is sent, Kubernetes gives the process a grace period before sending a SIGKILL signal; this grace period is 30 seconds by default. You can override the default by declaring terminationGracePeriodSeconds in your pod specification.
- Avoid overly constraining compute selection: The general-purpose EKS Auto Mode NodePool diversifies across C, M, and R Amazon EC2 instance families of different sizes to maximize the opportunity to pick a right-sized EC2 instance for a workload. For workloads with specific compute requirements, you can use well-known Kubernetes labels to allow pods to request only certain instance types, architectures, or other attributes when creating nodes, as shown in the sketch following this list.
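As an illustration, the following hypothetical pod pins the CPU architecture through the well-known kubernetes.io/arch label while leaving instance selection to EKS Auto Mode; the pod name and image are placeholders:

```bash
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: arch-pinned-demo            # hypothetical name
spec:
  nodeSelector:
    kubernetes.io/arch: amd64       # well-known label: require x86_64 nodes
  containers:
  - name: app
    image: public.ecr.aws/docker/library/nginx:stable   # placeholder image
    resources:
      requests:                     # requests drive Auto Mode instance sizing
        cpu: "500m"
        memory: "512Mi"
EOF
```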
You can learn more about application best practices in the Reliability section of the Amazon EKS Best Practices Guide.