Multi-Region Disaster Recovery with Amazon EKS and Amazon EFS for Stateful workloads
Introduction
Amazon Elastic File System (Amazon EFS) is a managed storage service that provides shared access to data for Kubernetes Pods running on compute nodes across different Availability Zones (AZs) managed by Amazon Elastic Kubernetes Service (Amazon EKS). Amazon EFS supports native replication of data across AWS Regions, which helps you design a multi-Region disaster recovery (DR) solution for mission-critical workloads on your EKS clusters.
The Container Storage Interface (CSI) is a standard for exposing storage systems to your Pods, which allows you to run stateful workloads on your Kubernetes clusters. CSI accomplishes this by mounting Persistent Volumes (PVs) to the Pods while keeping the lifecycle of the Pods completely independent from the lifecycle of the PVs. The Amazon EFS CSI driver provisions Kubernetes PVs as access points in an EFS file system. Developers use Persistent Volume Claims (PVCs) to mount the persistent volumes to their Pods.
In this post we discuss how to achieve business continuity in AWS by using Amazon EFS and Amazon EKS across AWS Regions. The solution we propose corresponds to the Pilot light and Warm standby strategies defined in the Disaster Recovery of workloads on AWS whitepaper.
Challenges
Each PV in a Kubernetes cluster maps to a unique Amazon EFS access point, and the File System Access Point (FSAP) ID must be specified for each volume.
- The FSAP can be manually defined for each PV using static provisioning, as shown in the sketch after this list. However, this can become time-consuming for developers or storage administrators, as each access point must be created in Amazon EFS before the corresponding PVC is created in Kubernetes.
- Dynamic provisioning enables developers to create PVCs without having to provision an access point in advance, as the driver creates access points on demand using the file system ID of the EFS file system.
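For illustration, here is a minimal sketch of a statically provisioned PV for the EFS CSI driver. The file system and access point IDs are placeholders; the point is that the FSAP ID is embedded in volumeHandle, so every PV needs its own pre-created access point.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv
spec:
  capacity:
    storage: 5Gi                      # required by Kubernetes, not enforced by Amazon EFS
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    # <FileSystemId>::<AccessPointId> - both values are placeholders
    volumeHandle: fs-0123456789abcdef0::fsap-0123456789abcdef0
```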
Although dynamic provisioning makes the developer experience much better, there is a challenge when using Amazon EFS replication: replication copies all of the data in the file system, but not the FSAPs. This limits you to static provisioning in each EKS cluster, which in turn leads to complex DR runbooks, because whenever you want to fail back to the primary Region you need to manually reconfigure the unique FSAP IDs in that Region's PVs. This is hard to maintain and error-prone. Therefore, we need a solution that is abstracted from the Kubernetes layer but still provides shared access to the same dataset from any EKS cluster.
Solution overview
The solution uses two AWS Regions, each with an EKS cluster and an EFS file system, and we use the EFS CSI driver in each EKS cluster. For simplicity, we keep Amazon Route 53 and the routing policies that direct client requests in this multi-Region architecture out of the scope of this post.
As shown in the preceding figure, we use two AZs per AWS Region, providing further resiliency in the architecture. We use Amazon EFS replication to replicate data natively from Region 1 (the source) to Region 2 (the destination). The destination file system remains read-only while replication is configured. Each EKS worker node accesses the EFS file system within the same Region through the AZ-specific Amazon EFS mount target.
We achieve the abstraction between the Amazon EKS and Amazon EFS layers by configuring a Kubernetes StorageClass object in each EKS cluster. You must specify the EFS file system ID of the respective Region in each StorageClass object.
You may already be thinking, “So what is new? This is how you integrate Amazon EFS and Amazon EKS using the EFS CSI driver?” When using the StorageClass with Amazon EFS replication, there is a new parameter called subPathPattern, introduced in version 1.7.0 of the EFS CSI driver. It enables you to provide shared access to the same dataset in two different EFS file systems, even though each file system has a distinct FSAP ID for that dataset. Let’s look at how it works.
In the StorageClass manifest, you configure subPathPattern with the PVC name and PVC namespace variables, as shown in the example that follows. The pattern can be made up of fixed strings and a limited set of variables. This pattern enables you to use the exact same Kubernetes Deployment, Pod, and PVC manifests for both the primary and DR Regions. You do not need to embed AWS Region-specific configuration parameters, such as the EFS file system ID or FSAP ID, into your workload manifests. All of that is abstracted by the EFS CSI driver, which is really cool!
There is one more parameter that you need to define in conjunction with subPathPattern: ensureUniqueDirectory. By setting this parameter to false, we make sure that the EFS CSI driver points the PV in each AWS Region to the same directory in the EFS file system.
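The following is a minimal sketch of such a StorageClass, assuming version 1.7.0 or later of the EFS CSI driver. The file system ID is a placeholder, and values such as the base path and GID range are illustrative.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap
  fileSystemId: fs-0123456789abcdef0     # this Region's EFS file system ID (placeholder)
  directoryPerms: "700"
  basePath: "/dynamic_provisioning"
  # Resolve the directory from the PVC namespace and name so both Regions use the same path
  subPathPattern: "${.PVC.namespace}/${.PVC.name}"
  # Do not append a unique suffix; the PVs in both Regions must point at the same directory
  ensureUniqueDirectory: "false"
  gidRangeStart: "1000"
  gidRangeEnd: "2000"
```

You deploy an equivalent StorageClass in the DR Region's cluster, changing only fileSystemId to that Region's file system. Because subPathPattern resolves from the PVC namespace and name, the access points created in each Region point to the same directory path in both file systems.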
Next, we deploy our workloads (essentially Pods) and their PVCs, as in the following example.
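As a sketch (the namespace, names, and container image are placeholders), the following PVC and Deployment manifests contain nothing Region-specific, so they can be applied unchanged to the cluster in either Region.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
  namespace: demo
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc              # the StorageClass defined above
  resources:
    requests:
      storage: 5Gi                      # required by Kubernetes, not enforced by Amazon EFS
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
  namespace: demo
spec:
  replicas: 2
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
        - name: app
          image: public.ecr.aws/amazonlinux/amazonlinux:2023
          command: ["sh", "-c", "while true; do date >> /data/out.txt; sleep 30; done"]
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: app-data         # binds to the PVC above
```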
It is worth mentioning that using GitOps for this type of architecture gives you the ability to manage the state of multiple Kubernetes clusters and EFS file systems using DevOps best practices, such as version control, immutable artifacts, and automation.
Failover and failback
In the event of a primary AWS Region failure, you must first convert the EFS file system in the DR Region (the destination) from read-only to writable. You achieve this by simply deleting the replication configuration of the EFS file system.
We recently introduced failback capability for Amazon EFS replication. When you decide to fail back to the primary Region, you first replicate the recent data in the DR Region back to the primary Region's EFS file system. Once that replication completes, you delete the replication configuration on the EFS file system in the primary Region, essentially converting it from read-only to writable. Finally, you configure replication again to make the primary Region's file system the new source of the Amazon EFS replication.
Refer to the File system failover and failback section in the Amazon EFS user guide for more information.
Code sample
We have created a GitHub repository where you can deploy the solution described in this post. We walk you through the implementation steps and guide you on how to perform the failover and failback operations. The code sample is for demonstration purposes only and should not be used in production environments. Refer to the Amazon EKS Best Practices Guides and Encryption Best Practices for Amazon EFS to learn how to run production workloads using Amazon EKS and Amazon EFS.
Considerations
- Amazon EFS replication maintains an RPO of 15 minutes for most file systems. You need to factor this in when designing your application for any type of transaction or state recovery. For more information on RTO and RPO, read this AWS Storage post and the Replicating file systems section in the Amazon EFS User Guide.
- You can specify a user ID (uid) and group ID (gid) in the StorageClass to enforce user identity on the EFS access point, as sketched after this list. If you don't, a value from the gidRange defined in the StorageClass is assigned. If you do not specify a gidRange either, the driver selects a value and assigns it as the uid and gid. This is explained further in the sample GitHub repository.
- When you deploy a workload that uses a new PVC in the primary Region, you need to wait for the Amazon EFS initial sync to complete for that dataset before deploying the same workload in the secondary Region.
- Deleting the replication configuration takes several minutes. Keep this in mind when planning your operations and your target RTO.
- Each PVC consumes an Amazon EFS access point. Check the current Amazon EFS quotas and limits before making design decisions.
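A rough sketch of the three identity options as StorageClass parameters (values are placeholders; check the EFS CSI driver documentation for your driver version):

```yaml
parameters:
  # Option 1: enforce a specific POSIX identity on every access point
  uid: "1000"
  gid: "1000"
  # Option 2: if uid/gid are omitted, the driver assigns a gid from this range
  # gidRangeStart: "1000"
  # gidRangeEnd: "2000"
  # Option 3: if neither is set, the driver selects the uid/gid value itself
```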
Conclusion
In this post, we showed you how to use Amazon EFS replication across AWS Regions for your stateful workloads running on EKS clusters, and how to achieve disaster recovery in that kind of architecture. Together with Amazon EFS replication, the Amazon EFS CSI driver provides a simple way to give Kubernetes persistent volumes a consistent identity across AWS Regions, enabling stateful workloads to both fail over and fail back.