AWS Storage Blog
Restoring on-premises applications to AWS from Amazon EBS Snapshots created by EBS direct APIs
Incremental, point-in-time copies of data can be a secure and cost-effective foundation for disaster recovery, data migration, and compliance solutions. Amazon EBS Snapshots provide point-in-time copies of data stored on AWS, and you can also use them for data that resides on premises. In December 2019, AWS introduced Amazon EBS direct APIs, providing programmatic access to create and manage EBS Snapshots of block storage data regardless of whether it resides on premises or within AWS. These APIs are used by backup and recovery partners, Amazon Machine Image (AMI) developers, and security partners to simplify their workflows and increase the performance of their business continuity, disaster recovery, AMI creation, and security solutions.
You can invoke the EBS direct APIs from Amazon Elastic Compute Cloud (Amazon EC2) instances, AWS Lambda functions, or containers. You can use EBS direct APIs to create EBS snapshots of your on-premises data. You can also leverage existing recovery capabilities like Fast Snapshot Restore (FSR) to quickly recover data from snapshots into EBS volumes.
In this blog post, we walk you through a simple solution to create an Amazon EBS Snapshot of an on-premises block storage device. We also demonstrate restoring your application in AWS from a point-in-time DR copy and validating that the application was successfully recovered.
Solution overview
For this blog post, our example application is running Apache Solr on an Amazon EC2 instance store volume that is simulating an on-premises workload. We use the AWS Labs coldsnap utility, which is an open-source command-line interface tool that uses the EBS direct APIs to create, upload, and download snapshots. Coldsnap can be used to simplify snapshot handling in an automated pipeline.
We designed a simple solution to demonstrate the capabilities of EBS direct APIs using the coldsnap tool. The solution involves directly creating an EBS snapshot from an on-premises application and using the EBS snapshot to recover the on-premises application to AWS. The components of the solution are listed below:
- A source host, which is an Amazon EC2 instance with instance store volumes that simulates an on-premises host
- An Apache Solr application running on the source host that will be recovered to AWS
- A destination host, which is an EC2 instance, to which the Apache Solr application will be recovered
- Coldsnap, an open-source command-line interface tool that uses Amazon EBS direct APIs to create, upload, and download EBS snapshots
The first step of the solution uses coldsnap to create an Amazon EBS snapshot of the Amazon EC2 instance store volume on which the Apache Solr application is running. Figure 1 illustrates the solution architecture. Coldsnap calls the following EBS direct APIs, which are served over HTTPS, to create the EBS snapshot:
- StartSnapshot – To create a new snapshot
- PutSnapshotBlock – To add a block of data to a snapshot
- CompleteSnapshot – To complete the snapshot after writing all the data to the snapshot
Figure 1: Process of creating an EBS snapshot with EBS direct APIs using the coldsnap tool
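The three calls listed above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not how coldsnap itself is implemented: the `client` argument is assumed to be a boto3 EBS client (`boto3.client("ebs")`) or any object exposing the same three methods, `block_checksum` and `upload_snapshot` are hypothetical helper names, and skipping all-zero blocks is a choice made for this sketch. The EBS direct APIs work on 512 KiB blocks and expect a Base64-encoded SHA256 checksum for each block.

```python
import base64
import hashlib
import math

BLOCK_SIZE = 512 * 1024          # EBS direct APIs transfer data in 512 KiB blocks
GIB = 1024 ** 3
ZERO_BLOCK = bytes(BLOCK_SIZE)   # used to detect blocks that hold no data


def block_checksum(block: bytes) -> str:
    """Return the Base64-encoded SHA256 digest that PutSnapshotBlock expects."""
    return base64.b64encode(hashlib.sha256(block).digest()).decode("ascii")


def upload_snapshot(client, data: bytes, description: str = "sketch upload"):
    """Drive StartSnapshot -> PutSnapshotBlock (per block) -> CompleteSnapshot.

    `client` is assumed to behave like boto3.client("ebs").
    Returns (snapshot_id, number_of_blocks_written).
    """
    # StartSnapshot takes the volume size in GiB, so round up.
    volume_size_gib = max(1, math.ceil(len(data) / GIB))
    started = client.start_snapshot(VolumeSize=volume_size_gib,
                                    Description=description)
    snapshot_id = started["SnapshotId"]

    written = 0
    for index, offset in enumerate(range(0, len(data), BLOCK_SIZE)):
        # Pad the final partial block with zeros to a full 512 KiB.
        block = data[offset:offset + BLOCK_SIZE].ljust(BLOCK_SIZE, b"\0")
        if block == ZERO_BLOCK:
            continue  # assumption of this sketch: unwritten blocks are skipped
        client.put_snapshot_block(
            SnapshotId=snapshot_id,
            BlockIndex=index,
            BlockData=block,
            DataLength=BLOCK_SIZE,
            Checksum=block_checksum(block),
            ChecksumAlgorithm="SHA256",
        )
        written += 1

    # CompleteSnapshot must be told how many blocks were written.
    client.complete_snapshot(SnapshotId=snapshot_id, ChangedBlocksCount=written)
    return snapshot_id, written
```

A real tool would read the block device in chunks rather than hold it in memory, and would typically upload blocks in parallel; the sketch keeps a single loop so the call order stays visible.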
After coldsnap creates an Amazon EBS snapshot of the Amazon EC2 instance store volume on the source host, you can create an EBS volume from the snapshot as you normally would, attach it to the destination host, and restart Apache Solr to create a point-in-time copy of the Apache Solr application running on the source host.
Prerequisites for the solution
The solution has a few prerequisites, including AWS Identity and Access Management (IAM) permissions and security group access:
- An AWS account with appropriate IAM permissions to launch EC2 instances and manage EBS volumes.
- Appropriate IAM policies to use EBS direct APIs with snapshots. Here is an example of policy permissions:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ebs:StartSnapshot",
                "ebs:PutSnapshotBlock",
                "ebs:CompleteSnapshot"
            ],
            "Resource": "arn:aws:ec2:<Region>::snapshot/*"
        }
    ]
}
- EC2 instance security groups that allow access to the Apache Solr default port 8983 and SSH port 22.
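If you generate the IAM policy from the prerequisites above for more than one Region, a small helper can keep the documents consistent. The following is a sketch only: `ebs_direct_policy` is a hypothetical function name, and the optional `include_download` flag is an addition for this example that adds the read-side actions (`ebs:ListSnapshotBlocks` and `ebs:GetSnapshotBlock`) needed to download snapshot data, which this walkthrough does not do.

```python
import json


def ebs_direct_policy(region: str, include_download: bool = False) -> dict:
    """Build an IAM policy document for writing snapshots with EBS direct APIs."""
    actions = [
        "ebs:StartSnapshot",
        "ebs:PutSnapshotBlock",
        "ebs:CompleteSnapshot",
    ]
    if include_download:
        # Read-side actions; not required for the upload-only flow in this post.
        actions += ["ebs:ListSnapshotBlocks", "ebs:GetSnapshotBlock"]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": actions,
                # Snapshot ARNs have an empty account field, hence the "::".
                "Resource": f"arn:aws:ec2:{region}::snapshot/*",
            }
        ],
    }


if __name__ == "__main__":
    # Print the policy for the Region used in this walkthrough.
    print(json.dumps(ebs_direct_policy("us-west-2"), indent=4))
```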
Solution configuration
The example solution is configured as follows:
- Source host: EC2 instance (c5d.large) running Red Hat Enterprise Linux 8 (HVM), with 1 x 50 GB NVMe SSD instance store volume
- Open-source Apache Solr with index data running on the source host
- Destination host: EC2 instance (m5.large) running Red Hat Enterprise Linux 8 (HVM), for application recovery in AWS
- Coldsnap installed on the source host instance as per the instructions in the GitHub repository readme file
How to set up the solution
In this section, we give you step-by-step instructions on how to set up the solution.
Step 1: Create the EC2 instances corresponding to the source and destination hosts.
We create two EBS-optimized EC2 Nitro instances, each with the default gp2 EBS volume as the root device, as seen in Figure 2.
- c5d.large as the source host in us-west-2b with 1 x 50 GB NVMe SSD instance store volume
- m5.large as the destination host in us-west-2a
Figure 2: Source and Destination EC2 Instances
Step 2: (Source host) Partition and mount the instance store NVMe SSD.
On the source host, we use the parted utility on Red Hat Enterprise Linux to create a 5 GiB partition, which we format with XFS. We mount the partition as /solr for application data and run the lsblk command to list and view the partitions and mount points. An example of this is shown in Figure 3.
Figure 3: lsblk command to list and view the partition and mount points
Step 3: (Source host) Install Apache Solr application and index data.
We install Apache Solr and the application data on the /solr mount point of the source host. We use the defaults for installation and configuration as explained in SolrCloud on AWS EC2. For application data, we use a sample data file describing different AWS database offerings. A code snippet of the sample index file is shown in Figure 4:
Figure 4: Sample Solr index file of AWS database offerings
The Apache Solr application indexes the sample data after it is started. We verify the application by launching a browser and connecting to the default Apache Solr port 8983. Figure 5 shows the Apache Solr application query view of the indexed sample data file on the source host.
Figure 5: Apache Solr application query view of the indexed sample data file on the source host
Step 4: (Source host) Create an EBS snapshot of instance store volume using coldsnap.
On the source host, we install coldsnap, the command-line tool that uses Amazon EBS direct APIs to upload and download snapshots, following the steps outlined in the AWS Labs coldsnap GitHub repository. We use the output of the lsblk command to determine the block device name corresponding to /solr, which contains the Apache Solr application and index data.
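Looking up the block device behind /solr can also be scripted. The sketch below parses the JSON that `lsblk --json` prints and walks the device tree until it finds the requested mount point. The device names in `SAMPLE_LSBLK_JSON` are hypothetical stand-ins, not the names from this walkthrough, and `find_device_for_mount` is an illustrative helper name. Both the older `mountpoint` field and the newer `mountpoints` list are handled, because the field name depends on the util-linux version.

```python
import json

# Hypothetical lsblk output; on a real host you could obtain it with
# subprocess.run(["lsblk", "--json"], capture_output=True, text=True).stdout
SAMPLE_LSBLK_JSON = """
{
  "blockdevices": [
    {"name": "nvme0n1", "type": "disk", "mountpoint": null,
     "children": [{"name": "nvme0n1p1", "type": "part", "mountpoint": "/"}]},
    {"name": "nvme1n1", "type": "disk", "mountpoint": null,
     "children": [{"name": "nvme1n1p1", "type": "part", "mountpoint": "/solr"}]}
  ]
}
"""


def find_device_for_mount(lsblk_json: str, mount_point: str):
    """Return the /dev path mounted at mount_point, or None if not found."""

    def walk(devices):
        for dev in devices:
            # Newer util-linux reports a "mountpoints" list, older a single value.
            mounts = dev.get("mountpoints") or [dev.get("mountpoint")]
            if mount_point in mounts:
                return "/dev/" + dev["name"]
            found = walk(dev.get("children", []))
            if found:
                return found
        return None

    return walk(json.loads(lsblk_json)["blockdevices"])


if __name__ == "__main__":
    print(find_device_for_mount(SAMPLE_LSBLK_JSON, "/solr"))
```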
Note: You should use application-specific methods, such as quiescing your application, to ensure you get a consistent point-in-time copy of your data before using EBS direct APIs to create a snapshot.
We then run coldsnap to create an Amazon EBS snapshot of the instance store volume block device on which /solr is mounted. Initially, the EBS snapshot appears in a pending state in the Amazon EC2 console, as illustrated in Figure 6.
Figure 6: EBS snapshot in pending state
After coldsnap finishes uploading the snapshot of the block device corresponding to /solr, the snapshot status changes from pending to completed in the EC2 console, as shown in Figure 7. At this point, an EBS volume can be created from the snapshot in any Availability Zone (AZ) in the AWS Region.
Figure 7: EBS snapshot in completed state
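The pending-to-completed transition described above can also be watched from a script instead of the console. The following sketch polls the snapshot state until it reaches completed. It assumes `ec2_client` behaves like `boto3.client("ec2")`; `wait_for_snapshot`, the polling interval, and the timeout are hypothetical choices for this example. boto3 also ships a built-in `snapshot_completed` waiter that serves the same purpose.

```python
import time


def wait_for_snapshot(ec2_client, snapshot_id, poll_seconds=15.0,
                      timeout_seconds=3600.0, sleep=time.sleep):
    """Poll DescribeSnapshots until the snapshot state is 'completed'.

    `ec2_client` is assumed to behave like boto3.client("ec2").
    Raises RuntimeError on an unexpected state and TimeoutError on timeout.
    """
    waited = 0.0
    while True:
        response = ec2_client.describe_snapshots(SnapshotIds=[snapshot_id])
        state = response["Snapshots"][0]["State"]
        if state == "completed":
            return state
        if state != "pending":
            # For example 'error': stop rather than wait forever.
            raise RuntimeError(f"snapshot {snapshot_id} entered state {state!r}")
        if waited >= timeout_seconds:
            raise TimeoutError(f"snapshot {snapshot_id} still pending "
                               f"after {waited:.0f} seconds")
        sleep(poll_seconds)
        waited += poll_seconds
```

The `sleep` parameter exists so the loop can be exercised in tests without real delays.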
Step 5: (Destination host) Create and attach the EBS volume recovered from the snapshot.
We create an Amazon EBS volume from the snapshot created in Step 4, as described in Create a volume from a snapshot, and then attach it to the destination host following the procedure in Attach an Amazon EBS volume to an instance.
In the Amazon EC2 console, we verify that the volume is attached to the destination host by navigating to the Storage tab under the destination host, as illustrated in Figure 8.
Figure 8: Verification of correct volume attached to the destination host
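Step 5 can also be performed programmatically. The sketch below creates a volume from the snapshot, waits for it to become available, and attaches it to the destination host. It assumes `ec2_client` behaves like `boto3.client("ec2")`; `restore_volume` is a hypothetical helper name, and the default device name `/dev/sdf` and the IDs in the usage comment are placeholders. The volume must be created in the same Availability Zone as the destination instance (us-west-2a in this walkthrough).

```python
def restore_volume(ec2_client, snapshot_id, availability_zone, instance_id,
                   device="/dev/sdf", volume_type="gp2"):
    """Create an EBS volume from a snapshot and attach it to an instance.

    `ec2_client` is assumed to behave like boto3.client("ec2").
    Returns the ID of the new volume.
    """
    # The volume has to live in the same Availability Zone as the instance.
    volume = ec2_client.create_volume(
        SnapshotId=snapshot_id,
        AvailabilityZone=availability_zone,
        VolumeType=volume_type,
    )
    volume_id = volume["VolumeId"]

    # A volume can only be attached once it has reached the 'available' state.
    ec2_client.get_waiter("volume_available").wait(VolumeIds=[volume_id])

    ec2_client.attach_volume(
        VolumeId=volume_id,
        InstanceId=instance_id,
        Device=device,
    )
    return volume_id


# Example call with placeholder IDs (requires boto3 and AWS credentials):
#   import boto3
#   restore_volume(boto3.client("ec2"), "snap-EXAMPLE", "us-west-2a", "i-EXAMPLE")
```

On Nitro-based instances the attached volume is exposed to the operating system as an NVMe device, so the name seen in lsblk differs from the requested device name.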
Step 6: (Destination host) Mount the EBS volume
In this step, we log in to the destination host using SSH, mount the newly created Amazon EBS volume, and verify the mount point using the lsblk command. The newly attached EBS volume is exposed as /dev/nvme4n1. We then navigate to the /solr/ path and inspect the files and content, as seen in Figure 9.
Figure 9: Verification of content in the mounted volume of destination host
Step 7: (Destination host) Restart Apache Solr application and validate application data
We restart the Apache Solr application from the solr-8.6.2 directory and verify the content indexed in Step 3. We are able to query the results without re-indexing the data. On the destination host, the content is available on the same port, 8983, that we used on the source host (Figure 5).
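The validation in this step can be made repeatable by comparing document counts on the two hosts. The sketch below queries the standard Solr select handler on port 8983 and reads `numFound` from the JSON response. The host and collection names are placeholders, because this post does not name the collection, and `solr_select_url`, `num_found`, and `count_documents` are hypothetical helper names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def solr_select_url(host, collection, query="*:*", port=8983):
    """Build a select URL that returns only the match count (rows=0)."""
    params = urlencode({"q": query, "rows": 0, "wt": "json"})
    return f"http://{host}:{port}/solr/{collection}/select?{params}"


def num_found(response_body: str) -> int:
    """Extract the number of matching documents from a Solr JSON response."""
    return json.loads(response_body)["response"]["numFound"]


def count_documents(host, collection, timeout=10.0):
    """Query a running Solr instance and return its document count."""
    with urlopen(solr_select_url(host, collection), timeout=timeout) as resp:
        return num_found(resp.read().decode("utf-8"))


# Example comparison with placeholder names (needs network access to both hosts):
#   source = count_documents("source-host.example", "my_collection")
#   restored = count_documents("destination-host.example", "my_collection")
#   assert source == restored, "restored index differs from the source"
```

Matching counts are a quick sanity check rather than proof of identical content; spot-checking a few queries, as done in the browser above, complements it.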
Cleaning up
To avoid incurring unwanted AWS costs when performing these steps, please terminate the AWS resources created for this demonstration, which include Amazon EC2 instances, Amazon EBS volumes, and the Amazon EBS snapshot.
Conclusion
In this blog post, we demonstrated a solution to create Amazon EBS snapshots of any block storage device, whether on premises or an Amazon EC2 instance store volume, using Amazon EBS direct APIs. We used the coldsnap utility, which is an open-source command-line interface tool that uses EBS direct APIs to create, upload, and download snapshots. You can use EBS Snapshots to manage and enhance operational recovery and disaster recovery, to migrate data across Regions and accounts, and to improve backup compliance.
For more information, refer to EBS direct APIs documentation and the coldsnap GitHub repository. If you need further assistance, please talk to your AWS account team.
Thank you for reading this blog post! If you have any comments or questions, don’t hesitate to leave them in the comments section.