AWS Database Blog

A serverless solution to schedule your Amazon DynamoDB On-Demand Backup

We recently released On-Demand Backup for Amazon DynamoDB. Using On-Demand Backup, you can create full backups of your DynamoDB tables, helping you meet your corporate and governmental regulatory requirements for data archiving. You can now back up any table, from a few megabytes to hundreds of terabytes of data, with no impact on the performance or availability of your production applications.

With On-Demand Backup, you can initiate a backup on your own, but what if you want backups to run on a schedule? By adding the power of serverless computing, you can create an AWS Lambda function that complements On-Demand Backup with the ability to run on a schedule.

This blog post explains how you can create a serverless solution to schedule backups of your DynamoDB tables.

Solution overview
Let’s assume you want to take a backup of one of your DynamoDB tables each day. A simple way to achieve this is to use an Amazon CloudWatch Events rule to trigger an AWS Lambda function daily. In this scenario, your Lambda function contains the code required to call the dynamodb:CreateBackup API operation. Setting this up requires configuring an IAM role, creating a CloudWatch Events rule, and creating a Lambda function. Using AWS CloudFormation, you can automate all of these steps.
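To make these pieces concrete, the following is a minimal boto3 sketch of the scheduling plumbing that the CloudFormation template automates for you. The function and rule names are hypothetical, and you don't need to run this yourself; it only illustrates what a daily CloudWatch Events rule wired to a Lambda function looks like.

    import boto3

    events = boto3.client('events')
    awslambda = boto3.client('lambda')

    # Hypothetical names; the CloudFormation template creates the real resources for you
    function_name = 'ddb-schedule-backup'
    rule_name = 'ddb-schedule-backup-daily'

    function_arn = awslambda.get_function(FunctionName=function_name)['Configuration']['FunctionArn']

    # Rule that fires once a day
    rule_arn = events.put_rule(Name=rule_name, ScheduleExpression='rate(1 day)', State='ENABLED')['RuleArn']

    # Point the rule at the Lambda function
    events.put_targets(Rule=rule_name, Targets=[{'Id': 'ddb-backup-target', 'Arn': function_arn}])

    # Allow CloudWatch Events to invoke the function
    awslambda.add_permission(FunctionName=function_name, StatementId='AllowCloudWatchEventsInvoke',
                             Action='lambda:InvokeFunction', Principal='events.amazonaws.com', SourceArn=rule_arn)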

Following, you can find an example showing how to configure the scheduled backup using CloudFormation. The Python script and CloudFormation template used in this example are also available at this GitHub location.

Create a Python script
First, you need to create a Python script for Lambda. To do so, take the following steps:

  1. Create a file named ddbbackup.py and copy the following code into it. This sample code backs up the table that you specify when you create the CloudFormation stack, in the AWS Region where the Lambda function runs. It also deletes existing backups of that table, except for backups taken within the number of retention days that you specify in the CloudFormation stack.
    # Copyright 2017-2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
    #
    # Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file except in compliance with the License. A copy of the License is located at
    #
    #    http://thinkwithwp.com/apache2.0/
    #
    # or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
    from __future__ import print_function
    from datetime import date, datetime, timedelta
    import json
    import boto3
    import time
    from botocore.exceptions import ClientError
    import os
    ddbRegion = os.environ['AWS_DEFAULT_REGION']
    ddbTable = os.environ['DDBTable']
    backupName = 'Schedule_Backup_V21'
    print('Backup started for: ', backupName)
    ddb = boto3.client('dynamodb', region_name=ddbRegion)
    
    # For deleting old backups: look for backups older than the retention period,
    # skipping any backups taken within the last BackupRetention days
    daysToLookBackup = int(os.environ['BackupRetention'])
    daysToLookBackupL = daysToLookBackup - 1
     
    def lambda_handler(event, context):
    	try:
    		#create backup
    		ddb.create_backup(TableName=ddbTable,BackupName = backupName)
    		print('Backup has been taken successfully for table:', ddbTable)
    		
    		#check recent backup
    		lowerDate=datetime.now() - timedelta(days=daysToLookBackupL)
    		upperDate=datetime.now()
    		responseLatest = ddb.list_backups(TableName=ddbTable, TimeRangeLowerBound=datetime(lowerDate.year, lowerDate.month, lowerDate.day), TimeRangeUpperBound=datetime(upperDate.year, upperDate.month, upperDate.day))
    		latestBackupCount=len(responseLatest['BackupSummaries'])
    		print('Total backup count in recent days:',latestBackupCount)
    
    		deleteupperDate = datetime.now() - timedelta(days=daysToLookBackup)
    		print(deleteupperDate)
    		# TimeRangeLowerBound is the release of Amazon DynamoDB Backup and Restore - Nov 29, 2017
    		response = ddb.list_backups(TableName=ddbTable, TimeRangeLowerBound=datetime(2017, 11, 29), TimeRangeUpperBound=datetime(deleteupperDate.year, deleteupperDate.month, deleteupperDate.day))
    		
    		#check whether the latest backup count is at least two before removing old backups
    		if latestBackupCount>=2:
    			while True:
    				#delete every backup in the current page of results
    				for record in response['BackupSummaries']:
    					backupArn = record['BackupArn']
    					ddb.delete_backup(BackupArn=backupArn)
    					print(backupName, 'has deleted this backup:', backupArn)
    
    				#keep paginating until all old backups in the range are deleted
    				if 'LastEvaluatedBackupArn' in response:
    					lastEvalBackupArn = response['LastEvaluatedBackupArn']
    				else:
    					print('the end')
    					break
    
    				response = ddb.list_backups(TableName=ddbTable, TimeRangeLowerBound=datetime(2017, 11, 29), TimeRangeUpperBound=datetime(deleteupperDate.year, deleteupperDate.month, deleteupperDate.day), ExclusiveStartBackupArn=lastEvalBackupArn)
    		else:
    			print('Recent backup does not meet the deletion criteria')
    		
    	except  ClientError as e:
    		print(e)
    
    	except ValueError as ve:
    		print('error:',ve)
    	
    	except Exception as ex:
    		print(ex)
  2. Add ddbbackup.py to a .zip file.
  3. Upload the .zip file to an Amazon S3 bucket. (If you prefer to script these two steps, see the sketch that follows this list.)
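A minimal sketch of steps 2 and 3 using the Python standard library and boto3 might look like the following. The bucket name is a placeholder; use a bucket that you own in the same Region.

    import zipfile
    import boto3

    bucket = 'my-lambda-code-bucket'  # placeholder bucket name

    # Package the handler into a .zip file for Lambda
    with zipfile.ZipFile('ddbbackup.zip', 'w', zipfile.ZIP_DEFLATED) as archive:
        archive.write('ddbbackup.py')

    # Upload the archive to Amazon S3 so that CloudFormation can reference it
    boto3.client('s3').upload_file('ddbbackup.zip', bucket, 'ddbbackup.zip')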

Deploy the Lambda function
Now your code is ready. Let’s deploy this code using AWS CloudFormation:

  1. Copy the CloudFormation template from GitHub. This template performs these operations:
    1. Creates an IAM role with permissions to perform DynamoDB backups.
    2. Creates a Lambda function from the Python script that you created earlier. The template asks for the Amazon S3 path of the .zip file, the name of your DynamoDB table, and the backup retention period as parameters during stack creation.
    3. Creates a CloudWatch Events rule that triggers the Lambda function daily.
  2. Go to the AWS CloudFormation console and choose the desired AWS Region. In this example, I use the US East (N. Virginia) Region (us-east-1).
  3. Choose Create Stack.
  4. On the Select Template page, choose File, and then choose the CloudFormation template that you copied earlier (DynamoDBScheduleBackup.json).
  5. Choose Next.
  6. Enter a value for Stack Name.
  7. Enter the name of the S3 bucket and the .zip file that you uploaded, the DynamoDB table name, and the backup retention period in days.
  8. Choose Next, and then choose Next on the Options page.
  9. Select I acknowledge that AWS CloudFormation might create IAM resources, and then choose Create.

At this stage, CloudFormation starts creating the resources based on the template that you uploaded earlier. After a few minutes, you should see that the stack has been created successfully.
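If you prefer to deploy the stack with a script instead of the console, a rough boto3 equivalent is shown below. The stack name, bucket, key, and parameter values are placeholders, and the parameter keys are assumptions; match them to the parameters declared in the template that you copied from GitHub.

    import boto3

    cfn = boto3.client('cloudformation', region_name='us-east-1')

    with open('DynamoDBScheduleBackup.json') as template_file:
        template_body = template_file.read()

    # Parameter keys below are assumptions; check the Parameters section of the template
    cfn.create_stack(
        StackName='dynamodb-schedule-backup',
        TemplateBody=template_body,
        Parameters=[
            {'ParameterKey': 'BucketName', 'ParameterValue': 'my-lambda-code-bucket'},
            {'ParameterKey': 'FileName', 'ParameterValue': 'ddbbackup.zip'},
            {'ParameterKey': 'DDBTableName', 'ParameterValue': 'MyTable'},
            {'ParameterKey': 'BackupRetention', 'ParameterValue': '7'},
        ],
        Capabilities=['CAPABILITY_IAM'],  # the template creates an IAM role
    )

    # Block until stack creation finishes
    cfn.get_waiter('stack_create_complete').wait(StackName='dynamodb-schedule-backup')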

Check the resources created
Let’s check all the resources created by the CloudFormation template:

  1. Open the CloudWatch console, and then choose Rules. On the Rules page, you see a new rule whose name starts with the stack name that you assigned during CloudFormation stack creation.
  2. Select the rule. On the summary page, you can see that the rule is attached to a Lambda function and triggers the function once a day.
  3. Choose your resource for Resource name on the Rules summary page. The Lambda console opens for the function that CloudFormation created to perform the DynamoDB table backup.
  4. Choose Test to test whether the Lambda function can take a backup of your DynamoDB table.
  5. On the Configure test event page, type an event name, ignore the other settings, and choose Create.
  6. Choose Test. It should show a successful execution result.
  7. Open the DynamoDB console, and choose Backups. The Backups page shows the backup that you took using the Lambda function. (You can also verify your backups programmatically, as shown in the sketch after this list.)
  8. Wait for the CloudWatch rule to trigger the next backup job that you have scheduled.
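You can also verify your backups without the console by calling the same ListBackups API that the function uses. A minimal sketch, assuming your table is named MyTable:

    import boto3

    ddb = boto3.client('dynamodb')

    # List the on-demand backups for the table and print a short status line for each
    response = ddb.list_backups(TableName='MyTable')
    for summary in response['BackupSummaries']:
        print(summary['BackupName'], summary['BackupStatus'], summary['BackupCreationDateTime'])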

Summary
In this post, I demonstrate a solution to schedule your DynamoDB backups using Lambda, CloudWatch, and CloudFormation. Although the scenario I describe is for a single table backup, you can extend this example to take backups of multiple tables.
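For example, one way to extend the handler to multiple tables is to pass a comma-separated list of table names in an environment variable (the DDBTables variable name below is hypothetical) and loop over it:

    import os
    import boto3

    ddb = boto3.client('dynamodb')

    def lambda_handler(event, context):
        # DDBTables is a hypothetical environment variable such as "Orders,Customers"
        tables = os.environ['DDBTables'].split(',')
        for table in tables:
            ddb.create_backup(TableName=table.strip(), BackupName='Schedule_Backup_V21')
            print('Backup requested for table:', table)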

To learn more about DynamoDB On-Demand Backup and Restore, see On-Demand Backup and Restore for DynamoDB in the DynamoDB Developer Guide.


About the Author

Masudur Rahaman Sayem is a solutions architect at Amazon Web Services. He works with our customers to provide guidance and technical assistance on database projects, helping them improve the value of their solutions when using AWS.