AWS Database Blog

A serverless solution to monitor the storage of your Amazon DynamoDB tables

If you use Amazon DynamoDB as your application’s NoSQL database service, you might want to track how much storage your DynamoDB tables are using. DynamoDB publishes its service metrics to Amazon CloudWatch. CloudWatch helps you monitor and analyze those metrics, set alarms, and react automatically to changes in your AWS resources. Currently, DynamoDB sends many useful metrics to CloudWatch, including ThrottledRequests, ConsumedWriteCapacityUnits, and ConsumedReadCapacityUnits.

It’s important that you analyze the storage usage of your DynamoDB tables to get a historical overview of how much storage your DynamoDB workload uses, and to help control storage costs. For example, you might want to use Time to Live (TTL) to expire unnecessary or stale data. You could then use streams to archive that data to Amazon S3 or Amazon Glacier. In this post, I explain how you can monitor your DynamoDB table’s use of storage.

Solution overview

DynamoDB updates your table’s storage information approximately every six hours. You can retrieve that information by using the DynamoDB DescribeTable API. After determining the table’s size, you can publish it to CloudWatch as a custom metric. In this solution, I show how to create a one-click deployment architecture to monitor your DynamoDB tables’ storage use by using AWS CloudFormation, following an infrastructure as code model.

The following Python script and AWS CloudFormation template are also available in this GitHub repository. Here is how the process works, as illustrated in the preceding diagram. (Be sure to deploy this solution in every AWS Region where you have DynamoDB workloads so that you can monitor the storage use of all your DynamoDB tables.)

  1. Amazon CloudWatch Events invokes an AWS Lambda function on a schedule.
  2. The Lambda function checks the size of all DynamoDB tables in a Region by using the DescribeTable API.
  3. The Lambda function stores the table size information in a custom CloudWatch metric.
  4. Lambda uses a custom AWS Identity and Access Management (IAM) role created by an AWS CloudFormation template to access DynamoDB and CloudWatch.
  5. AWS CloudFormation deploys the whole solution, including creating the IAM role and the Lambda function, and scheduling the event.

Monitor your DynamoDB table’s storage use

Follow these steps to monitor your DynamoDB table’s storage use:

  1. Create a file called ddbstoragemon.py, and paste the following code into the file. This example code calls the DescribeTable API for all the tables in a particular AWS Region, reads TableSizeBytes from the API response, and puts that information in a custom CloudWatch metric in the DynamoDBStorageMetrics namespace.
    # Python code. Version: Python 3.6
    # Copyright 2017-2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
    #
    # Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file except in compliance with the License. A copy of the License is located at
    #
    #    http://thinkwithwp.com/apache2.0/
    #
    # or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
    # version 0.1
    
    
    from datetime import datetime
    import os
    
    import boto3
    
    ddbRegion = os.environ['AWS_DEFAULT_REGION']
    ddbClient = boto3.client('dynamodb', region_name=ddbRegion)
    cwClient = boto3.client('cloudwatch', region_name=ddbRegion)
    
    
    def publishTableSize(tablename):
        # Read TableSizeBytes from DescribeTable and publish it to the
        # custom DynamoDBStorageMetrics namespace in CloudWatch.
        responseTableSize = ddbClient.describe_table(TableName=tablename)
        TableSizeBytes = responseTableSize['Table']['TableSizeBytes']
        print(tablename, '-', TableSizeBytes)
        cwClient.put_metric_data(
            Namespace='DynamoDBStorageMetrics',
            MetricData=[
                {
                    'MetricName': 'StorageMetrics',
                    'Dimensions': [
                        {
                            'Name': 'tablename',
                            'Value': tablename
                        },
                    ],
                    'Timestamp': datetime.utcnow(),
                    'Value': TableSizeBytes,
                    'Unit': 'Bytes',
                    # 1 = high resolution; standard resolution (60) is also
                    # sufficient for data that arrives every six hours.
                    'StorageResolution': 1
                },
            ]
        )
    
    
    def lambda_handler(event, context):
        # ListTables returns at most 100 table names per call, so follow
        # LastEvaluatedTableName until every table in the Region is covered.
        response = ddbClient.list_tables()
        while True:
            for tablename in response['TableNames']:
                publishTableSize(tablename)
            LastEvaluatedTableName = response.get('LastEvaluatedTableName', '')
            if LastEvaluatedTableName == '':
                break
            response = ddbClient.list_tables(
                ExclusiveStartTableName=LastEvaluatedTableName)
  2. Zip the ddbstoragemon.py file, and save it as ddbstoragemon.zip.
  3. Upload the .zip file to an Amazon S3 bucket.
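If you prefer to script the packaging and upload steps, the following sketch zips the handler with the standard library and uploads it with boto3. The bucket and key names shown are placeholders, not values from this post; substitute your own.

```python
import zipfile


def package_handler(source_file, archive_name):
    # Zip the Lambda handler script into a deployment package.
    with zipfile.ZipFile(archive_name, 'w', zipfile.ZIP_DEFLATED) as zf:
        zf.write(source_file)
    return archive_name


def upload_package(archive_name, bucket_name, key):
    # Upload the package to S3 (requires AWS credentials).
    import boto3
    boto3.client('s3').upload_file(archive_name, bucket_name, key)


# Example (hypothetical bucket name; replace with your own):
# package_handler('ddbstoragemon.py', 'ddbstoragemon.zip')
# upload_package('ddbstoragemon.zip', 'my-ddb-monitor-bucket', 'ddbstoragemon.zip')
```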

Deploy your code using AWS CloudFormation

Now that the code is in Amazon S3, deploy the entire solution by using AWS CloudFormation:

  1. Copy the CloudFormation template from GitHub, and save the file as ddbStoreMonCF.json. The code creates the following:
    1. An IAM role to access the DynamoDB DescribeTable API to get the DynamoDB table size and store that information in CloudWatch.
    2. A Lambda function using the Python script that you stored in Amazon S3.
    3. A CloudWatch Events rule to trigger the Lambda function every six hours.
  2. Sign in to the AWS Management Console, and navigate to the AWS CloudFormation console.
  3. Choose Create Stack.
  4. On the Select Template page, choose Upload a template to Amazon S3. Choose Browse, and then choose the ddbStoreMonCF.json template that you created earlier.
  5. Choose Next. Type the Stack name, your Amazon S3 BucketName, and the FileName of the .zip file that you uploaded earlier to S3.
  6. Choose Next. Keep the default settings on the Options page, and choose Next.
  7. Select the I acknowledge that AWS CloudFormation might create IAM resources check box, and choose Create.
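As an alternative to the console walkthrough above, you can launch the same stack with boto3. This is a sketch that assumes the template exposes BucketName and FileName parameters, as entered in step 5; CAPABILITY_IAM plays the role of the IAM acknowledgment check box.

```python
def build_stack_request(stack_name, template_body, bucket_name, file_name):
    # Assemble the CreateStack parameters. CAPABILITY_IAM is the API-side
    # equivalent of the console's IAM acknowledgment check box.
    return {
        'StackName': stack_name,
        'TemplateBody': template_body,
        'Parameters': [
            {'ParameterKey': 'BucketName', 'ParameterValue': bucket_name},
            {'ParameterKey': 'FileName', 'ParameterValue': file_name},
        ],
        'Capabilities': ['CAPABILITY_IAM'],
    }


def create_stack(request):
    # Launch the stack (requires AWS credentials).
    import boto3
    return boto3.client('cloudformation').create_stack(**request)


# Example (hypothetical names; replace with your own):
# with open('ddbStoreMonCF.json') as f:
#     create_stack(build_stack_request('ddbstoragemon', f.read(),
#                                      'my-ddb-monitor-bucket', 'ddbstoragemon.zip'))
```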

AWS CloudFormation creates the resources based on the template definition. After a few minutes, you should see that CloudFormation has created the stack successfully, as shown in the following screenshot.

Test the configuration

The CloudWatch rule that was created by the AWS CloudFormation template will trigger every six hours and store the associated metrics in CloudWatch. However, let’s do a quick test to confirm that the configuration works:

  1. Navigate to the CloudWatch console, and in the navigation pane, choose Rules.
  2. Search for the rule that has the AWS CloudFormation stack ddbstoragemon in its name, and choose it.
  3. The rule has been scheduled to trigger a Lambda function every six hours. Choose the Resource name to open the Lambda function created by AWS CloudFormation.
  4. In the AWS Lambda console, choose Test. In the Configure test event window, type an event name, accept the default value, and choose Create.
  5. Choose Test to run the function.
    The Lambda function checks all of your DynamoDB table’s storage metrics for that AWS Region and puts that information in a custom CloudWatch metrics namespace, DynamoDBStorageMetrics.
  6. Return to the CloudWatch console. On the All metrics tab, under Custom Namespaces, choose DynamoDBStorageMetrics.
  7. Choose TableName, and select a table from the list to see the storage use. In this example, I ran this solution for a few days to get metrics about my production tables.
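If you would rather verify the metric from code than from the console, a sketch like the following reads the datapoints back with the CloudWatch GetMetricStatistics API. The table name hashrange1 matches the example below; adjust it for your own tables.

```python
from datetime import datetime, timedelta


def storage_metric_query(tablename, days=7):
    # Build a GetMetricStatistics request against the custom namespace
    # that the Lambda function writes to.
    now = datetime.utcnow()
    return {
        'Namespace': 'DynamoDBStorageMetrics',
        'MetricName': 'StorageMetrics',
        'Dimensions': [{'Name': 'tablename', 'Value': tablename}],
        'StartTime': now - timedelta(days=days),
        'EndTime': now,
        'Period': 21600,  # six hours, matching the publishing schedule
        'Statistics': ['Average'],
    }


def fetch_storage_history(tablename):
    # Retrieve the datapoints, oldest first (requires AWS credentials).
    import boto3
    stats = boto3.client('cloudwatch').get_metric_statistics(
        **storage_metric_query(tablename))
    return sorted(stats['Datapoints'], key=lambda p: p['Timestamp'])


# Example:
# for point in fetch_storage_history('hashrange1'):
#     print(point['Timestamp'], point['Average'], 'bytes')
```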

The following screenshot shows storage metrics from the last week for the hashrange1 table:

And the following screenshot shows the current size of the hashrange1 table:

These metrics can help you to take corrective action, for example:

  1. You can see if your table has experienced sudden growth. If this is the case, you should check with your development team to understand whether the growth represents “the new normal,” or if recent code changes might have introduced issues inadvertently.
  2. If a table exceeds the acceptable storage size, you can configure an alarm on the associated CloudWatch metrics.
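For the second case, a sketch of such an alarm using the CloudWatch PutMetricAlarm API might look like the following. The alarm name and threshold are illustrative, not values from this post.

```python
def storage_alarm_request(tablename, threshold_bytes):
    # Define a CloudWatch alarm that fires when the table's reported
    # size exceeds the given threshold.
    return {
        'AlarmName': 'ddb-storage-' + tablename,
        'Namespace': 'DynamoDBStorageMetrics',
        'MetricName': 'StorageMetrics',
        'Dimensions': [{'Name': 'tablename', 'Value': tablename}],
        'Statistic': 'Maximum',
        'Period': 21600,  # one datapoint per six-hour run
        'EvaluationPeriods': 1,
        'Threshold': threshold_bytes,
        'ComparisonOperator': 'GreaterThanThreshold',
    }


def create_storage_alarm(tablename, threshold_bytes):
    # Register the alarm (requires AWS credentials).
    import boto3
    boto3.client('cloudwatch').put_metric_alarm(
        **storage_alarm_request(tablename, threshold_bytes))


# Example: alarm when hashrange1 exceeds a hypothetical 10 GB limit.
# create_storage_alarm('hashrange1', 10 * 1024 ** 3)
```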

Summary

In this post, I demonstrated how to deploy a serverless application to monitor the storage of your DynamoDB tables. You can enhance this solution by, for example, sending Amazon SNS notifications from Lambda to your operations team when a table exceeds a specified size.
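As a sketch of that enhancement, the following hypothetical helper builds an SNS notification when a table crosses a size threshold. The topic ARN shown is a placeholder that you would replace with a topic your operations team subscribes to.

```python
def build_notification(tablename, table_size_bytes, threshold_bytes):
    # Return an SNS subject and message when the table is over its size
    # threshold, or None when no notification is needed.
    if table_size_bytes <= threshold_bytes:
        return None
    return {
        'Subject': 'DynamoDB storage alert: ' + tablename,
        'Message': '{} is using {} bytes, which exceeds the {}-byte limit.'.format(
            tablename, table_size_bytes, threshold_bytes),
    }


def notify_if_oversized(tablename, table_size_bytes, threshold_bytes, topic_arn):
    # Publish the notification to SNS (requires AWS credentials).
    import boto3
    note = build_notification(tablename, table_size_bytes, threshold_bytes)
    if note:
        boto3.client('sns').publish(TopicArn=topic_arn, **note)


# Example (placeholder ARN; replace with your own topic):
# notify_if_oversized('hashrange1', 12 * 1024 ** 3, 10 * 1024 ** 3,
#                     'arn:aws:sns:us-east-1:123456789012:ddb-storage-alerts')
```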

If you have questions about the implementation of this solution, submit them in the Comments section below.


About the Author

Masudur Rahaman Sayem is a solutions architect at Amazon Web Services. He works with AWS customers to provide guidance and technical assistance about database projects, helping them improve the value of their solutions when using AWS.