AWS Database Blog

Upgrade Amazon DocumentDB 4.0 to 5.0 with near-zero downtime

Amazon DocumentDB (with MongoDB compatibility) is a fast, scalable, highly available, and fully managed document database service that supports MongoDB workloads. You can use the same MongoDB 3.6, 4.0, and 5.0 application code, drivers, and tools to run, manage, and scale workloads on Amazon DocumentDB without worrying about managing the underlying infrastructure. As a document database, Amazon DocumentDB makes it simple to store, query, and index JSON data.

With the launch of Amazon DocumentDB version 5.0, you can now perform a major version upgrade of your Amazon DocumentDB clusters from versions 3.6 and 4.0 to 5.0 to unlock multiple enhancements. Amazon DocumentDB 5.0 provides support for vector search, document compression, I/O-optimized storage, faster indexing with index build status, client-side field-level encryption (FLE), and more.

In this post, we explore how to perform an upgrade with near-zero downtime from Amazon DocumentDB 4.0 to 5.0 by using an in-place major version upgrade, Amazon DocumentDB volume cloning, and AWS Database Migration Service (AWS DMS).

Existing upgrade options

Currently, Amazon DocumentDB 4.0 users can perform a major version upgrade to Amazon DocumentDB 5.0 using the following approaches:

  • mongodump and mongorestore – You can use command line utilities such as mongodump and mongorestore to create a binary backup of your Amazon DocumentDB databases and restore them to a new Amazon DocumentDB 5.0 cluster. This approach takes the Amazon DocumentDB clusters offline during the upgrade and is best suited for workloads that can sustain downtime.
  • AWS DMS – You can use AWS DMS to migrate data from your existing clusters to a new Amazon DocumentDB 5.0 cluster. AWS DMS is a managed service and can be used to migrate existing data across supported sources and targets. This approach would incur additional DocumentDB I/O charges and AWS DMS usage charges. For more information, see Upgrading your Amazon DocumentDB cluster using AWS Database Migration Service.
  • In-place major version upgrade – Using this feature, you can perform an in-place upgrade of your Amazon DocumentDB cluster without the need to migrate data or change endpoints. However, it requires downtime, and the duration of the downtime depends on the number of databases, collections, and indexes. For more information, refer to the in-place MVU documentation.

Solution overview

This post discusses a hybrid approach to perform a near-zero downtime major version upgrade using an in-place major version upgrade, volume cloning, and AWS DMS. With this approach, you can also minimize the I/O costs and upgrade time usually associated with migrating the entire cluster data to a new endpoint. To follow along with this post, you need to perform the following steps:

  1. Enable change streams on your Amazon DocumentDB cluster.
  2. Clone your Amazon DocumentDB cluster.
  3. Perform an in-place major version upgrade on the cloned cluster.
  4. Replicate change data capture (CDC) using AWS DMS from your Amazon DocumentDB cluster to the cloned cluster.
  5. Change your application endpoint to the cloned cluster after the replication has caught up.
  6. Perform post-upgrade cleanup.

Prerequisites

To proceed further, you must have a high-level understanding of AWS DMS, in-place major version upgrade, and volume cloning. This solution will incur minimal cost in your account related to the Amazon DocumentDB change streams, the AWS DMS replication instance, and other resources. You can use the AWS Pricing Calculator to estimate the cost based on your configuration.

When upgrading an Amazon DocumentDB cluster to a higher version, check for any deprecated features and operators or changes in usage methods. Run the application against the newer version and make sure that the behavior and performance are the same as in previous versions, unless there are intentional modifications in the application.

Note that switching the endpoints will still incur some downtime. We strongly recommend performing multiple dry runs of this approach in lower environments before attempting it in a production environment.

Enable change streams on your Amazon DocumentDB cluster

To perform a major version upgrade of your Amazon DocumentDB cluster with minimal downtime, you must enable change streams on your cluster. Change streams provide a time-ordered sequence of update events that occur within your Amazon DocumentDB cluster.

You must set the change stream log retention duration to ensure there are no missed transactions in your CDC. The default duration of change stream log retention is 3 hours; you can configure this duration to any value between 1 hour and 7 days. We recommend configuring this attribute to at least 24 hours.

The following AWS Command Line Interface (AWS CLI) command increases the retention period to 24 hours (86,400 seconds):

aws docdb modify-db-cluster-parameter-group \
     --db-cluster-parameter-group-name <parameter group name> \
     --parameters "ParameterName=change_stream_log_retention_duration,ParameterValue=86400,ApplyMethod=immediate"

You can also modify this parameter from the Amazon DocumentDB console.
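
If you prefer to script this step, the same parameter change can be sketched with the AWS SDK for Python (Boto3). This is a minimal sketch, not an official procedure: the function names and the parameter group name you pass in are placeholders, and it assumes your credentials and Region are already configured.

```python
def retention_parameter(hours: int) -> dict:
    """Build the parameter entry for change_stream_log_retention_duration."""
    # The valid range (1 hour to 7 days) is taken from the text above.
    if not 1 <= hours <= 7 * 24:
        raise ValueError("retention must be between 1 hour and 7 days")
    return {
        "ParameterName": "change_stream_log_retention_duration",
        "ParameterValue": str(hours * 3600),  # the value is expressed in seconds
        "ApplyMethod": "immediate",
    }


def set_change_stream_retention(parameter_group_name: str, hours: int = 24) -> None:
    """Apply the retention setting to a custom cluster parameter group."""
    # Imported lazily so retention_parameter() can be used without the SDK installed.
    import boto3

    docdb = boto3.client("docdb")
    docdb.modify_db_cluster_parameter_group(
        DBClusterParameterGroupName=parameter_group_name,
        Parameters=[retention_parameter(hours)],
    )
```

Calling set_change_stream_retention("<parameter group name>") would request the same 24-hour retention as the AWS CLI command.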

To enable change streams on all databases, connect to the Amazon DocumentDB cluster and enable the change stream using the following command:

db.adminCommand({modifyChangeStreams: 1, database: "", collection: "", enable: true});

To confirm the creation of the change streams, list all of your cluster’s enabled change streams by using the $listChangeStreams aggregation pipeline stage. For more information, refer to Enabling Change Streams.
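
The enable and verify steps can also be driven from an application driver. The following is a rough sketch using PyMongo; the connection URI is a placeholder, and running the $listChangeStreams aggregation against the admin database is an assumption you should validate against your cluster.

```python
def enable_all_change_streams_command() -> dict:
    """Command document that enables change streams everywhere."""
    # Empty strings for database and collection mean "all databases and collections".
    return {"modifyChangeStreams": 1, "database": "", "collection": "", "enable": True}


def list_change_streams_command() -> dict:
    """Database-level aggregate command that lists the enabled change streams."""
    return {"aggregate": 1, "pipeline": [{"$listChangeStreams": 1}], "cursor": {}}


def enable_and_list_change_streams(uri: str) -> list:
    """Enable change streams, then return the first batch of enabled streams."""
    # Imported lazily so the command builders work without PyMongo installed.
    from pymongo import MongoClient

    client = MongoClient(uri)  # uri is a placeholder for your cluster connection string
    client.admin.command(enable_all_change_streams_command())
    # Assumption: the listing is issued against the admin database.
    reply = client.admin.command(list_change_streams_command())
    return reply["cursor"]["firstBatch"]
```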

Clone Amazon DocumentDB cluster

With Amazon DocumentDB cloning, you can create a new clone cluster that uses the same Amazon DocumentDB cluster volume and has the same data as your Amazon DocumentDB production cluster. Creating a clone is faster and more space-efficient than physically copying the data using other techniques, such as restoring a snapshot.

To create a clone of your Amazon DocumentDB production cluster, complete the following steps:

  1. On the Amazon DocumentDB console, in the navigation pane, choose Clusters.
  2. Select your Amazon DocumentDB production cluster and on the Actions menu, choose Create clone.
  3. For Cluster identifier, enter the name that you want to give to your cloned Amazon DocumentDB cluster (for example, cloned-docdb-cluster).
  4. For instance configuration, network settings, encryption at rest, log exports, port, and deletion protection, select the same settings as your Amazon DocumentDB production cluster.
    To learn more about Amazon DocumentDB cluster and instance settings, see Managing Amazon DocumentDB Clusters.
  5. Choose Create clone to launch the clone of your chosen Amazon DocumentDB cluster.
    When the clone is created, it’s listed with your other Amazon DocumentDB clusters on the Clusters page and displays its current state. Your clone is ready to use when its state is Available.
  6. On the Clusters page, select the cloned cluster, navigate to the Configuration tab, and note the cluster’s creation time.
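
The console steps above could be approximated with Boto3 as follows. This is a sketch under stated assumptions: the identifiers and instance class are placeholders, the instance name suffix is hypothetical, and settings such as security groups, parameter groups, and log exports are omitted and would need to be added to match your production cluster.

```python
def clone_request(source_cluster_id: str, clone_cluster_id: str) -> dict:
    """Arguments that request a copy-on-write clone of the source cluster."""
    return {
        "SourceDBClusterIdentifier": source_cluster_id,
        "DBClusterIdentifier": clone_cluster_id,
        "RestoreType": "copy-on-write",  # clone instead of a full copy
        "UseLatestRestorableTime": True,
    }


def create_clone(source_cluster_id: str, clone_cluster_id: str, instance_class: str):
    """Create the clone, add one instance, and return the clone's creation time."""
    # Imported lazily so clone_request() can be used without the SDK installed.
    import boto3

    docdb = boto3.client("docdb")
    docdb.restore_db_cluster_to_point_in_time(
        **clone_request(source_cluster_id, clone_cluster_id)
    )
    instance_id = f"{clone_cluster_id}-1"  # hypothetical naming scheme
    docdb.create_db_instance(
        DBInstanceIdentifier=instance_id,
        DBInstanceClass=instance_class,
        Engine="docdb",
        DBClusterIdentifier=clone_cluster_id,
    )
    docdb.get_waiter("db_instance_available").wait(DBInstanceIdentifier=instance_id)
    cluster = docdb.describe_db_clusters(DBClusterIdentifier=clone_cluster_id)
    # Note this timestamp: it is the reference point for the CDC start time later.
    return cluster["DBClusters"][0]["ClusterCreateTime"]
```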

Perform an in-place major version upgrade on the cloned cluster

This step upgrades the cloned Amazon DocumentDB 4.0 cluster to 5.0 without migrating data or changing endpoints. You will not incur any additional charges to perform an in-place major version upgrade on your cloned cluster.

Make sure you complete all the prerequisite steps before performing an in-place major version upgrade. For more information, refer to Amazon DocumentDB in-place major version upgrade.

Subscribe to the cloned cluster’s maintenance events by following the steps in Subscribing to Amazon DocumentDB Event Subscriptions. Then complete the following steps to upgrade the cluster:

  1. On the Clusters page on the Amazon DocumentDB console, select the cloned cluster and on the Actions menu, choose Modify.
  2. For Cluster identifier, enter a name for your cluster.
  3. For Engine version, choose 5.0.0.
  4. Specify your VPC security group.
  5. In the Cluster options section, choose the appropriate default or custom cluster parameter group and choose Continue.
  6. In the Scheduling of modifications section, choose Apply immediately.
  7. Choose Modify cluster to begin the in-place upgrade of your cluster.
    The status of your cluster will now change to Upgrading. When the upgrade is complete, your cluster status changes back to Available and you receive the “Database cluster major version has been upgraded” event. You can track the progress of your upgrade by monitoring the Events page.
  8. When the in-place major version upgrade is complete, perform sanity checks to ensure your upgraded clone is functional and all data and indexes are intact.

Note: Take care not to modify any data on your cloned cluster, because doing so can result in data inconsistency between the source and the clone.
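
As an alternative to the console, the upgrade request can be sketched with Boto3. The function names are illustrative, the cluster identifier is a placeholder, and the parameter group and security group choices from the console steps are left out of this sketch.

```python
def upgrade_request(cluster_id: str, target_version: str = "5.0.0") -> dict:
    """Arguments for an in-place major version upgrade applied immediately."""
    return {
        "DBClusterIdentifier": cluster_id,
        "EngineVersion": target_version,
        "AllowMajorVersionUpgrade": True,
        "ApplyImmediately": True,
    }


def start_upgrade(cluster_id: str) -> None:
    """Start the in-place upgrade on the cloned cluster."""
    # Imported lazily so upgrade_request() can be used without the SDK installed.
    import boto3

    boto3.client("docdb").modify_db_cluster(**upgrade_request(cluster_id))


def cluster_status(cluster_id: str) -> tuple:
    """Return (status, engine version) so you can poll until the upgrade completes."""
    import boto3

    reply = boto3.client("docdb").describe_db_clusters(DBClusterIdentifier=cluster_id)
    cluster = reply["DBClusters"][0]
    return cluster["Status"], cluster["EngineVersion"]
```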

Replicate CDC from your source cluster to the cloned cluster using AWS DMS

This step gets your cloned cluster in sync with your Amazon DocumentDB source cluster by replicating all the database changes since the clone was created. An AWS DMS replication instance performs CDC by connecting to and reading data from your Amazon DocumentDB production cluster and writing it to your target cloned cluster.

Create an AWS DMS replication instance

To create your replication instance, complete the following steps:

  1. On the AWS DMS console, choose Create replication instance.
  2. Enter a name (for example, docdb40todocdb50) and an optional description.
  3. For Instance class, choose the size based on your needs.
  4. For Engine version, choose 3.5.1.
  5. For Amazon VPC, choose the VPC that houses your source and target Amazon DocumentDB clusters.
  6. For Allocated storage, use the default of 50 GiB. If you have a high write throughput workload, increase this value to match your workload.
  7. For Multi-AZ, choose Yes if you need high availability and failover support.
  8. For Publicly accessible, enable this option.
  9. Choose Create replication instance.
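
The replication instance settings above map onto a single Boto3 call. The sketch below mirrors the console choices in this post; the subnet group identifier and security group IDs are placeholders that you would need to create or look up yourself.

```python
def replication_instance_request(
    identifier: str,
    instance_class: str,
    subnet_group_id: str,
    security_group_ids: list,
    multi_az: bool = False,
) -> dict:
    """Arguments mirroring the console settings described in the steps above."""
    return {
        "ReplicationInstanceIdentifier": identifier,
        "ReplicationInstanceClass": instance_class,
        "AllocatedStorage": 50,  # GiB; increase for high write throughput
        "EngineVersion": "3.5.1",
        "MultiAZ": multi_az,
        "PubliclyAccessible": True,  # as chosen in the console steps
        "ReplicationSubnetGroupIdentifier": subnet_group_id,
        "VpcSecurityGroupIds": list(security_group_ids),
    }


def create_replication_instance(**kwargs) -> str:
    """Create the instance, wait until it is available, and return its ARN."""
    # Imported lazily so the request builder can be used without the SDK installed.
    import boto3

    dms = boto3.client("dms")
    request = replication_instance_request(**kwargs)
    reply = dms.create_replication_instance(**request)
    arn = reply["ReplicationInstance"]["ReplicationInstanceArn"]
    dms.get_waiter("replication_instance_available").wait(
        Filters=[{"Name": "replication-instance-arn", "Values": [arn]}]
    )
    return arn
```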

Create AWS DMS source and target endpoints

To create your source endpoint, complete the following steps:

  1. On the AWS DMS console, choose Endpoints in the navigation pane.
  2. Choose Create endpoint.
  3. For Endpoint type, choose Source.
  4. For Endpoint identifier, enter a name that’s easy to remember, for example docdb40-source.
  5. For Source engine, choose Amazon DocumentDB.
  6. For Server name, enter the DNS name of your Amazon DocumentDB production cluster.
  7. For Port, enter the port number of your Amazon DocumentDB production cluster.
  8. For SSL mode, choose verify-full.
  9. For CA certificate, choose Add new CA certificate and complete the following steps:
    1. Download the new CA certificate to create the TLS connections bundle.
    2. For Certificate identifier, enter global-bundle.pem.
    3. For Import certificate file, choose Choose file and navigate to the .pem file that you downloaded.
    4. Select and open the file.
    5. Choose Import certificate, then choose global-bundle.pem on the Choose a certificate drop-down menu.
  10. For Username, enter the primary user name of your source cluster.
  11. For Password, enter the primary password of your source cluster.
  12. For Database name, leave blank if you want to replicate CDC from all databases on your Amazon DocumentDB cluster. Alternatively, you can specify specific databases using table mappings.
  13. Test your endpoint connection and verify that the connection works.
  14. Choose Create endpoint.
  15. Create a second endpoint, choose Target for Endpoint type, and provide your cloned cluster details.
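
Because the source and target endpoints differ only in type and connection details, one helper can describe both. The following Boto3 sketch assumes the CA certificate has already been imported into AWS DMS and that you have its ARN; all identifiers, host names, and credentials are placeholders.

```python
def docdb_endpoint_request(
    identifier: str,
    endpoint_type: str,
    host: str,
    port: int,
    username: str,
    password: str,
    certificate_arn: str,
) -> dict:
    """Arguments for an Amazon DocumentDB endpoint that verifies TLS fully."""
    if endpoint_type not in ("source", "target"):
        raise ValueError("endpoint_type must be 'source' or 'target'")
    return {
        "EndpointIdentifier": identifier,
        "EndpointType": endpoint_type,
        "EngineName": "docdb",
        "ServerName": host,
        "Port": port,
        "Username": username,
        "Password": password,
        "SslMode": "verify-full",
        "CertificateArn": certificate_arn,  # ARN of the imported CA bundle
    }


def create_endpoint(**kwargs) -> str:
    """Create the endpoint and return its ARN."""
    # Imported lazily so the request builder can be used without the SDK installed.
    import boto3

    reply = boto3.client("dms").create_endpoint(**docdb_endpoint_request(**kwargs))
    return reply["Endpoint"]["EndpointArn"]
```

You would call create_endpoint once with endpoint_type="source" and the production cluster details, and once with endpoint_type="target" and the cloned cluster details.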

Create an AWS DMS migration task

An AWS DMS task binds the replication instance with your source and target endpoints. When you create a migration task, you specify the source endpoint, target endpoint, replication instance, and any desired migration settings. An AWS DMS task can be created with three migration types: “migrate existing data”, “migrate existing data and replicate ongoing changes”, or “replicate data changes only”.

Because you created the target cluster as a volume clone, it already contains all data up to the moment the clone was created. You only need to copy the changes made on the source after that point, so you use the “replicate data changes only” migration type. With this option, AWS DMS replicates the changes from your source cluster to the cloned cluster while keeping your source cluster operational. Eventually, the source and target databases will be in sync, allowing for a near-zero downtime migration.

Complete the following steps to create your migration task:

  1. On the AWS DMS console, choose Tasks in the navigation pane.
  2. Choose Create task.
  3. For Task name, enter a name (for example, mvu-cdc-task).
  4. For Replication instance, choose the replication instance you created.
  5. For Source endpoint, choose the source endpoint you created.
  6. For Target endpoint, choose the target endpoint you created.
  7. For Migration type, choose Replicate data changes only.
  8. In the Task settings section, select Wizard and select Enable custom CDC start mode.
  9. Specify the start time to be 2 minutes earlier than the cloned cluster creation time that you noted earlier.
  10. For Target table preparation mode, select Do nothing.
  11. For LOB column settings, select Limited LOB mode with the default maximum LOB size (32).
  12. Select Turn on CloudWatch logs.
  13. Leave the default settings in Advanced task settings.
  14. For Table mappings, select Wizard.
  15. Under Selection rules, choose Enter a schema for schema, % for source name, % for target name, and include for action.
  16. Choose Create task.
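
The two details that are easiest to get wrong in this task are the CDC start time and the selection rule. The sketch below computes both and passes them to Boto3; the task identifier and ARNs are placeholders, and supplying only a partial task settings document (with the remaining settings left at their defaults) is an assumption of this sketch.

```python
import json
from datetime import datetime, timedelta


def cdc_start_time(clone_created_at: datetime, margin_minutes: int = 2) -> datetime:
    """Start CDC slightly before the clone was created so no change is missed."""
    return clone_created_at - timedelta(minutes=margin_minutes)


def include_everything_mapping() -> str:
    """Selection rule that includes every collection in every database."""
    rule = {
        "rule-type": "selection",
        "rule-id": "1",
        "rule-name": "1",
        "object-locator": {"schema-name": "%", "table-name": "%"},
        "rule-action": "include",
    }
    return json.dumps({"rules": [rule]})


def task_settings() -> str:
    """Partial settings: leave target collections untouched and turn on logging."""
    return json.dumps({
        "FullLoadSettings": {"TargetTablePrepMode": "DO_NOTHING"},
        "Logging": {"EnableLogging": True},
    })


def create_cdc_task(task_id, source_arn, target_arn, instance_arn, clone_created_at):
    """Create a 'replicate data changes only' task and return its ARN."""
    # Imported lazily so the helpers above can be used without the SDK installed.
    import boto3

    reply = boto3.client("dms").create_replication_task(
        ReplicationTaskIdentifier=task_id,
        SourceEndpointArn=source_arn,
        TargetEndpointArn=target_arn,
        ReplicationInstanceArn=instance_arn,
        MigrationType="cdc",
        TableMappings=include_everything_mapping(),
        ReplicationTaskSettings=task_settings(),
        CdcStartTime=cdc_start_time(clone_created_at),
    )
    return reply["ReplicationTask"]["ReplicationTaskArn"]
```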

Monitor and verify

AWS DMS now begins replicating CDC from your source Amazon DocumentDB cluster to your target Amazon DocumentDB cluster. The task status should change from Starting to Replication ongoing. You can monitor the progress on the Tasks page on the AWS DMS console. Depending on the volume of changes, catching up could take several minutes or even hours. You can increase the CDC throughput by adding parallel threads to the DMS task.

Eventually, your source and target will be in sync. You can verify whether they are in sync by running a count() operation on your collections to verify all change events have migrated.

You can also use the Amazon DocumentDB DataDiffer tool to perform the data validation.
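
A count comparison across both clusters can be sketched with PyMongo as follows. The connection URIs are placeholders and the list of skipped system databases is an assumption. Matching counts are only a coarse signal and do not prove that document contents are identical, and counts can differ briefly while writes are still arriving on the source.

```python
def count_mismatches(source_counts: dict, target_counts: dict) -> dict:
    """Return {namespace: (source_count, target_count)} for every difference."""
    namespaces = sorted(set(source_counts) | set(target_counts))
    return {
        ns: (source_counts.get(ns), target_counts.get(ns))
        for ns in namespaces
        if source_counts.get(ns) != target_counts.get(ns)
    }


def collection_counts(uri: str) -> dict:
    """Count documents in every user collection, keyed by 'database.collection'."""
    # Imported lazily so count_mismatches() can be used without PyMongo installed.
    from pymongo import MongoClient

    client = MongoClient(uri)
    counts = {}
    for db_name in client.list_database_names():
        if db_name in ("admin", "local", "config"):  # assumed system databases
            continue
        database = client[db_name]
        for coll_name in database.list_collection_names():
            counts[f"{db_name}.{coll_name}"] = database[coll_name].count_documents({})
    return counts
```

An empty result from count_mismatches(collection_counts(source_uri), collection_counts(target_uri)) suggests that the replication has caught up.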

Change your application endpoint to the 5.0 cluster after the replication has caught up

After you have confirmed that your new Amazon DocumentDB 5.0 cluster is in sync and all checks have passed, you’re ready to change your application’s database connection endpoint from your Amazon DocumentDB 4.0 cluster to your Amazon DocumentDB 5.0 cluster.

Perform post-upgrade cleanup

You can perform the following steps to clean up resources:

  1. Delete your original Amazon DocumentDB 4.0 cluster.
  2. Disable the change streams on your Amazon DocumentDB 5.0 production cluster.
  3. Delete any AWS DMS instance, replication tasks, and endpoints as needed.
  4. To add additional instances to your Amazon DocumentDB 5.0 cluster to match up to your Amazon DocumentDB production cluster, select the cloned cluster on the Clusters page of the Amazon DocumentDB console, and on the Actions menu, choose Add instances.
  5. Copy or set up monitoring and alerts that were on the Amazon DocumentDB 4.0 cluster.

Conclusion

In this post, we showed you how to upgrade from Amazon DocumentDB 4.0 to 5.0 with minimal costs and near-zero downtime by using an in-place major version upgrade, volume cloning, and AWS DMS.

Amazon DocumentDB 5.0 provides support for vector search, document compression, I/O-optimized storage, faster indexing with index build status, client-side field-level encryption (FLE), and more. By performing a major version upgrade of your Amazon DocumentDB clusters from versions 3.6 and 4.0 to 5.0, you can unlock multiple enhancements. Review the documentation for more information.


About the Authors

Kunal Agarwal is a Senior Product Manager at Amazon Web Services. Kunal is passionate about data and loves building scalable products to solve customer problems. Prior to joining AWS, Kunal spent 12 years in product management and strategy in the Technology industry.

Anshu Vajpayee is a Senior DocumentDB Specialist Solutions Architect at Amazon Web Services (AWS). He has been helping customers to adopt NoSQL databases and modernize applications leveraging Amazon DocumentDB. Before joining AWS, he worked extensively with relational and NoSQL databases.