AWS Database Blog
Empirically test and measure queries from Amazon DynamoDB
This post describes how our team produced simple, inexpensive output for the empirical analysis of Amazon DynamoDB queries. Our goal was to take a legacy DynamoDB database, transform the data in a novel way, and then store it in a new DynamoDB database. After eight months of hard work on our project, we decided to stop reasoning about our system theoretically, and to start empirically testing and measuring it instead.
We had proven correctness for a myriad of legacy use cases, but the system’s performance was ultimately unacceptable because it was expensive and slow. Many legacy services sat on top of our work. We felt stuck and exhausted from reading code and trying to follow data through the many legacy services, so we decided to approach our project in a different way.
The empirical mindset
You may be used to the development cycle of writing code, running it, and iterating on it until you get a desired result. If the answer is wrong, you go back to the code, load some data in your mental memory, and trace the execution. If you’re especially stymied, you add a few println statements to track the state of the system.
On a small scale, this theoretical approach is manageable. On a large scale, it can be nearly impossible to reason about data across several systems. That’s why you should abandon the theoretical for the empirical for large-scale cases.
Our team wrote some inexpensive and simple code using RequestMetricCollector to log standardized DynamoDB queries. Standardizing queries means creating a form of the query as a string without the data that makes it unique—stripping out all data until only the fields, table name, and index names remain.
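For example, the standardized form of a GetItem request is the raw request with its unique values blanked out. Using the sample record that appears later in this post:

```
Raw:          {TableName: LogTest,Key: {uuid={S: 73bed7db-55d8-4ecc-b516-ddb0672bc31a,}},}
Standardized: {TableName: LogTest,Key: {uuid={S:,}},}
```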
The RequestMetricCollector class can be used to intercept calls to AWS made with the AWS SDK, so this approach should work for other AWS services as well. It exposes a collectMetrics function that receives generic request and response objects. The documentation focuses on the possibilities for metrics, but non-metric data is available as well. The approach is low cost because it doesn’t need to integrate with any other services or systems.
All the work is done in memory. It’s natural because the output is a list of standardized queries that can be read by a human. There is no need to put the query data into another data store for analysis.
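To make the interception point concrete, here is a minimal sketch of a collector that simply prints each request. The class name PrintingCollector is hypothetical; this is not the full logger developed later in this post:

```java
import com.amazonaws.Request;
import com.amazonaws.Response;
import com.amazonaws.metrics.RequestMetricCollector;

// Minimal sketch: every call made through the SDK client passes through
// collectMetrics, where both the request and the response are visible.
public class PrintingCollector extends RequestMetricCollector {
    @Override
    public void collectMetrics(Request<?> request, Response<?> response) {
        System.out.println(request.getOriginalRequest());
    }
}
```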
The diagram below starts with a complex architecture composed of many services that are inherently difficult to reason about. One of the components calls out to DynamoDB to persist data. Intercept these calls with RequestMetricCollector and log information about each request, which helps you solve some of the problems listed in the next section.
A workflow diagram illustrating how Service 1 interacts with Services 2, 3, 4, and so on, and how those services interact with Service N, which connects with the DynamoDB table. The DynamoDB table interacts with the AWS SDK (RequestMetricCollector), which in turn logs on your host or server.

The diagram highlights the following elements:
- A complex architecture with many interconnected services
- An implementation of collectMetrics(request, response)
- Logs on your host or server, for example:

PerformanceMetrics: @@ 4 @@ {TableName: LogTest,Key: {uuid={S: 73bed7db-55d8-4ecc-b516-ddb0672bc31a,}},} @@ {TableName: LogTest,Key: {uuid={S:,}},}
Common use cases for empirical testing
The empirical testing method helps solve the following types of problems:
- Performance analysis: Tuning your DynamoDB tables for one type of query means that other types of queries are not as efficient; you must make performance tradeoffs. How can you know the reasons for poor performance? Which specific query parameters are causing queries to run more slowly than you expect? An empirical approach helps answer these questions.
- Refactoring of legacy systems that sit above your DynamoDB database: The word “legacy” usually implies a system that you, as an engineer, are not familiar with. It’s hard enough debugging your own code, much less code written by several other engineers, and working backwards from your reads and writes to DynamoDB to the original request can be daunting. For our project, our team found that we couldn’t successfully connect legacy input to our system with the reads and writes happening at the DynamoDB level. Not only did our empirical approach let us work with a smaller problem space (much smaller than the theoretical one), it also gave us the exact shape of the problem.
- Root cause analysis for data-related bugs: When you look for the root cause of bugs, the explanation often appears when you observe bad, or wrong, queries in your logs. Suddenly, the solution becomes clear and the bug can be quickly fixed. The Example code section generates logs with standardized queries and an example query at the DynamoDB layer to help you find the root cause for your bug. The code helps to remove the temptation of guesswork from only reading code. The other benefit of the approach is that the code can be injected temporarily, and then removed quickly without impact.
Example code
In this section, I look at three Java classes that enable you to quickly log DynamoDB request metadata. Two classes are meant to be integrated into your code – DDBQueryLogger.java and DDBQueryInfo.java. The third class is an example AWS Lambda function written in Java. I include information about how to re-create this test in your own AWS account.
After you have created the classes in your code base, how do you wire them up? Locate the code (or configuration) that constructs your DynamoDB client and add the DDBQueryLogger metric collector to it. The LogTest.java class shows an example using the withMetricsCollector() function. When you are ready to disable logging, you can remove this line.
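For reference, the one-line change looks something like the following sketch; the exact client construction details will vary in your code base:

```java
// Attach the collector when building the DynamoDB client (AWS SDK for Java v1).
AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard()
        .withMetricsCollector(new DDBQueryLogger()) // remove this line to disable logging
        .build();
```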
Now, consider how the two classes do their work. You are intentionally trading performance and accuracy for quick insights. One example is the regular expressions used to standardize queries: they are computationally expensive, but easy to write and understand. Other concerns, like the thread safety of the in-memory data structure, are handled on a “best effort” basis.
The two classes perform the following steps:
- Log every request to DynamoDB from your system.
- Standardize the textual representation of the query.
- Add the result to an in-memory data structure.
- Flush the in-memory data structure to your logs every five minutes.
Further analysis of the output can be done offline.
DDBQueryLogger.java:
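The original source isn’t reproduced here, but a minimal sketch of the approach might look like the following. The size cap, the regex, and the flush mechanics are illustrative assumptions; only the overall shape (intercept, standardize, count, flush) comes from the post.

```java
import com.amazonaws.Request;
import com.amazonaws.Response;
import com.amazonaws.metrics.RequestMetricCollector;

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DDBQueryLogger extends RequestMetricCollector {

    // Assumption: cap the map so memory cannot grow indefinitely.
    private static final int MAX_ENTRIES = 1000;
    private static final long FLUSH_INTERVAL_MS = 5 * 60 * 1000L;

    // Standardized query -> count plus one raw example ("best effort" thread safety).
    private final Map<String, DDBQueryInfo> queries = new ConcurrentHashMap<>();
    private volatile long lastFlush = System.currentTimeMillis();

    @Override
    public void collectMetrics(Request<?> request, Response<?> response) {
        // toString() on the original request includes the table name, keys, and values.
        String raw = request.getOriginalRequest().toString();
        String standardized = standardize(raw);

        if (queries.containsKey(standardized) || queries.size() < MAX_ENTRIES) {
            queries.computeIfAbsent(standardized, k -> new DDBQueryInfo(raw)).increment();
        }
        maybeFlush();
    }

    // Strip unique data so that only field names, the table name, and index
    // names remain. This regex is an illustrative assumption that blanks out
    // string attribute values such as {S: some-value,}.
    private String standardize(String raw) {
        return raw.replaceAll("\\{S: [^,}]*", "{S:");
    }

    // Flush the accumulated queries to the logs every five minutes.
    private synchronized void maybeFlush() {
        long now = System.currentTimeMillis();
        if (now - lastFlush >= FLUSH_INTERVAL_MS) {
            for (Map.Entry<String, DDBQueryInfo> entry : queries.entrySet()) {
                System.out.println("PerformanceMetrics: " + entry.getValue().format(entry.getKey()));
            }
            queries.clear();
            lastFlush = now;
        }
    }
}
```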
DDBQueryInfo.java:
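Again, a sketch rather than the original source. The @@-delimited format matches the sample output shown in this post, but the field and method names are assumptions:

```java
import java.util.concurrent.atomic.AtomicLong;

public class DDBQueryInfo {

    private final AtomicLong count = new AtomicLong();
    private final String example; // one raw query kept as a concrete example

    public DDBQueryInfo(String example) {
        this.example = example;
    }

    public void increment() {
        count.incrementAndGet();
    }

    // Output format: count, example, and standardized query, delimited by @@.
    // Change this to suit your own log-reading preferences.
    public String format(String standardized) {
        return "@@ " + count.get() + " @@ " + example + " @@ " + standardized;
    }
}
```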
LogTest.java:
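A plausible sketch of the Lambda function, assuming the two classes above; the handler shape and the loop count are illustrative, though the table name and uuid value come from the sample output in this post:

```java
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;

import java.util.Collections;

public class LogTest implements RequestHandler<Object, String> {

    // The one-line change: attach the metric collector to the DynamoDB client.
    private final AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard()
            .withMetricsCollector(new DDBQueryLogger())
            .build();

    @Override
    public String handleRequest(Object input, Context context) {
        // Read the hardcoded item from the LogTest table a few times so the
        // standardized query accumulates a count.
        for (int i = 0; i < 4; i++) {
            client.getItem("LogTest", Collections.singletonMap("uuid",
                    new AttributeValue("73bed7db-55d8-4ecc-b516-ddb0672bc31a")));
        }
        return "done";
    }
}
```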
I used Gradle to package up the three classes in this post to run against a DynamoDB table called LogTest. The table had one field named uuid, with the hardcoded value from the sample code in it. For information about how to package up a Lambda function written in Java with Gradle, see Creating a ZIP Deployment Package for a Java Function.
Here is the build.gradle file that I used:
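The original file isn’t reproduced here; the following is a plausible reconstruction based on the ZIP-packaging approach in the Lambda documentation referenced above. Dependency versions are assumptions:

```groovy
apply plugin: 'java'

repositories {
    mavenCentral()
}

dependencies {
    implementation 'com.amazonaws:aws-java-sdk-dynamodb:1.11.400'
    implementation 'com.amazonaws:aws-lambda-java-core:1.2.0'
}

// Build a ZIP with the compiled classes at the root and all runtime
// dependencies under lib/, the layout Lambda expects for Java functions.
task buildZip(type: Zip) {
    from compileJava
    from processResources
    into('lib') {
        from configurations.runtimeClasspath
    }
}

build.dependsOn buildZip
```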
Code explanation
The mission of the code is to integrate easily with an existing system (a one-line change to the DynamoDB client object), perform some measurements, and then be removed just as easily. You don’t want to introduce any new dependencies into the system under test. Here is a list of the design decisions made:
- Store all DynamoDB queries in memory. Checks prevent the data structure from growing indefinitely (DDBQueryLogger.java, lines 47–50).
- Standardize queries to DynamoDB. This groups queries into classes, which keeps the memory footprint small and makes analysis tractable (DDBQueryLogger.java, lines 60–64).
- Flush the data structure to the logs periodically. This makes collecting data a bit more painful, but prevents additional dependencies and complexity.
- Keep an example for every standardized query. You might be able to reuse the data you see to dig into a specific use case.
Understanding the output
The output of the DDBQueryInfo class is arbitrary. It reflects my personal preference for quickly working with logs. You are encouraged to change the output to suit your needs (DDBQueryInfo.java, line 29).
Here’s an example from running the LogTest Lambda function:
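```
PerformanceMetrics: @@ 4 @@ {TableName: LogTest,Key: {uuid={S: 73bed7db-55d8-4ecc-b516-ddb0672bc31a,}},} @@ {TableName: LogTest,Key: {uuid={S:,}},}
```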
By using the @@ string as a delimiter and ignoring the PerformanceMetrics: prefix, you can see that there are three fields in this record:
- The number of times you’ve seen the standardized version of the DynamoDB query.
- An example of the query, with its unique values intact. I’ve found this useful when tracing data through the code.
- The standardized query, with the unique values removed.
With these fields, you can begin to answer questions about how your code is calling DynamoDB. It is especially useful for considering what types of indexes you might want to add to your table, or why certain use cases run more slowly than you’d like.
The code in this post does not attempt to diagnose slow queries directly. Consider modifying DDBQueryLogger.java at line 52 to include other data, like consumed capacity.
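As a hypothetical illustration (none of this is in the original code), a helper like the following could be added to DDBQueryLogger to pull consumed capacity from GetItem responses. It assumes the requests set ReturnConsumedCapacity, and it uses GetItemResult and ConsumedCapacity from com.amazonaws.services.dynamodbv2.model:

```java
// Hypothetical helper for DDBQueryLogger: extract consumed capacity units from
// a GetItem response. Returns null unless the original request asked DynamoDB
// to return consumed capacity (ReturnConsumedCapacity).
private Double consumedCapacityUnits(Response<?> response) {
    Object result = response.getAwsResponse();
    if (result instanceof GetItemResult) {
        ConsumedCapacity capacity = ((GetItemResult) result).getConsumedCapacity();
        if (capacity != null) {
            return capacity.getCapacityUnits();
        }
    }
    return null;
}
```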
Though only the query requests to DynamoDB are logged, you can see which fields in “hot,” frequently run queries might benefit from being indexed.
Conclusion
This empirical approach is a useful way to analyze DynamoDB queries. The code in this post should be adaptable to your specific problem and use case.
For example, you might extend the code to also track consumed capacity at a finer level of detail. You might also change the output format (DDBQueryInfo.java, line 29) to something you’re more comfortable with. Hopefully, this post has helped you solve some of your problems so you can keep using DynamoDB to delight your customers.
About the Author
Ryan Meyer is a Senior Engineer with Amazon Digital Goods.