AWS Database Blog

Enhancing performance of Amazon RDS for Oracle with NVMe SSD hosted Smart Flash Cache and Temporary Tablespaces

Amazon Relational Database Service (Amazon RDS) for Oracle supports using the instance store for temporary tablespaces and for Database Smart Flash Cache. An instance store for RDS for Oracle is based on Non-Volatile Memory Express (NVMe) SSD devices that are physically attached to the host computer. The storage is optimized for low latency, random I/O performance, and sequential read throughput. At the time of writing, Amazon RDS for Oracle provides the following instance types with an instance store: m5d, r5d, x2idn, and x2iedn. Refer to the RDS for Oracle documentation for the most up-to-date list of supported instances.

In this post, we discuss temporary tablespace and Flash Cache features with local NVMe SSD-based instance storage, configuration options, typical use cases, and feature availability by engine and storage configuration. We dive deep into the tiered cache capability and how it can improve the query performance of latency-sensitive workloads. We also provide an overview of the temporary object capability.

Benefits of placing temporary tablespaces on instance storage

Oracle Database uses temporary tablespaces to store intermediate query results that don’t fit in memory. Large queries can generate large amounts of intermediate data that needs to be cached temporarily but doesn’t need to persist. In particular, a temporary tablespace is useful for sorts, hash aggregations, joins, and as storage for global temporary tables. Placing a temporary tablespace in an instance store is supported on all editions of Oracle Database running on RDS for Oracle instances. The following diagram illustrates this architecture.

Benefits of placing Flash Cache on instance storage

In the conventional read path, Oracle Database uses System Global Area (SGA) memory segments to store blocks requested by read operations as well as blocks that have been requested for modification. Flash Cache allows Oracle Database to extend this memory with a second-level cache on local NVMe SSD-based storage. When dirty buffers are evicted from the SGA, they are first written to Amazon Elastic Block Store (Amazon EBS) storage and then written to Flash Cache. If unmodified blocks need to be evicted from the SGA, they are written directly to Flash Cache. When a session later requests blocks that aren’t in the SGA, they are quickly fetched from Flash Cache instead of from the Amazon EBS layer. In our tests, this fetch was about 75% faster than a fetch from Amazon EBS. If your RDS for Oracle DB instance uses Enterprise Edition, you can place Flash Cache in an instance store.

There are other ways to augment the memory available to an RDS for Oracle DB instance, such as Amazon ElastiCache, which is compatible with the Valkey, Redis, and Memcached engines. However, using ElastiCache might require application changes, because ElastiCache is a cache-aside mechanism, whereas Flash Cache is a cache-through mechanism. Flash Cache is a simpler and cost-effective way to improve the performance of workloads with higher memory demands. The following diagram illustrates this architecture.

Configure local NVMe SSD storage for a temporary tablespace

By placing your temporary tablespace on the instance store, you deliver an immediate performance boost to queries that use temporary space. This significantly speeds up queries that spill to temp if they are temp I/O bound. Placing a temporary tablespace on local NVMe SSD storage can also reduce read latency, increase throughput, and lower the number of IOPS you need to provision for EBS storage, which might allow your workload to fit on an instance with lower IOPS and throughput limits. It also frees up the EBS storage space that was previously allocated to temp files. In addition, the temp files are no longer counted against the high-water mark of RDS for Oracle instance storage consumption and aren’t included in RDS for Oracle instance backups and snapshots.

By default, 100% of the instance store space is allocated to the temporary tablespace. To customize the temporary tablespace allocation, modify the parameter rds.instance_store_temp_size in the custom parameter group associated with your RDS for Oracle instance. If you don’t already have one, create a custom parameter group and associate it with the instance. You can set rds.instance_store_temp_size to one of the values {DBInstanceStore*0/10}, {DBInstanceStore*2/10}, {DBInstanceStore*4/10}, {DBInstanceStore*6/10}, {DBInstanceStore*8/10}, or {DBInstanceStore*10/10}. If you modify this parameter, reboot the instance so that the change takes effect. For example, to configure 20% of your instance store as temporary tablespace, set rds.instance_store_temp_size to {DBInstanceStore*2/10}. You can independently configure all or part of the remaining 80% of instance store space as Flash Cache. In addition to configuring the parameter, you also have to create a temporary tablespace and designate it as the default temporary tablespace. To create a temporary tablespace on the instance store, use the Amazon RDS procedure rdsadmin.rdsadmin_util.create_inst_store_tmp_tblspace. To set the default temporary tablespace, use the Amazon RDS procedure rdsadmin.rdsadmin_util.alter_default_temp_tablespace. If you need more temporary storage than the maximum setting of rds.instance_store_temp_size allows, you can configure a temporary tablespace group that includes temporary tablespaces from both the instance store and Amazon EBS.
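
For example, after the parameter change takes effect, you can create the instance store temporary tablespace and make it the default from SQL*Plus. The following is a minimal sketch using hypothetical names (temp_inst and temp_group); verify the exact procedure arguments in the RDS for Oracle user guide before running it in your environment.

-- Create a temporary tablespace on the instance store (temp_inst is a hypothetical name)
EXEC rdsadmin.rdsadmin_util.create_inst_store_tmp_tblspace('temp_inst');

-- Designate the new tablespace as the database default temporary tablespace
EXEC rdsadmin.rdsadmin_util.alter_default_temp_tablespace('temp_inst');

-- Optionally, use standard Oracle syntax to add it to a temporary tablespace group
-- that also contains an EBS-backed temporary tablespace for extra capacity
ALTER TABLESPACE temp_inst TABLESPACE GROUP temp_group;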

Configure local NVMe storage for Flash Cache

Flash Cache improves the performance of single-block random reads in the conventional path. A best practice is to size the cache to accommodate most of your active dataset. Oracle recommends a cache of 4–10 times the size of your SGA. If you have a database workload that is suffering throughput saturation during peak periods or you want to support a higher number of transactions per second (TPS), you can use one of the supported instance families to configure Flash Cache. If you’re running a larger RDS for Oracle instance due to a higher throughput requirement for your database, using Flash Cache might cut down your throughput requirements, enabling you to scale down to a smaller instance and resulting in cost optimization.

By default, 100% of the instance store space is allocated to the temporary tablespace. To allocate instance store space to Flash Cache, modify the db_flash_cache_size parameter. You configure Flash Cache as a percentage of the total instance store size available on your instance by setting db_flash_cache_size to one of the values {DBInstanceStore*0/10}, {DBInstanceStore*2/10}, {DBInstanceStore*4/10}, {DBInstanceStore*6/10}, {DBInstanceStore*8/10}, or {DBInstanceStore*10/10}. By default, the parameter is set to {DBInstanceStore*0/10}. For example, to configure 20% of your instance store as Flash Cache, set db_flash_cache_size to {DBInstanceStore*2/10}.
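
After the parameter change has been applied and the instance rebooted, you can confirm the Flash Cache allocation from within the database. The following sketch queries V$PARAMETER for the flash cache settings; the file location reported by db_flash_cache_file is managed by Amazon RDS.

-- Confirm the Flash Cache settings reported by the instance
SELECT name, value
FROM v$parameter
WHERE name IN ('db_flash_cache_file', 'db_flash_cache_size');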

Considerations for using local NVMe SSD storage for temporary tablespaces

SSDs have shown promising results for read I/O. You would typically see a considerable performance gain from using local storage for temporary tablespaces if your workload shows a higher number of waits on direct path read temp compared to direct path write temp. Similarly, using local storage for your temporary tablespaces usually benefits large hash joins, sorts, and aggregations, which involve more reads than writes.
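
To check how your workload currently waits on temporary I/O, you can query the cumulative system-wide wait events. The following sketch reads V$SYSTEM_EVENT for the two temp-related events discussed above; the values are cumulative since instance startup, so compare snapshots taken before and after your workload runs.

-- Cumulative waits on temporary tablespace I/O since instance startup
SELECT event,
       total_waits,
       ROUND(time_waited_micro / 1000000, 2) AS time_waited_sec,
       ROUND(time_waited_micro / NULLIF(total_waits, 0) / 1000, 2) AS avg_wait_ms
FROM v$system_event
WHERE event IN ('direct path read temp', 'direct path write temp');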

We ran a test on an RDS for Oracle m5d.2xlarge database instance using a simple create table statement that involved sorts, joins, and aggregations. The SQL took 59 minutes to run.

We further analyzed the Amazon RDS Performance Insights counters for Amazon RDS for Oracle and observed the following values for sorts:

  • Sorts (memory): 18,629/sec
  • Sorts (disk): 75,432/sec

We can see that the sorts on disk were much higher than the sorts in memory. This indicates heavy use of the temporary tablespace compared to the Program Global Area (PGA).
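
If you want to verify the same ratio directly in the database rather than through Performance Insights counters, you can read the cumulative sort statistics from V$SYSSTAT, as in the following sketch.

-- Cumulative in-memory vs. on-disk sorts since instance startup
SELECT name, value
FROM v$sysstat
WHERE name IN ('sorts (memory)', 'sorts (disk)');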

As shown in the following table, direct path read temp waits are dominant, whereas direct path write temp waits are insignificant compared to the reads from the temporary tablespace. Other foreground wait events have been removed for simplicity.

Foreground Wait Events
Event | Waits | Total Wait Time (Seconds) | Average Wait | % DB Time | Wait Class
direct path read temp | 215,265 | 24,725 | 948.75ms | 41.74 | User I/O
direct path write temp | 968 | 898 | 112.45ms | 7.76 | Concurrency

We ran the test again after creating a temporary tablespace on local SSD storage and changing the default temporary tablespace to this new tablespace. The following table shows the foreground wait events for this run.

Foreground Wait Events
Event | Waits | Total Wait Time (Seconds) | Average Wait | % DB Time | Wait Class
direct path read temp | 9,572 | 15,725 | 25.4ms | 10.49 | User I/O
direct path write temp | 968 | 898 | 30.76ms | 7.76 | User I/O

When we reviewed the foreground wait events for this run, we observed that the average wait time for direct path read temp was drastically reduced.

We also reviewed the SQL we examined earlier and saw that its runtime was reduced by 61%, with the SQL execution plan remaining the same: the create table statement now took 23 minutes instead of 59.

Considerations for using local NVMe SSD storage for Flash Cache

A performance issue in Oracle Database might manifest as a reduction in TPS or as bandwidth saturation during a peak load period. When you use Amazon RDS Performance Insights for your RDS for Oracle database instance to slice DB load by waits, if you observe db file sequential read as the top wait event, look at the read IOPS and average queue length in Performance Insights and the read latency Amazon CloudWatch metric. Higher values of these metrics indicate more blocks being scanned from the EBS volume into the cache. Such workloads are good candidates for Flash Cache. You can then navigate to the top SQL section to identify the SQL statements with the highest average active sessions (AAS) waiting on a particular wait event. Sometimes non-selective indexes and full table scans cause more rows to be fetched into the cache, only to be discarded (filtered) to obtain the result set. This results in unwanted read I/O and increased query response times.

The other area to look at is the dynamic performance view V$DB_CACHE_ADVICE, or the buffer pool advisory section of your AWR report (if you are licensed for the Diagnostics and Tuning packs). You might find recommendations to increase the size of the buffer cache. ADDM reports pulled for the period of analysis could indicate the same.
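
You can also query the advisory view directly. The following sketch lists the estimated physical reads at different candidate cache sizes for the default buffer pool; a size factor of 1 corresponds to the current buffer cache size.

-- Estimated physical reads for candidate buffer cache sizes (DEFAULT pool)
SELECT size_for_estimate AS cache_size_mb,
       size_factor,
       estd_physical_read_factor,
       estd_physical_reads
FROM v$db_cache_advice
WHERE name = 'DEFAULT'
  AND advice_status = 'ON'
ORDER BY size_for_estimate;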

If all the preceding indicators are true, the database buffer cache is unable to efficiently retain the buffers your workload requires more frequently.

Oracle’s algorithms for managing the buffers in the buffer cache tend to segregate buffers into hot and cold areas, depending on the frequency with which each data block is read.

Each time a data block is re-referenced, it moves to the head of the data block chain on the hot side of the data buffer. Sometimes, the buffer cache ages out buffers faster than it should, inadvertently discarding buffers that are frequently accessed (hot buffers). This may be due to unusually high volumes of buffers being loaded into the cache. If the buffer cache is unable to accommodate the incoming buffers, it is forced to discard the hot buffers. Consequently, a query that reads those hot buffers several times per second has to fetch the buffer from the disk into the cache for every call. Imagine this happening millions of times per second! This would bring your database to a grinding halt. Increasing the buffer cache is not always the right solution.

There are two other ways to solve this problem. One is to tune the queries that cause high buffer gets, thereby reducing the load on the buffer cache. This approach might be cumbersome, or you might not be able to change your queries. The other option is to use the instance store to create an additional Flash Cache to hold these hot buffers. That way, the database doesn’t need to scan the blocks from the EBS volumes, which can reduce I/O costs and improve response times.

We ran a test with the Swingbench OrderEntry benchmark, which is a “TPC-C like” benchmark workload, on an RDS for Oracle m5d.2xlarge database instance for 10 minutes. The test was run twice, once before Flash Cache was enabled and again after Flash Cache was enabled.

The following figure shows the workload profiles and runtimes of the online transaction processing (OLTP) queries before Flash Cache was enabled. Here we can see large waits on db file parallel read (in purple) and db file sequential read (in blue).

The following figure shows the workload profiles and runtimes of the OLTP queries after Flash Cache was enabled on the RDS for Oracle instance.

Here we can see “db flash cache single block physical read” and “db flash cache multiblock physical read” as major wait events, replacing sequential and parallel reads.
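
To confirm the same shift outside of Performance Insights, you can check the cumulative flash cache wait events in the database, as in the following sketch.

-- Cumulative waits served by Database Smart Flash Cache since instance startup
SELECT event,
       total_waits,
       ROUND(time_waited_micro / 1000000, 2) AS time_waited_sec
FROM v$system_event
WHERE event IN ('db flash cache single block physical read',
                'db flash cache multiblock physical read');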

We also captured the statistics for the query with the highest elapsed time per run before Flash Cache was enabled, as shown in the following table.

SQL Statement | Runs/Sec | Elapsed Time (Sec)/Run | Rows Processed/Run | Buffer Gets/Run | DbLoadNonCPU(Max)
SELECT TT.ORDER_TOTAL, TT.SALES_REP_ID, TT.ORDER_DATE, CUSTOMERS.CUST_FIRST_NAME, CUSTOMERS.CUST_LAST_NAME … | 1.01 | 107 | 4133.80 | 117341 | 210

The SQL statistics show that the query could run only about once per second, with a runtime of about 107 seconds per run. We can also see around 117,000 buffer gets per run to fetch only about 4,134 rows.

The following table shows statistics for the same query after Flash Cache was enabled.

SQL Statement | Runs/Sec | Elapsed Time (Sec)/Run | Rows Processed/Run | Buffer Gets/Run | DbLoadNonCPU(Max)
SELECT TT.ORDER_TOTAL, TT.SALES_REP_ID, TT.ORDER_DATE, CUSTOMERS.CUST_FIRST_NAME, CUSTOMERS.CUST_LAST_NAME … | 17.98 | 5.5 | 3581.94 | 79519 | 115

We can see that the runtime was reduced from 107 seconds to 5.5 seconds, a 94.79% improvement in query performance. Buffer gets per run dropped by around 38,000 with Flash Cache enabled, and DbLoadNonCPU(Max) was reduced by about 45%.

Monitoring Smart Flash Cache

If you are using Flash Cache and want to know how many blocks of a database object, such as a table or an index, are in the flash cache compared to the buffer cache, you can query the dynamic performance view V$BH. You can count the number of buffers for each object in the main buffer cache and in the flash cache using STATUS values that begin with 'flash' (such as 'flashcur').

SELECT owner || '.' || object_name AS object,
       SUM(CASE WHEN b.status LIKE 'flash%' THEN 1 ELSE 0 END) AS flash_blocks,
       SUM(CASE WHEN b.status LIKE 'flash%' THEN 0 ELSE 1 END) AS cache_blocks,
       COUNT(*) AS total_blocks
FROM v$bh b
JOIN dba_objects ON (objd = object_id)
GROUP BY owner, object_name
ORDER BY 4 DESC;

You should benchmark in a non-production environment before enabling Flash Cache in production. For a quick comparison, you can query the dynamic performance view V$SYSSTAT before and after enabling Flash Cache; it shows the number of blocks added to the cache and the number of hits that were satisfied from the cache.

SELECT name, value
FROM v$sysstat
WHERE name IN ('physical read flash cache hits', 'physical reads',
               'consistent gets', 'db block gets', 'flash cache inserts');

Conclusion

In this post, we discussed how RDS for Oracle instances with NVMe SSD instance storage let you take advantage of fast local storage for temporary tablespaces and Flash Cache. This functionality is a cost-effective way to provide higher throughput for latency-sensitive workloads and is an alternative to using larger RDS for Oracle instances when the workload has higher resource demands. For pricing details, review the RDS for Oracle pricing documentation. You can start using the instance store feature for RDS for Oracle by visiting the Amazon RDS console and spinning up a supported RDS for Oracle instance. For more information, refer to the RDS for Oracle user guide.


About the Authors

Ravi Kiran is a Senior Database Specialist at Amazon Web Services. He works with enterprise customers to provide technical assistance on database operational performance, helping them optimize workloads on AWS using database best practices. He specializes in open source database engines on Amazon RDS and Amazon Aurora as well as Amazon RDS for Oracle and Amazon DynamoDB.

Nipun Ravi is a Database Specialist Solutions Architect with Amazon Web Services. He works with the customers in the Canadian public sector, helping them architect, design, and optimize database workloads on AWS.

Manash Kalita is a Senior Database Specialist Solutions Architect with Amazon Web Services. He works with AWS customers designing customer solutions on database projects, helping them migrate and modernize their existing databases to the AWS Cloud as well as orchestrate large-scale migrations in AWS.