AWS Storage Blog
Reduce burst render times with Amazon File Cache
In the fast-paced world of movie-making, time is money. Modern films and TV shows can have hundreds, or thousands, of Visual Effects (VFX) shots that need to be created and finalized on tight deadlines. “Rendering”, the process of combining geometry, textures, lighting, and other elements into final high-resolution images, requires massive computing power. Studios often run out of compute capacity on-premises and want to take advantage of the AWS cloud’s scalability to burst their distributed rendering capacity. However, accessing large amounts of data in on-premises storage across a bandwidth-limited, high-latency internet connection can limit the performance of these cloud renders.
Using Amazon File Cache to cache files local to the compute can improve render times and reduce production costs compared with traditional access methods. Bursting to the cloud with Amazon File Cache benefits the content production workflow by saving time in both rendering and transcoding (the digital-to-digital conversion of encoded video or audio from one format to another). With File Cache, you can bring the power and scale of AWS compute to your on-premises workloads without the time or expense of making a copy of the data in the cloud.
In this blog, we demonstrate a media rendering use case to show you how we achieved a 10x reduction in burst render time. We will start with the definition of Amazon File Cache and the challenges that content creators face with the increase in rendering projects. We will then take you through a test environment and demonstrate how File Cache delivered a 10x improvement. In closing, we will summarize the benefits and provide assets that will help you get started using Amazon File Cache to accelerate your own workloads.
What is Amazon File Cache?
Amazon File Cache provides a high-speed cache on AWS that makes it easier to process file data, regardless of where it’s stored. Amazon File Cache serves as temporary, high-performance storage for data on premises or on AWS. The service allows you to make dispersed datasets available to file-based applications on AWS with a unified view and high speeds. Amazon File Cache enables you to accelerate compute-intensive workloads including VFX rendering, high performance computing, and AI/ML model training.
VFX rendering challenges
When you’re working with Visual Effects and Media workloads, you’ll often use distributed compute clusters known as “render farms” to parallelize the creation of final content. Software such as AWS Thinkbox Deadline helps you manage the distribution of jobs to compute nodes in your environment. Many studios run their render workloads completely in the cloud, while others have on-premises render farms and take advantage of cloud bursting when they need to scale out their existing compute.
When bursting to the cloud, all the compute nodes must still access the source content which is often stored near the artists in on-premises storage. Many render nodes all reaching back to the on-premises storage simultaneously over a WAN link with limited bandwidth can cause major slowdowns. The high latency of an internet transfer creates further slowdowns. These inefficiencies combined with the fact that you’re copying the same files to every compute node can significantly slow down the render times.
To help mitigate these challenges, you can copy the entire dataset for your production to the cloud so the content is local to the compute. Over a 10 Gbps AWS Direct Connect link, a 10 TB dataset takes more than two hours to transfer, and you end up copying and storing more files than are needed for processing. Alternatively, you could manually find the relevant files and move them to the cloud, which is an inefficient use of a VFX artist’s time.
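The two-hour figure follows directly from the link speed. The short sketch below works through the arithmetic, assuming decimal units (1 TB = 10^12 bytes, 1 Gbps = 10^9 bits per second). The `efficiency` parameter and its 0.8 value are illustrative assumptions for protocol overhead, not measured numbers.

```python
def transfer_hours(dataset_bytes: float, link_bits_per_second: float,
                   efficiency: float = 1.0) -> float:
    """Return the hours needed to move a dataset over a link.

    efficiency is the fraction of the nominal link rate actually achieved
    (1.0 means a perfectly utilized link, which is a best case).
    """
    seconds = dataset_bytes * 8 / (link_bits_per_second * efficiency)
    return seconds / 3600


TB = 10**12    # decimal terabyte, in bytes
GBPS = 10**9   # gigabit per second, in bits per second

# Best case: the full 10 Gbps is available for the whole transfer.
ideal = transfer_hours(10 * TB, 10 * GBPS)

# Illustrative only: assume 80% of the nominal rate is usable.
with_overhead = transfer_hours(10 * TB, 10 * GBPS, efficiency=0.8)

print(f"ideal: {ideal:.2f} h, with assumed overhead: {with_overhead:.2f} h")
```

Even in the best case the copy takes a little over two hours, and any overhead on the link only pushes that number higher.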
Amazon File Cache enables you to cache files on-demand to a file system that is local to the cloud compute, helping you improve your overall performance. To demonstrate this, we simulate an on-premises studio with an NFS server to store source content. We will render several complex animated scenes, bursting the render to AWS in a different region. From this we will measure the amount of time each render takes with and without Amazon File Cache.
The test environment
In this test, we used Linux-based EC2 m5.8xlarge instances, managed by Thinkbox Deadline. We deployed 100 render nodes in the US-West-2 (Oregon) region accessing the source files on an NFS file system in US-East-1 (N. Virginia). The project to be rendered is a scene created in Blender that depends on roughly 52 GB of referenced texture images. The latency between sites, as measured by ping, is 62.36 ms, and the available bandwidth is 1600 MB/s.
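To see why latency matters even when bandwidth is plentiful, it can help to compute the bandwidth-delay product for this link: the amount of data that must be in flight to keep the connection full. The sketch below uses the measured 62.36 ms latency (treated here as the round-trip time) and 1600 MB/s bandwidth (assumed to mean decimal megabytes). The 4 MiB window size is a hypothetical value chosen only for illustration, not a setting from the test.

```python
RTT_SECONDS = 0.06236                        # ping latency between the two regions
BANDWIDTH_BYTES_PER_SECOND = 1600 * 10**6    # 1600 MB/s, assuming decimal megabytes


def bandwidth_delay_product(bandwidth: float, rtt: float) -> float:
    """Bytes that must be in flight to keep the link fully utilized."""
    return bandwidth * rtt


def window_limited_throughput(window_bytes: float, rtt: float) -> float:
    """Upper bound on one stream's throughput when at most window_bytes
    can be outstanding per round trip."""
    return window_bytes / rtt


bdp = bandwidth_delay_product(BANDWIDTH_BYTES_PER_SECOND, RTT_SECONDS)

# Hypothetical window size, for illustration only.
example_window = 4 * 1024**2
per_stream = window_limited_throughput(example_window, RTT_SECONDS)

print(f"bandwidth-delay product: {bdp / 10**6:.1f} MB")
print(f"one stream with a 4 MiB window: {per_stream / 10**6:.1f} MB/s")
```

Under these assumptions, roughly 100 MB must be in flight to saturate the link, so a reader that waits on each round trip uses only a small fraction of the available bandwidth.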
Three different jobs were tested: One frame rendered by one render node, 10 frames rendered across 10 render nodes, and 100 frames rendered across 100 render nodes.
The first set of tests was performed with the render jobs referencing files directly on the NFS file system in US-East-1 (N. Virginia). For the second set of tests, we deployed a 7.2 TB Amazon File Cache in US-West-2 (Oregon), near the render farm, to cache content from the NFS file system in US-East-1 (N. Virginia). We then modified the render farm settings to point only to the cache for source files.
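Pointing a render farm at the cache usually comes down to rewriting the path prefix that scene files use for their assets. The following is a minimal sketch of that idea, not the configuration used in this test: the mount points `/mnt/studio_nfs` and `/mnt/file_cache` and the function `remap_to_cache` are hypothetical names chosen for illustration. In practice, this mapping would typically live in the render manager's own path-mapping settings.

```python
from pathlib import PurePosixPath

# Hypothetical mount points; substitute the paths used in your environment.
NFS_MOUNT = PurePosixPath("/mnt/studio_nfs")
CACHE_MOUNT = PurePosixPath("/mnt/file_cache")


def remap_to_cache(path: str,
                   source_root: PurePosixPath = NFS_MOUNT,
                   cache_root: PurePosixPath = CACHE_MOUNT) -> str:
    """Rewrite a path under source_root so that it points under cache_root.

    Paths outside source_root are returned unchanged.
    """
    try:
        relative = PurePosixPath(path).relative_to(source_root)
    except ValueError:
        # Not under the NFS mount, so there is nothing to remap.
        return path
    return str(cache_root / relative)


print(remap_to_cache("/mnt/studio_nfs/show/textures/wood.exr"))
print(remap_to_cache("/tmp/scratch/frame_0001.exr"))
```

Because only the prefix changes, the scene's directory layout stays the same and the render nodes read the same relative paths from the cache.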
Results
The following table shows the results of the tests performed:
Submitting these jobs to the render farm demonstrated that as you add more instances accessing the storage, you create a bottleneck that affects performance and completion times. With 100 render nodes each rendering one frame, the processing took over three hours to complete because the 52 GB of source content was copied from the NFS file system to each node individually.
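A rough way to see the bottleneck is to count how much data crosses the WAN in each configuration. The sketch below assumes that every node reads the full 52 GB when accessing NFS directly, and that a cache fetches each file from the origin only once. The times it computes are bandwidth-only lower bounds at 1600 MB/s; they ignore latency and render time, so they are not predictions of the measured results.

```python
SOURCE_GB = 52            # referenced texture data for the scene
NODES = 100               # render nodes in the largest test
WAN_GB_PER_SECOND = 1.6   # 1600 MB/s of available bandwidth


def wan_gigabytes(nodes: int, dataset_gb: float, cached: bool) -> float:
    """Data pulled across the WAN: once with a shared cache, once per node without."""
    return dataset_gb if cached else nodes * dataset_gb


def min_transfer_seconds(gigabytes: float, link_gb_per_second: float) -> float:
    """Bandwidth-only lower bound on transfer time; latency is ignored."""
    return gigabytes / link_gb_per_second


direct_gb = wan_gigabytes(NODES, SOURCE_GB, cached=False)
cached_gb = wan_gigabytes(NODES, SOURCE_GB, cached=True)

direct_floor = min_transfer_seconds(direct_gb, WAN_GB_PER_SECOND)
cached_floor = min_transfer_seconds(cached_gb, WAN_GB_PER_SECOND)

print(f"direct NFS: {direct_gb:.0f} GB over the WAN, at least {direct_floor / 60:.0f} min")
print(f"with cache: {cached_gb:.0f} GB over the WAN, at least {cached_floor:.1f} s")
```

Under these assumptions, the direct configuration moves 100 times as much data across the WAN. The measured run took well beyond the bandwidth-only floor, which is consistent with latency adding to the slowdown.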
Amazon File Cache reduced this completion time by over 90%, eliminating the need to copy the source content from the NFS file system more than once. After accounting for Amazon EC2 and File Cache costs in the 100-node render workload, we found that Amazon File Cache delivered 91% cost savings compared with running without File Cache, due primarily to the decreased processing time required.
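The link between shorter render times and lower cost can be sketched with a simple model: compute cost scales with nodes multiplied by hours, and the cache adds its own charge. Every number below is a placeholder. The hourly rate, the cache charge, and the rounded 3.0 and 0.3 hour run times are hypothetical values for illustration only, and this sketch does not reproduce the 91% figure from the test.

```python
def render_cost(nodes: int, hours: float, node_rate_per_hour: float,
                cache_cost: float = 0.0) -> float:
    """Total cost of a render: compute time plus any cache charge."""
    return nodes * hours * node_rate_per_hour + cache_cost


def savings_fraction(baseline_cost: float, new_cost: float) -> float:
    """Fraction of the baseline cost that is saved."""
    return 1 - new_cost / baseline_cost


# Hypothetical placeholder values, not actual AWS prices or measured times.
NODE_RATE = 1.0    # cost units per node-hour
CACHE_COST = 5.0   # cost units for the cache over the duration of the job

baseline = render_cost(nodes=100, hours=3.0, node_rate_per_hour=NODE_RATE)
with_cache = render_cost(nodes=100, hours=0.3, node_rate_per_hour=NODE_RATE,
                         cache_cost=CACHE_COST)

savings = savings_fraction(baseline, with_cache)
print(f"illustrative savings: {savings:.0%}")
```

The model shows why the savings track the time reduction so closely: when compute hours dominate the bill, a 10x shorter run removes most of the cost, and the cache charge offsets only a small part of that.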
Conclusion
Burst rendering to the cloud provides a great benefit to the VFX and content creation communities by enabling them to take advantage of the cloud’s compute power and scale. However, physical limitations such as high latency and constrained bandwidth make accessing large amounts of data from on-premises storage locations inefficient, limiting the performance of these cloud renders and preventing you from realizing a 1:1 gain from additional compute instances. Amazon File Cache bridges this gap and efficiently brings high performance storage to your compute.
In this blog, we demonstrated a media rendering use case to show you how Amazon File Cache can deliver faster render completion times. We defined Amazon File Cache and described the challenges that content creators face with the increase in rendering projects and the need to reduce time to completion to meet aggressive production schedules. We described a test environment and demonstrated how File Cache delivered a 10x reduction in burst render time. We then concluded with how Amazon File Cache brings Amazon compute and scale to your on-premises media production projects.
To learn more about Amazon File Cache, please see the following resources.