AWS Storage Blog

Fundrise uses Amazon S3 Express One Zone to accelerate investment data processing

Fundrise is a financial technology company that brings alternative investments directly to individual investors. With more than 2 million users, Fundrise is one of the leading platforms of its kind in the United States. The challenge of providing a smooth, secure, and transparent experience for millions of users is largely unprecedented in the alternative investment industry and requires innovative technology.

A key component of the Fundrise platform is its time-series Quantitative Reporting System, which is responsible for supplying investment performance data to all Fundrise users. Fundrise determined that it was critical for the datastore powering this system to maintain sub-20 millisecond read latency with high availability to ensure a seamless end user experience. By using the high performance Amazon S3 Express One Zone storage class and S3 Batch Operations, Fundrise reduced its system’s total data processing time by 33%, achieved an end-to-end GET latency of 5 milliseconds per request, which was a 95% improvement from its prior database latency, and reduced the total cost of ownership (TCO) of its solution by 25%.

In this blog post, we describe how using S3 Express One Zone and S3 Batch Operations provided Fundrise’s Quantitative Reporting team with a scalable solution for virtually unlimited data ingestion capacity with low latency. First, we look at Fundrise’s evolving requirements for its Quantitative Reporting System. Then, we discuss why Fundrise’s prior database architecture stopped being a viable solution. Finally, we cover the transition to S3 Express One Zone and the improvements it delivered for ingestion and indexing, read latency, and costs. Now, Fundrise has a cost-effective, reliable nightly ingestion system and their users have fast access to the data in their Quantitative Reporting System.

Assessing system requirements

  • Data volume: The Quantitative Reporting System handles a majority of Fundrise’s API traffic and supports numerous critical applications. The underlying dataset grows across three dimensions with the company’s growth: investors, their shareholdings, and the amount of time they have held each holding, resulting in an O(n^3) data volume problem. This growth translates to hundreds of billions of data points.
  • Data ingestion and indexing speed: Fundrise’s platform operates with daily data granularity. Each day, new shares are acquired, the shares are priced, dividends are earned, and return on investment is calculated. These events result in changes to the underlying dataset, which must be re-ingested and indexed within a fixed nightly window between close of business and midnight.
  • Data read latency: Investment performance data is a critical dependency for many applications throughout the Fundrise platform. As a result, the Quantitative Reporting System must maintain an average read latency below 20 milliseconds to avoid cascading failures of other critical systems. Slow or inaccurate responses have compounding effects across the platform and can disrupt the end user experience.
  • Data availability: Critical business information must be available and resilient so that customers have a seamless experience. Data must reside in multiple places and be consistently accessible.
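
To make the cubic growth in the data volume requirement concrete, here is a minimal sketch. The function name and every number in it are illustrative placeholders, not Fundrise figures; it only assumes that one data point exists per investor, per holding, per day held.

```python
def data_points(investors, holdings_per_investor, days_held):
    """Count data points, assuming one per investor, per holding, per day held."""
    return investors * holdings_per_investor * days_held


# Hypothetical baseline: all three numbers are placeholders for illustration.
baseline = data_points(1_000, 10, 365)

# Double every dimension at once, as might happen while a platform grows.
doubled = data_points(2_000, 20, 730)

# Doubling three independent dimensions multiplies the volume by 2 * 2 * 2.
growth_factor = doubled // baseline
print(baseline, doubled, growth_factor)
```

Because the three dimensions multiply, doubling each of them increases the dataset eightfold rather than twofold, which is why a datastore sized for early growth can fall behind quickly.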

Initial database replication and network I/O limits

Fundrise initially leveraged a relational database as the backing datastore, as it allowed the company to meet its ingestion and read latency requirements during its early stages. However, as the company grew and the underlying dataset expanded, its initial architecture of importing data from an S3 general purpose bucket into the database became increasingly challenged.

Original architecture diagram using a relational database to deliver the data to Fundrise's Quantitative Reporting System

Fundrise began to overload the available network I/O with its nightly ingestion, which caused replication lag of its read replicas to spike. This in turn caused temporary outages of the Quantitative Reporting System and its downstream dependencies. To solve this temporarily, Fundrise manually slowed down data ingestion and ran an over-provisioned cluster in order to support its nightly process. However, this was not scalable and increased costs, so Fundrise determined it wasn’t a sustainable solution.

Fundrise began to look for a storage solution that could scale with the company’s growing volume of data and still achieve the sub-20 millisecond read latency critical for preserving end-user experience.

Transition to Amazon S3 Express One Zone

The announcement of S3 Express One Zone at re:Invent 2023 came at the perfect time for Fundrise. S3 Express One Zone is a high-performance, single-Availability Zone (AZ) storage class purpose-built to deliver consistent single-digit millisecond first-byte latency for latency-sensitive applications. S3 Express One Zone delivers data access speeds up to 10x faster and request costs up to 50% lower than S3 Standard. Fundrise sought to implement a more tailored solution that combined the benefits of S3’s scalability with custom mechanisms via S3 Batch Operations to achieve the low latency its business required. It was the first solution that met Fundrise’s growing demands for fast object retrieval and unbounded serverless data ingestion and, as a bonus, it reduced the TCO of the system.

New architecture diagram using S3 Express One Zone in multiple Availability Zones to deliver low latency data access for the Quantitative Reporting System

The new architecture allows Fundrise to take data from an S3 general purpose bucket and use S3 Batch Operations with AWS Lambda to export objects into two S3 directory buckets in separate Availability Zones.
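
The post does not show Fundrise's Lambda function, so the following is only a sketch of how a function invoked by S3 Batch Operations could copy each object into two directory buckets. The bucket names, `make_handler`, and the choice of `copy_object` are assumptions for illustration; the event and response fields follow the documented S3 Batch Operations invocation format. The S3 client is passed in so the logic can be exercised without AWS access; in Lambda it would be a boto3 S3 client.

```python
from urllib.parse import unquote_plus

# Hypothetical directory bucket names, one per Availability Zone.
DEST_BUCKETS = [
    "reporting-primary--use1-az4--x-s3",
    "reporting-replica--use1-az5--x-s3",
]


def make_handler(s3_client, dest_buckets=DEST_BUCKETS):
    """Build a Lambda handler that copies each task's object to every destination."""

    def handler(event, context):
        results = []
        for task in event["tasks"]:
            # Batch Operations URL-encodes object keys in the invocation payload.
            key = unquote_plus(task["s3Key"])
            # Schema 2.0 passes a bucket name; schema 1.0 passes a bucket ARN.
            source = task.get("s3Bucket") or task["s3BucketArn"].split(":::")[-1]
            try:
                for dest in dest_buckets:
                    s3_client.copy_object(
                        Bucket=dest,
                        Key=key,
                        CopySource={"Bucket": source, "Key": key},
                    )
                code, message = "Succeeded", f"copied to {len(dest_buckets)} buckets"
            except Exception as exc:  # deliberately broad in this sketch
                code, message = "PermanentFailure", str(exc)
            results.append(
                {"taskId": task["taskId"], "resultCode": code, "resultString": message}
            )
        return {
            "invocationSchemaVersion": event["invocationSchemaVersion"],
            "treatMissingKeysAs": "PermanentFailure",
            "invocationId": event["invocationId"],
            "results": results,
        }

    return handler
```

Reporting a task as succeeded only after both copies complete keeps the two buckets in step: a failure on either side shows up in the job's completion report instead of silently leaving the replica stale.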

Improvements made through S3 Express One Zone and S3 Batch Operations

  • Ingestion and indexing: Fundrise replaced copying data from an S3 general purpose bucket to a relational database with an S3 Batch Operations job that copies objects from an S3 general purpose bucket to S3 directory buckets. S3 directory buckets only allow objects in the S3 Express One Zone storage class and are vital to deliver fast access speeds for users. Since the system’s data access patterns only consisted of unique key lookups, Fundrise began partitioning files by these unique keys, allowing its nightly processing job to remove the indexing of the new data. This reduced total processing time by about 33%.
  • Read latency: With S3 Express One Zone, Fundrise is able to issue GetObject calls with consistent 5 millisecond read latency. The biggest gains here were made on the largest objects, where S3 Express One Zone’s 5 millisecond latency outperformed the 100 millisecond latency Fundrise was able to get out of its relational database system.
  • Achieving high availability: Fundrise has implemented a multi-Availability Zone architecture. At the core of this setup is a primary S3 directory bucket, with a secondary replica bucket configured in a separate Availability Zone. Data is populated to both buckets via the nightly S3 Batch Operations job which utilizes an AWS Lambda function. This configuration allows Fundrise to automatically failover to the secondary bucket should any issues impact the primary. By adding a second bucket within S3 Express One Zone, Fundrise has built a resilient data storage solution that ensures the accessibility of its critical business information.
  • Improving costs: Another benefit of moving to S3 Express One Zone was that it allowed Fundrise to save on costs. Utilizing a database required provisioning an entire compute instance and carried along with it features and associated costs they didn’t need. Instead, storing the data in S3 directory buckets while also reading from them allowed Fundrise to reduce its spend by 25%.
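
The read path described in the bullets above (a unique-key lookup served from the primary bucket, with failover to the secondary) might look like the following sketch. The key layout, function names, and error handling are assumptions, not Fundrise's implementation. The client is passed in as a parameter and is expected to behave like a boto3 S3 client, whose `get_object` returns a response with a readable `Body`.

```python
def object_key(investor_id):
    """Hypothetical key layout: one object per unique lookup key."""
    return f"performance/{investor_id}.json"


def get_with_failover(s3_client, buckets, key):
    """Try each bucket in order and return the first object body that is read."""
    last_error = None
    for bucket in buckets:
        try:
            response = s3_client.get_object(Bucket=bucket, Key=key)
            return response["Body"].read()
        except Exception as exc:
            # Broad on purpose in this sketch. Production code would likely
            # catch botocore's ClientError and connection errors, and might
            # treat a missing key differently from an unreachable bucket.
            last_error = exc
    raise RuntimeError(f"all buckets failed for {key!r}") from last_error
```

Since every lookup resolves directly to an object key, no secondary index is consulted on the read path, and the list of buckets is the only thing that changes when a second Availability Zone is added.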

Conclusion

Prior to S3 Express One Zone, Fundrise’s growing data size made it difficult for their relational database to keep up with their increased data ingestion needs while maintaining low read latencies. S3 Express One Zone was released at the perfect time for Fundrise and solved many of the data ingestion and latency concerns they had while also reducing their costs.

By using Amazon S3 Express One Zone, Fundrise is able to provide its Quantitative Reporting System with a scalable, efficient, and cost-effective solution to keep pace with its business growth and maintain the experience its customers expect. Fundrise achieved a p50 sub-5 millisecond read latency, a 95% improvement from the previous database latency. It also reduced its total data processing time by 33% and total cost of ownership by 25%. These improvements enabled a reliable nightly ingestion process that is able to handle the growing amounts of data required by Fundrise’s customers. The combination of S3 Express One Zone and S3 Batch Operations was key to streamlining Fundrise’s data ingestion and indexing.

If you face challenges around rapidly growing data volumes and low read latency requirements, we encourage you to evaluate Amazon S3 Express One Zone as a potential solution to help you achieve the performance and scalability your applications need.

Louie Tambellini

Louie Tambellini is the Vice President of Platform Engineering at Fundrise. He oversees Fundrise’s AWS Organization, Performance Reporting, and Data Engineering functions. He is focused on implementing scalable engineering and data store solutions to support Fundrise’s investor growth.

Matt Krauser

Matt Krauser is the Lead Engineer of the Performance Reporting team at Fundrise. His team is responsible for maintaining real time access to investors' performance data throughout Fundrise's growth.

Sam Farber

Sam Farber is a Solutions Architect at AWS working with FinTech companies. His role involves coming up with practical solutions to problems that Financial Services companies face. He is a former Software Engineer and Technical Trainer with hobbies that include snowboarding, golf, and traveling.

Karthik Akula

Karthik Akula is a Sr. Technical Account Manager with Amazon Web Services based in Herndon, Virginia. He helps customers maximize technological investments by integrating advanced solutions with existing business processes. Karthik drives client success and ensures optimal use of technologies. He reduces complexity and enhances efficiency for customers, delivering tangible business value.