AWS Storage Blog
Akridata accelerates processing of unstructured data with Amazon S3 Express One Zone
Deep learning processes often need to read full datasets, which are usually hundreds of gigabytes in size, before they can perform intelligent data processing. High data retrieval speed and low latency from storage are crucial for enterprises running these performance-critical workloads.
Akridata, an AWS independent software vendor (ISV) partner, helps make artificial intelligence (AI)-assisted unstructured-data exploration a reality by providing large-scale data curation, exploration, and analysis capabilities on unlabeled video and image datasets.
In this post, we describe how Akridata Data Explorer works through an example of how it helps autonomous vehicle creators unlock the true potential of their raw visual datasets. We then discuss how customers storing data in the Amazon S3 Express One Zone storage class, which can improve data access speeds by 10x and reduce request costs by 50% compared to S3 Standard, can reduce processing times by 3.5x when using Akridata Data Explorer.
Akridata Data Explorer: Automatic processing of unlabeled visual data
Akridata Data Explorer is a software as a service (SaaS) solution built entirely on AWS because of AWS's proven operational experience, broad range of native services, and reach to potential customers around the globe. Akridata Data Explorer is a flexible solution that runs on Amazon Elastic Kubernetes Service (Amazon EKS). It provides a no-code environment for end customers to understand large amounts of unstructured image or video data stored in different Amazon S3 storage classes, without manual labeling.
The following image shows the workflow of Akridata Data Explorer:
- Customers first upload visual datasets into Amazon S3 (in this case, the Amazon S3 Express One Zone storage class).
- Customers sign in to the Akridata Data Explorer SaaS service.
- Customers create a data processing pipeline by selecting the desired data model from Amazon Elastic Container Registry (Amazon ECR).
- Akridata Data Explorer starts the deep learning process by reading data from Amazon S3 Express One Zone.
- Once the data catalog is created, it is stored in Amazon Aurora.
- Finally, customers can perform visualization operations such as searching and data analytics against their unstructured visual datasets.
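The upload step in the workflow above can be sketched with boto3. This is a minimal sketch under stated assumptions, not Akridata's implementation: the bucket base name, Availability Zone ID, Region, file path, and helper names are all hypothetical. The one fixed convention is that S3 Express One Zone stores objects in directory buckets, whose names take the form `base-name--az-id--x-s3`.

```python
def directory_bucket_name(base_name: str, az_id: str) -> str:
    """Build an S3 Express One Zone directory bucket name.

    Directory bucket names follow the pattern <base-name>--<az-id>--x-s3.
    """
    return f"{base_name}--{az_id}--x-s3"


def upload_frame(local_path: str, bucket: str, key: str, region: str) -> None:
    """Upload one file to an existing bucket (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the naming helper works without boto3

    s3 = boto3.client("s3", region_name=region)
    s3.upload_file(local_path, bucket, key)


# Hypothetical values, for illustration only.
bucket = directory_bucket_name("av-raw-frames", "usw2-az1")
print(bucket)
```

Calling `upload_frame("frame_0001.jpg", bucket, "drive-01/frame_0001.jpg", "us-west-2")` would then place a single frame in the bucket, assuming the directory bucket already exists in that Region and Availability Zone.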
Akridata Data Explorer with autonomous vehicle data collection
To demonstrate Akridata Data Explorer, let’s use autonomous vehicle data collection as an example. Test vehicles driving in different countries capture hours of video and images every day, producing large amounts of unstructured data. All of these datasets need to be sanitized, cleaned, and tagged.
Through its various ready-to-use pipelines, Akridata Data Explorer can automatically tag or label a dataset using any foundation or standard model. In the following image, you can see the result of automatic tagging by Akridata Data Explorer using the Recognize Anything Model (RAM). Infrastructure tags such as building, parking lot, and road, and vehicle tags such as car, jeep, suv, and van, are all applied automatically.
The following image demonstrates how Akridata Data Explorer can transform the entire dataset into clusters, grouping data based on the similarity of images. This visual approach offers a quick and intuitive way to grasp the big picture within large datasets, while also pinpointing outliers.
The following image illustrates a user’s ability to search for visually similar images. For static image operations, Akridata Data Explorer can search up to 25 million images in a single query. This process results in heavy read operations and relies on storage subsystem performance to complete the workflow in the shortest amount of time. Amazon S3 storage scalability and performance play a critical role in the user experience.
On the left are three examples of images with green bounding boxes. Each example shows crosswalks and individuals holding umbrellas. In contrast, there’s a negative case shown with a red bounding box that lacks either of these features.
To find similar examples, select the thumbs up icon for the images with crosswalks and individuals holding umbrellas and then choose Quick Search. The results, shown on the right, show similar areas of interest in diverse scenarios. This feature simplifies the discovery of intriguing patterns and scenarios.
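The post does not describe how Quick Search is implemented. As a rough illustration of the general idea of ranking images from positive and negative examples, the following sketch scores each candidate by its average cosine similarity to the liked examples minus its average similarity to the disliked ones. The two-dimensional vectors, frame names, and the function `rank_by_feedback` are hypothetical; a real system would use high-dimensional embeddings produced by a vision model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_feedback(candidates, positives, negatives):
    """Return candidate names sorted from most to least relevant.

    Score = mean similarity to positives - mean similarity to negatives.
    """
    def score(vec):
        pos = sum(cosine(vec, p) for p in positives) / len(positives)
        neg = 0.0
        if negatives:
            neg = sum(cosine(vec, n) for n in negatives) / len(negatives)
        return pos - neg

    return sorted(candidates, key=lambda name: score(candidates[name]), reverse=True)


# Toy embeddings (hypothetical): first axis ~ "crosswalk with umbrella".
candidates = {
    "frame_a": [0.9, 0.1],
    "frame_b": [0.1, 0.9],
    "frame_c": [0.7, 0.3],
}
liked = [[1.0, 0.0]]      # thumbs-up example
disliked = [[0.0, 1.0]]   # negative example

ranking = rank_by_feedback(candidates, liked, disliked)
print(ranking)  # ['frame_a', 'frame_c', 'frame_b']
```

Scoring every candidate against the examples means each query touches the whole candidate set, which is one reason read performance matters for searches of this kind.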
Akridata Data Explorer can also search unlabeled video or image datasets using natural language queries. In the following image, Akridata Data Explorer presents the outcome of a search for “crosswalk with person holding an umbrella.” With the power of text-based search, insights from data are just a few steps away.
Accelerating unstructured data exploration with Amazon S3 Express One Zone
When using Akridata Data Explorer, the storage subsystem directly affects how long deep learning, discovery, tagging, and searching take. Amazon S3 Express One Zone is the fastest cloud object storage, combining scalability with single-digit millisecond latency for data analytics use cases, so customers can experience our best performance yet.
Compared with Amazon S3 Standard, storing data in Amazon S3 Express One Zone reduces Akridata Data Explorer pipeline execution and processing times by an average of 3.5x. This results in much faster data ingestion, preparation for visual data exploration, and discovery. When customers view the original high-resolution image or video recording, the low latency and high throughput of Amazon S3 Express One Zone can significantly reduce data preparation time and provide a much better user experience. Customers can now analyze more data in less time while reducing operating cost, which translates into higher productivity.
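To make the 3.5x figure concrete, here is a small worked calculation. The 70-minute baseline is a hypothetical number chosen for illustration, not a measurement from this post, and 3.5x is an average, so individual pipelines may see more or less improvement.

```python
def time_after_speedup(baseline_minutes: float, speedup: float = 3.5) -> float:
    """Estimate processing time after an N-times reduction (baseline / N)."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_minutes / speedup


# Hypothetical pipeline that takes 70 minutes on S3 Standard.
baseline = 70.0
estimated = time_after_speedup(baseline)
saved = baseline - estimated
print(f"{estimated:.0f} min estimated, {saved:.0f} min saved")
```

Under these assumptions, the pipeline would finish in about 20 minutes instead of 70, a saving of roughly 50 minutes per run.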
Conclusion
In this post, we showed how Akridata Data Explorer solves data classification problems by applying automatic tagging to datasets in autonomous vehicle data collection and by searching images for specific content. By storing data on Amazon S3 Express One Zone, Akridata Data Explorer customers can reduce the time it takes for data preparation by an average of 3.5x.
To learn more, visit Akridata and Amazon S3 Express One Zone or contact your AWS account manager. Akridata Data Explorer is available in AWS Marketplace.