Theragen Bio Reduces Turnaround Time for Genomic Analysis by 90% Using Illumina Connected Analytics on AWS
Executive Summary
Theragen Bio worked with Illumina, an AWS Partner, to increase the efficiency of its genomic sequencing and analysis operations. It implemented Illumina Connected Analytics, hosted on Amazon Web Services (AWS), to reduce analysis time from 40 hours to 4 hours per researcher and to scale laboratory operations with DRAGEN pipelines. Theragen Bio uses Connected Analytics, which runs on Amazon S3, Amazon EC2, and AWS Batch, to streamline analysis of multiomic samples and scale its sequencing to a petabyte of data.
Facing Scalability and Performance Challenges
Theragen Bio is a global genome service provider and artificial intelligence–based biopharmaceutical company. The business specializes in developing personalized anti-cancer drugs that help improve the survival rate of cancer patients.
For more than 10 years, Theragen Bio has been working with AWS Partner Illumina, a developer, manufacturer, and marketer of life science tools and systems for large-scale genetic analysis. Illumina offers a full range of software, instruments, and services that help its customers analyze genomes, make rapid advancements in life sciences research, and improve human health. Theragen Bio relies heavily on Illumina DRAGEN (Dynamic Read Analysis for GENomics) secondary analysis to analyze next-generation sequencing (NGS) multiomic data for whole genomes, exomes, and more. It also uses DRAGEN’s comprehensive suite of workflows with high levels of accuracy for variant calling to scale laboratory operations and efficiently deliver results to its customers.
However, as the business grew, Theragen Bio faced scalability and performance challenges with the on-premises infrastructure that supported its Illumina DRAGEN workflows. Chan Hee Park, Ph.D., head of the NGS platform development department at Theragen Bio, says, “We had issues storing and analyzing large amounts of DNA data because of the storage and compute capacity limitations of our on-premises servers. As a result, a researcher would typically spend 40 hours performing data analysis and transfer, which was too long.” Theragen Bio also needed the ability to scale on demand. “We knew we had to move from an on-premises environment to the cloud to meet our needs,” Park says.
“Even though we’re still in the testing phase, we know the Illumina Connected Analytics platform on AWS will help support our DRAGEN genome analysis needs into the future.”
Chan Hee Park Ph.D.
Head of NGS Platform Development Department, Theragen Bio
Running Cloud-Based Data Workloads with Illumina Connected Analytics on AWS
In late 2022, Theragen Bio began testing Illumina Connected Analytics, a bioinformatics platform running in the AWS Asia Pacific (Seoul) Region. Connected Analytics drives scientific insights by analyzing, processing, and storing genomic data hosted on Amazon Simple Storage Service (Amazon S3) and running on Amazon EC2 F1 instances, which use field-programmable gate arrays (FPGAs) to accelerate analysis. Using Connected Analytics, researchers can operationalize their bioinformatics workflows, build and customize pipelines, and seamlessly integrate data with sequencing instruments within a single environment.
After using Connected Analytics to test various bioinformatics workflows including whole-genome sequencing, whole-exome sequencing, and RNA-Seq, Theragen Bio decided to implement the solution in production to streamline data analysis using DRAGEN pipelines. The solution relies on AWS Batch to manage and run hundreds of thousands of compute workloads on AWS. “We chose to use DRAGEN pipelines on Illumina Connected Analytics because it offered faster analysis times and the ability to perform multiple analyses on the same data,” says Park.
Now, by running DRAGEN Germline Pipelines on Connected Analytics, Theragen Bio researchers can access and share workflows, which simplifies collaboration between labs. The company can also create and customize data pipelines using an intuitive user interface and flexible APIs. In addition, Theragen Bio is furthering its mission of developing breakthrough anti-cancer drugs by achieving faster analysis times and heightened scalability for DRAGEN Germline and multiomic pipelines on Connected Analytics.
Reducing Data Analysis and Transfer Time By 10x Per Researcher
Theragen Bio moved its DRAGEN pipelines to Connected Analytics on AWS. “Illumina Connected Analytics on AWS has helped us reduce the run time for our next-generation sequencing of genomes,” Park says. “We’ve seen significant improvements in analysis speed, reducing the time required for data analysis and transfer of multiple deep sequencing samples from 40 hours to 4 hours per researcher. This translates to reduced turnaround time and increased analysis capacity, which leads to greater trust with our customers.” As an example of increased analysis speed and capacity, Theragen Bio can now analyze 400 crop genomes simultaneously, which would have been computationally intensive in an on-premises environment.
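The 90% reduction in the headline and the 10x figure in this section's title describe the same change from 40 hours to 4 hours. The short Python sketch below works through that arithmetic. The function name and structure are illustrative assumptions, not part of any Illumina or AWS tooling; only the 40-hour and 4-hour figures come from the case study.

```python
def turnaround_improvement(before_hours, after_hours):
    """Return (percent reduction, speedup factor) for a before/after duration.

    Hypothetical helper for illustration only.
    """
    if before_hours <= 0 or after_hours <= 0:
        raise ValueError("durations must be positive")
    # Share of the original time that was eliminated, as a percentage.
    percent_reduction = (before_hours - after_hours) / before_hours * 100
    # How many times faster the new process is than the old one.
    speedup = before_hours / after_hours
    return percent_reduction, speedup


# Figures quoted in the case study: 40 hours before, 4 hours after.
reduction, speedup = turnaround_improvement(40, 4)
print(f"{reduction:.0f}% reduction, {speedup:.0f}x faster")
```

A 10x speedup always corresponds to a 90% reduction, because the remaining time is one-tenth of the original.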
The company is also benefiting from the automation capabilities of Illumina Connected Analytics on AWS. “We can customize pipelines and easily monitor and control the analysis process because of automation,” Park says. Furthermore, the platform has given Theragen Bio a competitive advantage: it can provide educational training in addition to analysis functionality for research teams. Says Park, “Even though we’re still in the testing phase, we know the Illumina Connected Analytics platform on AWS will help support our DRAGEN genome analysis needs into the future.”
About Theragen Bio
Theragen Bio, based in South Korea, offers a software platform for next-generation sequencing and analysis. The company provides genome data research to more than 700 domestic medical and research institutes in more than 40 countries.
Benefits
- Reduces genome analysis and transfer time from 40 to 4 hours per researcher.
- Facilitates simultaneous analysis of up to 400 crop genomes.
- Customizes pipelines and easily monitors and controls the analysis process through automation.
About AWS Partner Illumina
Illumina is an AWS Partner and a developer, manufacturer, and marketer of life science tools and systems for large-scale genetics analysis. Founded in 1998, Illumina offers a full range of software, instruments, and services that help its customers analyze genomes, make rapid advancements in life sciences research, and improve human health. Illumina’s customers use its genetic-sequencing solutions to accelerate therapeutic and pharmaceutical insights.
Published April 2024