AWS Partner Network (APN) Blog
Building a Scalable DICOM Ingestion Pipeline for AWS HealthImaging with CitiusTech
By Aditya Kanekar, Sr. Solution Architect, Med Tech – CitiusTech
By Adam Kielski, Sr. Partner Solutions Architect, HCLS – AWS
AWS HealthImaging is a new HIPAA-eligible service designed for healthcare and life sciences organizations and their software partners. It enables the storage, analysis, and sharing of large-scale medical imaging data.
AWS HealthImaging reduces costs up to 40% by employing advanced compression and storing a single copy of each image in the cloud. By handling the infrastructure management for imaging workflows, AWS HealthImaging allows healthcare providers to prioritize delivering quality patient care.
Many healthcare providers and clinics are eager to transition their medical imaging workloads to the cloud to benefit from improved accessibility, scalability, and cost efficiency.
However, this migration poses a critical challenge: building a secure, robust, and scalable DICOM (Digital Imaging and Communications in Medicine) ingestion pipeline. This pipeline must transfer incoming image data securely and in its entirety, while verifying that the imported data complies with the DICOM standard and is free of malware.
CitiusTech is an AWS Specialization Partner and AWS Marketplace Seller with the Healthcare Consulting Competency that has developed a DICOM ingestion solution by leveraging AWS HealthImaging and other AWS services. It aims to assist healthcare providers and clinics in optimizing their ingestion processes within the AWS HealthImaging platform.
In this post, we will delve into how this ingestion pipeline can effectively address industry challenges and talk about CitiusTech’s design and implementation approach.
Industry Challenges
- Resource-intensive processes for DICOM data transfer and optimization for storage: DICOM files are typically large in size, containing high-resolution images and associated metadata. Transferring these files across networks requires substantial bandwidth and time, especially in scenarios where large datasets need to be shared or archived. Optimization for storage involves compression techniques, which are computationally intensive and require significant processing power.
Further, ensuring data integrity, preserving image quality, and maintaining compliance with DICOM standards impose additional overhead. These factors, combined with the increasing volume of medical imaging data, require significant compute resources that are able to scale according to data volumes and performance requirements.
- Securing the DICOM ingestion pipeline: With the increasing sophistication of malware and rising frequency of cyberattacks targeting healthcare systems, ensuring the confidentiality, integrity, and availability of DICOM images has become paramount. Encryption, access control, and virus scanning protocols are essential components of a robust security strategy.
- Rapid, efficient data transfer: Ensuring fast and smooth data transfer in DICOM workflows is a big challenge for the healthcare industry. Hospitals and clinics need to quickly send large medical imaging files between different systems for timely diagnoses and treatments. They have to improve their network bandwidth and use effective, high-performance compression codecs—all while ensuring that dissimilar systems can work together effectively.
Solution Architecture
The following diagram represents CitiusTech’s DICOM ingestion solution architecture which leverages a suite of managed AWS services.
Figure 1 – DICOM ingestion pipeline solution architecture.
- DICOM data upload:
- The solution begins with the uploading of DICOM data through a dedicated application, which uses Amazon API Gateway to facilitate the upload process, with the data stored in an Amazon Simple Storage Service (Amazon S3) staging bucket.
- Amazon Cognito for authentication and authorization:
- The DICOM ingestion endpoint hosted on the API Gateway is integrated with Amazon Cognito user pools, providing authentication and authorization capabilities for users.
- AWS Lambda – Start Import API:
- Each DICOM file is tagged with a unique Import Id for an import session provided by the Start Import API, which is stored as Amazon S3 metadata for reference.
- Amazon S3 – staging bucket:
- Amazon API Gateway exposes an endpoint integrated with S3 to facilitate the DICOM file upload to the staging bucket.
- Amazon Simple Queue Service (SQS) – S3 staging bucket event:
- SQS consumes events from the S3 staging bucket and triggers a scanning agent when new data arrives.
- Amazon Elastic Compute Cloud (Amazon EC2) with auto scaling group – scanning agent:
- The scanning agent scales dynamically on Amazon EC2 instances managed within an auto scaling group, with scale-out controlled by the SQS queue length. Its primary function is to scan the uploaded DICOM images for potential malware. The scanning agent also performs file-based DICOM Part 10 validation, which checks the first 132 bytes of the file header (the 128-byte preamble followed by the 4-byte "DICM" prefix).
- Amazon S3 object tagging:
- Scanned images are tagged by the scanning agent for further processing with statuses: Malware Detected, Invalid DICOM, or Valid DICOM. Note that DICOM validity is per the DICOM File Format documentation.
- Amazon Simple Notification Service (SNS) – Scan Results:
- The scanning agent also publishes a notification with the scan results to an SNS topic.
- Amazon SNS triggers Lambda function:
- The Copy Clean Images Lambda function is subscribed to the SNS topic Scan Results. The function is triggered to manage the transfer of scanned images from the staging bucket to the DICOM input bucket.
- AWS Lambda function – Copy Clean Images:
- The Lambda function copies the DICOM images into an Import Id folder as per the metadata on the DICOM files.
- Amazon S3 – DICOM input bucket:
- The clean DICOM files are copied into the DICOM input bucket for ingesting into AWS HealthImaging. This ensures only valid DICOM files are sent to AWS HealthImaging.
- Amazon SQS – S3 input bucket trigger:
- As files are moved to the DICOM input bucket, each uploaded file triggers an SQS event indicating that the file is present and available to import.
- AWS Lambda function – trigger import job:
- A Lambda function processes these SQS events and triggers a DICOM import job to AWS HealthImaging per Import Id.
- AWS HealthImaging DICOM import:
- AWS HealthImaging DICOM import job imports the DICOM files into the managed service and generates a JSON file, indicating the success or failure of a particular import operation.
- AWS HealthImaging:
- The DICOM files are ingested in the AWS HealthImaging datastore by the HealthImaging Import Job. The DICOM studies are instantly available for viewing and other operations after import.
- Amazon S3 output bucket:
- The output files generated by the import job are stored in Success and Failure folders in the S3 output bucket. The Success folder contains a success.ndjson file with the results for all imaging files that imported successfully. Similarly, the Failure folder contains a failure.ndjson file with the results for all imaging files that did not import successfully.
- Example of success.ndjson:
{"inputFile":"dicomInputFolder/1.3.51.5145.5142.20010109.1105620.1.0.1.dcm","importResponse":{"imageSetId":"12345612345612345678907890789012"}}
{"inputFile":"dicomInputFolder/1.3.51.5145.5142.20010109.1105630.1.0.1.dcm","importResponse":{"imageSetId":"12345612345612345678917891789012"}}
- Example of failure.ndjson:
{"inputFile":"dicom_input/invalidDicomFile1.dcm","exception":{"exceptionType":"ValidationException","message":"DICOM attribute TransferSyntaxUID does not exist"}}
{"inputFile":"dicom_input/invalidDicomFile2.dcm","exception":{"exceptionType":"ValidationException","message":"DICOM attributes does not exist"}}
- AWS Lambda S3 trigger:
- The output files trigger a Lambda function that publishes notifications to the Success and Failure topics in Amazon SNS.
- AWS Lambda – trigger images ready notification:
- The Lambda function parses the success.ndjson and failure.ndjson files and publishes an SNS notification with the details of the successful and failed studies.
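To make the last step more concrete, here is a minimal Python sketch of how the contents of success.ndjson and failure.ndjson could be turned into a notification payload. It relies only on the record fields shown in the examples above (inputFile, importResponse.imageSetId, exception.message). The function name summarize_import_results and the shape of the returned payload are hypothetical and not taken from CitiusTech's implementation; reading the files from S3 and publishing to SNS are left out.

```python
import json
from typing import Iterable


def summarize_import_results(success_lines: Iterable[str],
                             failure_lines: Iterable[str]) -> dict:
    """Summarize import job output, given the lines of both NDJSON files.

    The payload layout is an illustrative assumption, not a documented format.
    """
    imported = []
    failed = []

    # Each non-empty line of an NDJSON file is one standalone JSON object.
    for line in success_lines:
        if line.strip():
            record = json.loads(line)
            imported.append({
                "inputFile": record["inputFile"],
                "imageSetId": record["importResponse"]["imageSetId"],
            })

    for line in failure_lines:
        if line.strip():
            record = json.loads(line)
            failed.append({
                "inputFile": record["inputFile"],
                "reason": record["exception"]["message"],
            })

    return {
        "importedCount": len(imported),
        "failedCount": len(failed),
        "imported": imported,
        "failed": failed,
    }
```

A Lambda handler could pass the decoded lines of each output file to this function and serialize the result as the SNS message body.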
Imaging viewer user interface (UI):
- AWS HealthImaging viewer UI allows users to view datastores, search ImageSets, view ImageSet Metadata, and view images.
- The application also allows users to search images by PatientID, Accession Number, Study UID, Study Instance UID, and Study Date.
- Imaging viewer UI is an open-source application that is available on GitHub. It can be deployed easily using AWS Amplify.
- For enhanced image viewing experience, the OHIF with AWS HealthImaging extension is available at:
- GitHub – OHIF/Viewers: OHIF zero-footprint DICOM viewer and oncology-specific lesion tracker, plus shared extension packages
- GitHub – RadicalImaging/ohif-aws-healthimaging
This solution seamlessly manages the upload, scanning, transfer, and verification of DICOM data while providing a robust framework for tracking and reporting the status of imports, thus ensuring the integrity and security of medical imaging data.
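The scanning step described above checks the first 132 bytes of each file and assigns one of three statuses. The following Python sketch shows what such a check might look like. In the DICOM File Format, a Part 10 file begins with a 128-byte preamble followed by the four bytes "DICM". The malware scan itself is represented by a boolean input because the post does not name the scanning engine, and the names ScanStatus, has_part10_header, classify, and should_copy_to_input_bucket are hypothetical.

```python
from enum import Enum

PREAMBLE_LENGTH = 128                                # bytes before the prefix
DICOM_PREFIX = b"DICM"                               # 4-byte Part 10 prefix
HEADER_LENGTH = PREAMBLE_LENGTH + len(DICOM_PREFIX)  # 132 bytes in total


class ScanStatus(Enum):
    # Values mirror the three statuses listed in the object tagging step.
    MALWARE_DETECTED = "Malware Detected"
    INVALID_DICOM = "Invalid DICOM"
    VALID_DICOM = "Valid DICOM"


def has_part10_header(header: bytes) -> bool:
    """Return True if the bytes start with a preamble plus the DICM prefix."""
    return (len(header) >= HEADER_LENGTH
            and header[PREAMBLE_LENGTH:HEADER_LENGTH] == DICOM_PREFIX)


def classify(header: bytes, malware_found: bool) -> ScanStatus:
    """Combine the (external) malware verdict with the header check."""
    if malware_found:
        return ScanStatus.MALWARE_DETECTED
    if not has_part10_header(header):
        return ScanStatus.INVALID_DICOM
    return ScanStatus.VALID_DICOM


def should_copy_to_input_bucket(status: ScanStatus) -> bool:
    """Only clean, valid files move on to the DICOM input bucket."""
    return status is ScanStatus.VALID_DICOM
```

Note that a header check of this kind only confirms the file framing. Deeper attribute validation, such as the missing TransferSyntaxUID case in the failure.ndjson example, is still reported by the import job.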
Solution Highlights
- Low-code: This architecture minimizes custom coding by utilizing native integrations:
- Amazon API Gateway acts as an S3 proxy.
- Amazon API Gateway integrates seamlessly with Amazon Cognito.
- Amazon S3 triggers SQS for event-driven processing.
- Managed HealthImaging DICOM import jobs automate reading files, creating image sets, and writing results to an S3 bucket.
- Secured image upload: Amazon API Gateway securely exposes an S3 bucket as a REST API endpoint, with authentication and authorization handled by Amazon Cognito.
- Data isolation: Images from different tenants are organized into separate folders within S3 staging buckets for initial processing, ensuring pre-ingestion isolation. Clean and validated DICOM images are then moved to the DICOM input bucket for ingestion. The images from different tenants are stored in separate data stores within AWS HealthImaging to maintain data isolation.
- Virus scanning and DICOM validation: Newly-uploaded DICOM images undergo virus/malware scanning and DICOM validation.
- Event-driven architecture: The solution leverages native SQS trigger events with S3 for scanning newly-added objects. Amazon SNS notifies an AWS Lambda function to copy clean images to the input bucket and notify the DICOM image uploader about successful/failed image ingestion.
- Scalable data ingestion: Scalability is achieved using managed services, including an auto scaling group for the scanning agent and AWS Lambda for tasks such as copying images, triggering AWS HealthImaging DICOM import jobs, and handling notifications. AWS HealthImaging can scale DICOM import jobs horizontally to provide high parallel ingestion performance for large data volumes.
- Automated data tiering: AWS HealthImaging uses intelligent tiering for automatic clinical lifecycle management. After import, the image sets start in the Frequent Access Tier and after 30 consecutive days of no access, image sets automatically move to the Archive Instant Access Tier. This translates into significant data storage savings while maintaining the same performance between Frequent Access and Archive Instant Access tiers.
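As an illustration of the event-driven flow in which a Lambda function turns SQS-delivered S3 events into one import job per Import Id, the sketch below groups incoming object keys by their Import Id folder. It assumes that each SQS message body is a standard S3 event notification and that objects in the input bucket are laid out as "<Import Id>/<file name>"; the post only says files are copied into an Import Id folder, so the exact layout is an assumption. The function name import_prefixes_from_sqs_event is hypothetical.

```python
import json
from urllib.parse import unquote_plus


def import_prefixes_from_sqs_event(sqs_event: dict) -> dict:
    """Map each Import Id found in the batch to its S3 folder URI."""
    prefixes = {}
    for message in sqs_event.get("Records", []):
        s3_event = json.loads(message["body"])
        # Test events that S3 sends when notifications are configured
        # carry no "Records" key, so they are skipped here.
        for record in s3_event.get("Records", []):
            bucket = record["s3"]["bucket"]["name"]
            # Object keys arrive URL-encoded in S3 event notifications.
            key = unquote_plus(record["s3"]["object"]["key"])
            import_id, separator, _ = key.partition("/")
            if not separator:
                continue  # object is not inside an Import Id folder
            prefixes[import_id] = f"s3://{bucket}/{import_id}/"
    return prefixes
```

Each resulting URI could then serve as the input location of a single HealthImaging DICOM import job, so that many per-file events within a batch result in one job per Import Id.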
Conclusion
In this post, we showcased how CitiusTech is using AWS HealthImaging to implement a DICOM ingestion pipeline to enable healthcare and life sciences organizations to implement a scalable process for DICOM data management. We have identified a few key challenges and have built a scalable solution on AWS.
A 100% healthcare and life sciences-focused technology services organization since 2005, CitiusTech has been a partner to leading MedTech organizations and developed a deep understanding of the challenges in medical imaging.
For more information on its MedTech offerings, visit the CitiusTech website. You can also learn more about CitiusTech on AWS Marketplace.
CitiusTech – AWS Partner Spotlight
CitiusTech is an AWS Specialization Partner that has developed a DICOM ingestion solution by leveraging AWS HealthImaging and other AWS services. It aims to assist healthcare providers and clinics in optimizing their ingestion processes within the AWS HealthImaging platform.