AWS Public Sector Blog

4 best practices to enhance research IT operations with AWS


Academic research IT departments around the world face the same challenge: how to balance their existing on-premises infrastructure with the opportunities of cloud computing. At the Supercomputing 2024 (SC24) conference, Amazon Web Services (AWS) hosted a panel featuring two research IT leaders: Circe Tsui, associate director of solutions architecture at Emory University in the Office of Information Technology, and Dr. Robert Shen, director of the RMIT AWS Supercomputing Hub (RACE) at the Royal Melbourne Institute of Technology (RMIT).

During the panel, Tsui and Shen shared how their institutions use AWS to augment and enhance their research operations with more scalability, security, and collaboration alongside their on-premises infrastructure. Read this post to learn their best practices for creating an effective hybrid cloud approach for academic research institutions.

Benefits to augmenting on-premises HPC with the cloud

Leading academic research institutions are finding that integrating AWS with their existing on-premises high performance computing (HPC) infrastructure helps them maximize these investments and unlock new capabilities, such as:

  • Flexible workflows: The cloud adds flexibility to scale operations seamlessly between on-premises and cloud environments, allowing researchers to quickly provision the most appropriate resource for their tasks. In this model, as Dr. Shen noted, some computing tasks can start with a laptop or on an iPad, and then move into the cloud, while maintaining the ability to switch back for tasks best suited to local infrastructure.
  • Scalability with speed: Cloud-powered HPC also offers speed and scalability that many on-premises HPC servers can’t match. Tsui shared an example about one researcher at Emory who wanted to quickly analyze 22 genomic sequence samples. The researcher could only process one sample at a time with the on-premises HPC cluster, so they asked to be set up with AWS. Even with minimal AWS experience, the researcher analyzed the sequences simultaneously in the cloud and finished their analysis in two to three hours, an analysis that used to take three days.
  • Secure global collaboration: The cloud makes it easier for research teams to share data and discoveries securely and in compliance with regulations. Before using AWS, Tsui told researchers who needed to share data to use a network attached storage (NAS) device and run multiple File Transfer Protocol (FTP) transfers. Now, she tells researchers to use Amazon Simple Storage Service (Amazon S3) buckets to make data accessible worldwide and enhance researcher collaboration.
  • Access new results with advanced computing: Both leaders emphasized the importance of cloud enabling access to cutting-edge technologies. AWS investments in hardware like AWS Trainium chips, purpose-built for the next generation of generative artificial intelligence (AI) workloads, help researchers overcome fixed infrastructure limitations and open up new avenues for scientific exploration and advancement.

By augmenting their on-premises environment with AWS Cloud, institutions can offer their research communities the best of both worlds: stability and scalability with security.
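To make the scaling arithmetic in the Emory genomics example concrete, here is a rough sketch in Python. It assumes that "three days" means about 72 hours of wall-clock time and that all 22 samples take the same time to process. The panel stated neither, and the function name and worker counts are illustrative only.

```python
import math

def wall_clock_hours(samples: int, hours_per_sample: float, workers: int) -> float:
    """Estimate wall-clock time when `workers` samples run at once.

    Assumes every sample takes the same time and samples run in
    waves of up to `workers`, which real pipelines only approximate.
    """
    waves = math.ceil(samples / workers)
    return waves * hours_per_sample

SAMPLES = 22             # from the Emory example
SEQUENTIAL_HOURS = 72.0  # assumption: "three days" of wall-clock time
per_sample = SEQUENTIAL_HOURS / SAMPLES

on_prem = wall_clock_hours(SAMPLES, per_sample, workers=1)
cloud = wall_clock_hours(SAMPLES, per_sample, workers=SAMPLES)
print(f"one at a time: {on_prem:.1f} h, all at once: {cloud:.1f} h")
```

Under these assumptions, running every sample at once takes about as long as a single sample, which is the same order of magnitude as the two to three hours Tsui reported.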

Best practices from research IT leaders

Achieving an effective and sustainable hybrid cloud model doesn’t happen overnight. Even cloud champions like Emory and RMIT needed a strategy to navigate common challenges. Tsui and Shen offered the following advice for how institutions can shape their cloud journey.

1. Facilitate culture change

The biggest challenge in introducing cloud technology into an established research institution is people, not technology. Institutions need to invest in change management to see results.

One effective strategy is showcasing cloud success stories from other researchers. Dr. Shen counteracted cloud resistance at RMIT by demonstrating how one research group achieved 100 times faster outcomes using AWS.

RMIT is also guiding researchers to the most appropriate infrastructure by cataloging use cases to clarify which workloads are better suited for on-premises or cloud resources. Pre-built solutions are another way to reduce friction in cloud adoption. Emory’s Sloan Lab, integrated with AWS, lets researchers run scripts without learning the AWS console. Similarly, RMIT offers pre-made images in the cloud for material science and other domains to help accelerate research.

Bringing in external experts through the AWS Partner Network supports a researcher’s journey into the cloud. AWS Partners offer several services directed to research workloads, such as simple interfaces that help researchers create and control AWS computing resources, set and monitor budgets, and forecast spend; secure, performant, and scalable research environments; and open source HPC resource provisioning.

These strategies meet researchers where they are, offering targeted pathways to explore, test, and adopt cloud solutions that enhance research outcomes.

2. Rethink funding models

Cloud computing shifts research budgets from capital expenses (CapEx) to operational expenses (OpEx). CapEx involves upfront investments in infrastructure, while OpEx allows researchers to scale resources and pay only for what they use. However, this flexibility requires careful cost management.

Tsui recommends using tools like AWS Cost Explorer, which lets users visualize and manage AWS costs and usage over time, and AWS Budgets to estimate, track, and monitor research costs. By tagging and tracking specific server instances, research teams can also see which analyses consume more resources and look for more efficient solutions as necessary.
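The tag-based tracking described above can be sketched in a few lines of Python. This is a minimal illustration, not Emory's actual tooling: the record layout, the "project" tag key, and the dollar amounts are all hypothetical, and in practice the records would come from a cost report filtered by cost allocation tags.

```python
from collections import defaultdict

def cost_by_tag(records, tag_key):
    """Sum cost per value of `tag_key`, highest total first.

    Records without the tag are grouped under "untagged" so that
    untracked spend stays visible.
    """
    totals = defaultdict(float)
    for record in records:
        tag_value = record.get("tags", {}).get(tag_key, "untagged")
        totals[tag_value] += record["cost_usd"]
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))

# Hypothetical usage records; every value here is made up for illustration.
records = [
    {"instance": "i-aaa", "tags": {"project": "genome-assembly"}, "cost_usd": 12.5},
    {"instance": "i-bbb", "tags": {"project": "genome-assembly"}, "cost_usd": 30.0},
    {"instance": "i-ccc", "tags": {"project": "materials-sim"}, "cost_usd": 7.25},
    {"instance": "i-ddd", "tags": {}, "cost_usd": 4.0},
]

totals = cost_by_tag(records, "project")
for project, cost in totals.items():
    print(f"{project}: ${cost:.2f}")
```

Grouping untagged resources into their own bucket is a deliberate choice here: it shows how much spend cannot yet be attributed to a specific analysis.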

Building budget alerts and alarms can also help control costs in the cloud. Dr. Shen sets up a budget in advance for researchers using RACE and, with services like Amazon CloudWatch, sends researchers alerts when they’ve used a certain percentage of their budget to mitigate budget overruns.
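The logic behind percentage-based budget alerts can be sketched as follows. This is not how RACE implements its alerts; the 50, 80, and 100 percent thresholds and the function name are assumptions, and on AWS the notification itself would be handled by a managed service rather than by custom code.

```python
def crossed_thresholds(budget_usd, spend_usd, thresholds=(0.5, 0.8, 1.0), already_sent=()):
    """Return the budget fractions that spend has reached and that
    have not yet triggered an alert, in ascending order."""
    if budget_usd <= 0:
        raise ValueError("budget_usd must be positive")
    used = spend_usd / budget_usd
    return [t for t in sorted(thresholds) if used >= t and t not in already_sent]

# Hypothetical project: 1,000 USD budget, 850 USD spent so far.
for fraction in crossed_thresholds(1000.0, 850.0):
    print(f"alert: {fraction:.0%} of budget used")
```

Passing the thresholds that have already fired as `already_sent` keeps researchers from receiving the same alert on every check.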

Optimizing resources in the cloud is another way to rethink typical research costs. Dr. Shen recommends researchers make sure their code supports the efficient utilization of CPUs and GPUs. And, when research projects are not actively using cloud resources, those servers can be set to automatically shut down to avoid unnecessary costs.
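One simple way to decide when an idle server can be shut down is to look at its recent CPU utilization. The sketch below shows only that decision, under assumed values: the 5 percent idle threshold, the six-sample window, and the function name are illustrative, and the panel did not describe a specific rule.

```python
def should_stop(cpu_percent_samples, idle_below=5.0, min_samples=6):
    """Return True when the most recent `min_samples` CPU readings
    are all below `idle_below` percent.

    With too few readings the function returns False, so a freshly
    started server is never stopped by mistake.
    """
    if len(cpu_percent_samples) < min_samples:
        return False
    recent = cpu_percent_samples[-min_samples:]
    return max(recent) < idle_below

# Hypothetical readings taken at regular intervals.
print(should_stop([2.0, 1.5, 3.0, 2.2, 1.0, 2.4]))   # every reading is under 5%
print(should_stop([2.0, 1.5, 3.0, 40.0, 1.0, 2.4]))  # one busy reading in the window
```

Requiring every reading in the window to be low, rather than the average, avoids stopping a server in the middle of a bursty job. GPU-heavy workloads would need a GPU utilization signal as well.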

3. Enable secure collaboration

Today, research teams extend far beyond campus boundaries, crossing institutional and international borders, and they need to share data and discoveries without compromising security or compliance.

The cloud simplifies security with built-in guardrails like encryption, firewalls, and continuous monitoring tools. Tsui shared how Emory protects and monitors its cloud environment with a rigorous firewall managed through AWS Transit Gateway, along with Amazon GuardDuty, AWS CloudTrail, and other security services.

AWS Control Tower streamlines multi-account setups so that institutions can onboard multiple researchers to a cloud platform quickly while keeping every account in the environment compliant with the necessary regulations. However, researchers must still follow best practices to maintain data security, such as avoiding identifiable names for storage buckets. Shared responsibility between IT teams and researchers keeps data protected in the cloud.
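As one illustration of the bucket-naming advice, the Python sketch below builds a name from a generic prefix and a random identifier instead of a person, lab, or study name. The prefix, the function name, and the validation pattern are assumptions; the pattern is a deliberately conservative subset of the Amazon S3 naming rules (3 to 63 characters, lowercase letters, digits, and hyphens, starting and ending with a letter or digit), not a complete check.

```python
import re
import uuid

# Conservative subset of S3 bucket naming rules (dots are not allowed here).
BUCKET_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9-]{1,61}[a-z0-9]$")

def opaque_bucket_name(prefix="research"):
    """Return a bucket name that reveals nothing about its contents."""
    name = f"{prefix}-{uuid.uuid4().hex}"
    if not BUCKET_NAME_RE.fullmatch(name):
        raise ValueError(f"invalid bucket name: {name!r}")
    return name

print(opaque_bucket_name())
```

A mapping from each opaque name to its project would then live somewhere access-controlled, such as resource tags or an internal catalog, rather than in the bucket name itself.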

4. Empower researchers with training tools

Equipping researchers with the skills to use cloud resources effectively helps them blend cloud-powered and on-premises HPC more independently. Both Emory and RMIT offer frequent hands-on labs and a library of training material. Emory’s monthly introductory sessions provide an overview of cloud concepts, shared responsibilities, and compliance requirements, while RMIT runs targeted training programs tailored to research needs. Both institutions also regularly contact their cloud-enabled researchers to help them overcome roadblocks and keep research on track.

Many institutions, however, do not have the resources to support training and upskilling across all research teams. To help, AWS recently launched new training resources for academic researchers. These researcher-centered learning paths build practical cloud skills and expertise in specific focus areas aligned with researchers’ objectives. Covering HPC, quantum, statistics, AI and machine learning (ML), generative AI, and more, these training offerings bring researchers up to speed on what’s possible in the cloud and how to get there.

Enhance your research journey with the cloud

When using the cloud for research, the mindset shouldn’t be: How can I replicate my research workloads on the on-premises infrastructure? The mindset should be: What outcomes am I trying to achieve? Effectively integrating cloud computing with on-premises HPC can help researchers stay agile, flexible, and innovative as they work to uncover breakthroughs.

AWS is committed to building products and relationships that facilitate innovation and scientific discovery. Explore the new academic research training resources to get started with the cloud to support your research needs.

Visit the AWS Research Computing webpage to learn more about how institutions turn to AWS to develop creative and scalable ways to address the cloud skills gap, both through upskilling and in traditional academic settings.
