AWS HPC Blog

Bridging research and HPC to tackle grand challenges

Today we announced the AWS Impact Computing Project at the Harvard Data Science Initiative (HDSI) to identify potential solutions that can improve the lives of humans, other species, and natural ecosystems. Deb Goldfarb describes its goals and our joint vision.

Support for Instance Allocation Flexibility in AWS ParallelCluster 3.3

AWS ParallelCluster 3.3.0 now lets you define a list of Amazon EC2 instance types for resourcing a compute queue. This gives you more flexibility to optimize the cost and total time to solution of your HPC jobs, especially when capacity is limited or you’re using Spot Instances.

Hyper Metal: Scaling AWS Instances Up with TidalScale

In this post we show you how to scale large-memory, high CPU-count single system-image environments on AWS with HyperMetal-powered instances by TidalScale.

How AWS Batch developed support for Amazon Elastic Kubernetes Service

Today, we discuss AWS Batch on Amazon EKS: the initial motivation, the design choices the team made when we developed the service, and some of the challenges we had to overcome.

Minimize HPC compute costs with all-or-nothing instance launching

In this post, we highlight a little-known configuration option for Slurm on AWS ParallelCluster that can reduce costs and increase your iteration speed by preventing idle batch instances from launching when EC2 capacity is limited.

BioContainers are now available in Amazon ECR Public Gallery

Today we are excited to announce that all 9000+ applications provided by the BioContainers community are available within ECR Public Gallery! You don’t need an AWS account to access these images, but having one allows many more pulls to the internet, and unmetered usage within AWS. If you perform any sort of bioinformatics analysis on AWS, you should check it out!

Optimize Protein Folding Costs with OpenFold on AWS Batch

In this post, we describe how to orchestrate protein folding jobs on AWS Batch. We also compare the performance of OpenFold and AlphaFold on a set of public targets. Finally, we discuss how to optimize your protein folding costs.

Getting the Best Price Performance for Numerical Weather Prediction Workloads on AWS

In this post, we provide an overview of Numerical Weather Prediction (NWP) workloads and the AWS HPC-optimized services for running them. We’ll test three popular NWP codes: WRF, MPAS, and FV3GFS.

Rearchitecting AWS Batch managed services to leverage AWS Fargate

AWS service teams continuously improve the underlying infrastructure and operations of managed services, and AWS Batch is no exception. The AWS Batch team recently moved most of their job scheduler fleet to a serverless infrastructure model leveraging AWS Fargate. I had a chance to sit with Devendra Chavan, Senior Software Development Engineer on the AWS Batch team, to discuss the move to AWS Fargate and its impact on the Batch managed scheduler service component.

Easing your migration from SGE to Slurm in AWS ParallelCluster 3

This post will help you understand the tools available to ease the stress of migrating your cluster (and your users) from SGE to Slurm, which is necessary since the HPC community is no longer supporting SGE’s open-source codebase.