AWS HPC Blog
Category: Compute
Diving Deeper into Fair-Share Scheduling in AWS Batch
Today we dive into details of AWS Batch fair share policies and show how they affect job placement. You’ll see the result of different share policies, and hear about practical use cases where you can benefit from fair share job queues in Batch.
Call for participation: RADIUSS Tutorial Series 2023
Lawrence Livermore National Laboratory (LLNL) and AWS are again joining forces to provide a training opportunity for emerging HPC tools and applications. In this post you’ll find the details of those tutorials and learn how to participate.
Automate your clusters by creating self-documenting HPC with AWS ParallelCluster
Today we’re going to show you how you can automate cluster deployment and create self-documenting infrastructure at the same time, which leads to more repeatable results that are easier to manage (and replicate).
Running protein structure prediction at scale using a web interface for researchers
Today, we’ll show you our open-source sample implementation of a web frontend and cloud HPC backend to support researchers using AI tools like AlphaFold for drug discovery and design.
Instance sizes in the Amazon EC2 Hpc7 family – a different experience
Hpc7g is the first Amazon EC2 HPC instance offering with multiple instance sizes, but this is quite different from the experience of getting smaller instances from other non-HPC instance families. Today, we want to take a moment to explore why this is different, and how it helps.
Application deep-dive into the AWS Graviton3E-based Amazon EC2 Hpc7g instance
In this post we’ll show you application performance and scaling results from Hpc7g, a new instance powered by AWS Graviton3E, across a wide range of HPC workloads and disciplines.
How SeatGeek simulates massive load with AWS Batch to prepare for big events
In this post we explore SeatGeek’s load testing system that simulates 50k simultaneous users. Originally built to prep SeatGeek for large-event traffic spikes, it now runs weekly to help them harden their code.
Customize Slurm settings with AWS ParallelCluster 3.6
With AWS ParallelCluster 3.6, you can directly specify Slurm settings in the cluster config file, improving reproducibility and taking another step towards self-documenting HPC infrastructure.
Protein Structure Prediction at Scale using AWS Batch
In this post, we discuss how Novo Nordisk approached the deployment of a scale-out HPC platform for running AlphaFold, while meeting their enterprise IT requirements and keeping the user experience simple.
Streamlining distributed ML workflow orchestration using Covalent with AWS Batch
Complicated multi-step workflows can be challenging to deploy, especially when using a variety of high-compute resources. Covalent is an open-source orchestration tool that streamlines the deployment of distributed workloads on AWS resources. In this post, we outline key concepts in Covalent and develop a machine learning workflow for AWS Batch in just a handful of steps.