AWS HPC Blog
Category: Compute
Introducing GPU health checks in AWS ParallelCluster 3.6
AWS ParallelCluster 3.6.0 can now detect GPU failures in HPC and AI/ML tasks. Health checks run at the start of Slurm jobs, and if a check fails, the job is requeued on another instance. This can increase reliability and prevent wasted spend.
Benchmarking the Oxford Nanopore Technologies basecallers on AWS
Oxford Nanopore sequencers enable direct, real-time analysis of long DNA or RNA fragments. They work by monitoring changes to an electrical current as nucleic acids pass through a protein nanopore. Compute-intensive algorithms called basecallers then decode the resulting signal into the specific DNA or RNA sequence. This blog post presents benchmarking results for two of the Oxford Nanopore basecallers, Guppy and Dorado, on AWS. The benchmarking project was a collaboration between G42 Healthcare, Oxford Nanopore Technologies, and AWS.
Run Celery workers for compute-intensive tasks with AWS Batch
Many applications leverage distributed task systems like Celery to handle asynchronous work. In this post, we describe how to handle compute-intensive Celery tasks using AWS Batch to scale the compute resources and run worker agents.
Simulating climate risk scenarios for the Amazon Rainforest
In this post, we discuss the “tipping point” problem and the use of HPC at large scale to simulate the impact of deforestation on the risk of accelerating damage to the Amazon rainforest.
The benefits of computational chemistry for the circular economy
In this blog post, we’ll explore the benefits of computational chemistry for the circular economy, show how it can help reduce waste, and describe its potential for producing innovative new materials.
Explore costs of AWS Batch jobs run on Amazon EKS using pod labels and Kubecost
Today we show you how to get insights into the costs of running AWS Batch workloads on Amazon EKS using Kubernetes pod labels with Kubecost.
Elastic visualization queues with NICE DCV in AWS ParallelCluster
In this blog post, we’ll show you how to create an elastic pool of visualization nodes by combining AWS ParallelCluster with NICE DCV in a novel way.
Checkpointing HPC applications using the Spot Instance two-minute notification from Amazon EC2
In this post, we show you how to create an HPC cluster and capture the two-minute warning notifications from Amazon EC2 Spot to trigger a checkpoint reactively.
Install optimized software with Spack configs for AWS ParallelCluster
Today, we’re announcing the availability of Spack configs for AWS ParallelCluster. You can use these configurations to install optimized HPC applications quickly and easily on your AWS-powered HPC clusters.
Building a 4x faster and more scalable algorithm using AWS Batch for Amazon Logistics
In this post, AWS Professional Services highlights how they helped data scientists from Amazon Logistics rearchitect an algorithm that improves the efficiency of their supply chain through better planning decisions. By applying best practices for deploying scalable HPC applications on AWS, the teams saw a 4x improvement in run time.