AWS News Blog
Category: Amazon SageMaker AI
Maximize accelerator utilization for model development with new Amazon SageMaker HyperPod task governance
Enable priority-based resource allocation, fair-share utilization, and automated task preemption for optimal compute utilization across teams.
Amazon SageMaker HyperPod introduces Amazon EKS support
Amazon SageMaker HyperPod’s integration with Amazon EKS brings resilience, observability, and flexibility to large model training, reducing downtime by up to 40%.
Introducing Amazon Q Developer in SageMaker Studio to streamline ML workflows
Streamline your ML workflows with this generative AI assistant, which provides tailored guidance, code generation, and error troubleshooting to help you build, train, and deploy models efficiently.
Introducing Amazon SageMaker HyperPod, a purpose-built infrastructure for distributed training at scale
Today, we are introducing Amazon SageMaker HyperPod, which helps reduce the time to train foundation models (FMs) by providing purpose-built infrastructure for distributed training at scale. You can now use SageMaker HyperPod to train FMs for weeks or even months while SageMaker actively monitors cluster health and provides automated node and job resiliency by […]

