AWS Machine Learning Blog
Category: Amazon SageMaker
Fine-tune Meta Llama 3.1 models using torchtune on Amazon SageMaker
In this post, AWS collaborates with Meta’s PyTorch team to showcase how you can use PyTorch’s torchtune library to fine-tune Meta Llama-like architectures while using a fully managed environment provided by Amazon SageMaker Training.
Accelerate pre-training of Mistral’s Mathstral model with highly resilient clusters on Amazon SageMaker HyperPod
In this post, we present an in-depth guide to starting a continual pre-training job using PyTorch Fully Sharded Data Parallel (FSDP) for Mistral AI’s Mathstral model with SageMaker HyperPod.
CRISPR-Cas9 guide RNA efficiency prediction with efficiently tuned models in Amazon SageMaker
The clustered regularly interspaced short palindromic repeat (CRISPR) technology holds the promise to revolutionize gene editing, transforming the way we understand and treat diseases. This technique is based on a natural mechanism found in bacteria that allows a protein coupled to a single guide RNA (gRNA) strand to locate and make […]
Improve RAG performance using Cohere Rerank
In this post, we show you how to use Cohere Rerank to improve search efficiency and accuracy in Retrieval Augmented Generation (RAG) systems.
Build ultra-low latency multimodal generative AI applications using sticky session routing in Amazon SageMaker
In this post, we explain how the new sticky routing feature in Amazon SageMaker allows you to achieve ultra-low latency and enhance your end-user experience when serving multimodal models.
Build a RAG-based QnA application using Llama3 models from SageMaker JumpStart
In this post, we provide a step-by-step guide for creating an enterprise-ready RAG application such as a question answering bot. We use two models from Amazon SageMaker JumpStart: the Llama3-8B FM for text generation and the BGE Large EN v1.5 text embedding model for generating embeddings.
Best prompting practices for using Meta Llama 3 with Amazon SageMaker JumpStart
In this post, we dive into the best practices and techniques for prompting Meta Llama 3 using Amazon SageMaker JumpStart to generate high-quality, relevant outputs. We discuss how to use system prompts and few-shot examples, and how to optimize inference parameters, so you can get the most out of Meta Llama 3.
Enabling production-grade generative AI: New capabilities lower costs, streamline production, and boost security
As generative AI moves from proofs of concept (POCs) to production, we’re seeing a massive shift in how businesses and consumers interact with data, information—and each other. In what we consider “Act 1” of the generative AI story, we saw previously unimaginable amounts of data and compute create models that showcase the power of generative […]
Scaling Thomson Reuters’ language model research with Amazon SageMaker HyperPod
In this post, we explore the journey that Thomson Reuters took to enable cutting-edge research in training domain-adapted large language models (LLMs) using Amazon SageMaker HyperPod, an Amazon Web Services (AWS) feature focused on providing purpose-built infrastructure for distributed training at scale.
Introducing Amazon EKS support in Amazon SageMaker HyperPod
This post is designed for Kubernetes cluster administrators and ML scientists, providing an overview of the key features that SageMaker HyperPod introduces to facilitate large-scale model training on an EKS cluster.