AWS Machine Learning Blog
Category: Amazon SageMaker Studio
Customize small language models on AWS with automotive terminology
In this post, we guide you through the phases of customizing SLMs on AWS, with a specific focus on automotive terminology for diagnostics as a Q&A task. We begin with the data analysis phase and progress through the end-to-end process, covering fine-tuning, deployment, and evaluation. We compare a customized SLM with a general-purpose LLM, using various metrics to assess vocabulary richness and overall accuracy.
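As a simple illustration of one vocabulary-richness measure such a comparison could use, the sketch below computes a type-token ratio over model answers. The metric choice and the example answers are assumptions for illustration, not the exact evaluation from the post.

```python
# Hypothetical example: type-token ratio as a rough vocabulary-richness signal
# when comparing answers from a customized SLM and a general-purpose LLM.
def type_token_ratio(text: str) -> float:
    tokens = text.lower().split()
    return len(set(tokens)) / len(tokens) if tokens else 0.0

slm_answer = "Check the oxygen sensor wiring and the catalytic converter efficiency code."
llm_answer = "Check the car. The car might have an issue with the car sensor."
print(type_token_ratio(slm_answer), type_token_ratio(llm_answer))
```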
Build a reverse image search engine with Amazon Titan Multimodal Embeddings in Amazon Bedrock and AWS managed services
In this post, you will learn how to extract key objects from image queries with Amazon Rekognition and build a reverse image search engine using Amazon Titan Multimodal Embeddings in Amazon Bedrock combined with Amazon OpenSearch Serverless.
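As a rough sketch of the embedding step, the snippet below calls the Titan Multimodal Embeddings model through the Amazon Bedrock runtime and returns a vector that could then be used in a k-NN query against an OpenSearch index. The model ID, Region, and index details are assumptions for illustration.

```python
# Minimal sketch: embed an image with Amazon Titan Multimodal Embeddings via the
# Bedrock runtime. Model ID and Region are assumptions for illustration.
import base64
import json

import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def embed_image(image_path: str) -> list[float]:
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-image-v1",  # Titan Multimodal Embeddings G1
        body=json.dumps({"inputImage": image_b64}),
        contentType="application/json",
        accept="application/json",
    )
    return json.loads(response["body"].read())["embedding"]

# The resulting vector can drive a k-NN query against an OpenSearch Serverless
# vector index, e.g. {"knn": {"image_vector": {"vector": v, "k": 10}}}.
```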
Use Amazon SageMaker Studio with a custom file system in Amazon EFS
In this post, we explore three scenarios demonstrating the versatility of integrating Amazon EFS with SageMaker Studio. These scenarios highlight how Amazon EFS can provide a scalable, secure, and collaborative data storage solution for data science teams.
How Northpower used computer vision with AWS to automate safety inspection risk assessments
In this post, we share how Northpower has worked with their technology partner Sculpt to reduce the effort and carbon required to identify and remediate public safety risks. Specifically, we cover the computer vision and artificial intelligence (AI) techniques used to combine datasets into a list of prioritized tasks for field teams to investigate and mitigate.
Control data access to Amazon S3 from Amazon SageMaker Studio with Amazon S3 Access Grants
In this post, we demonstrate how to simplify data access to Amazon S3 from SageMaker Studio using S3 Access Grants, specifically for different user personas mapped to IAM principals.
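The sketch below shows, assuming an Access Grants instance and a grant on the prefix already exist, how a Studio user's IAM principal could request just-in-time credentials scoped to that prefix. The account ID, bucket, and prefix are placeholders for illustration.

```python
# Minimal sketch: request temporary credentials from S3 Access Grants for a
# granted prefix, then use them to read data from SageMaker Studio.
import boto3

s3control = boto3.client("s3control")

grant = s3control.get_data_access(
    AccountId="111122223333",               # hypothetical account ID
    Target="s3://example-bucket/team-a/*",  # prefix covered by an access grant
    Permission="READ",
)
creds = grant["Credentials"]

# The just-in-time credentials only allow access to the granted prefix
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
print(s3.list_objects_v2(Bucket="example-bucket", Prefix="team-a/"))
```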
Making traffic lights more efficient with Amazon Rekognition
In this blog post, we show you how Amazon Rekognition can help mitigate congestion at traffic intersections and reduce operations and maintenance costs.
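As a minimal sketch of the kind of signal involved, the snippet below counts vehicles in a traffic-camera frame with Amazon Rekognition label detection; the S3 bucket, key, and label set are assumptions for illustration.

```python
# Minimal sketch: count vehicles in a camera frame stored in S3 using
# Rekognition label detection, as one input to a congestion signal.
import boto3

rekognition = boto3.client("rekognition")

response = rekognition.detect_labels(
    Image={"S3Object": {"Bucket": "example-traffic-frames", "Name": "cam-12/frame-0001.jpg"}},
    MaxLabels=50,
    MinConfidence=80,
)

# Count bounding-box instances of vehicle-related labels in the frame
vehicle_count = sum(
    len(label["Instances"])
    for label in response["Labels"]
    if label["Name"] in {"Car", "Truck", "Bus", "Motorcycle"}
)
print(f"Vehicles detected: {vehicle_count}")
```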
Accelerate development of ML workflows with Amazon Q Developer in Amazon SageMaker Studio
In this post, we present a real-world use case analyzing the Diabetes 130-US hospitals dataset to develop an ML model that predicts the likelihood of readmission after discharge.
Align Meta Llama 3 to human preferences with DPO, Amazon SageMaker Studio, and Amazon SageMaker Ground Truth
In this post, we show you how to enhance the performance of Meta Llama 3 8B Instruct by fine-tuning it using direct preference optimization (DPO) on data collected with SageMaker Ground Truth.
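For orientation, here is a minimal DPO sketch using the Hugging Face TRL library on preference pairs in the prompt/chosen/rejected format that Ground Truth labels can be exported to. The dataset path and hyperparameters are assumptions, and exact TRL argument names vary by version.

```python
# Minimal sketch of direct preference optimization (DPO) with Hugging Face TRL
# on Meta Llama 3 8B Instruct. Dataset file and hyperparameters are placeholders.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Preference pairs with "prompt", "chosen", and "rejected" columns
# (hypothetical file name exported from Ground Truth).
dataset = load_dataset("json", data_files="preference_pairs.jsonl", split="train")

config = DPOConfig(output_dir="llama3-dpo", per_device_train_batch_size=1,
                   num_train_epochs=1, beta=0.1)
trainer = DPOTrainer(model=model, args=config, train_dataset=dataset,
                     processing_class=tokenizer)
trainer.train()
```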
Use LangChain with PySpark to process documents at massive scale with Amazon SageMaker Studio and Amazon EMR Serverless
In this post, we explore how to build a scalable and efficient Retrieval Augmented Generation (RAG) system using the new EMR Serverless integration, Spark's distributed processing, and an Amazon OpenSearch Service vector store orchestrated with the LangChain framework. This solution enables you to process massive volumes of textual data, generate relevant embeddings, and store them in a vector database for seamless retrieval and generation.
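A compressed sketch of this ingestion pattern appears below: documents are chunked with LangChain, embedded per Spark partition, and written to an OpenSearch vector index. The endpoint, index name, embedding model ID, and corpus path are assumptions, and authentication setup is omitted for brevity.

```python
# Sketch of distributed RAG ingestion: chunk with LangChain, embed per Spark
# partition with Bedrock, and index into OpenSearch. Names are placeholders.
from langchain_aws import BedrockEmbeddings
from langchain_community.vectorstores import OpenSearchVectorSearch
from langchain_text_splitters import RecursiveCharacterTextSplitter
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("rag-ingest").getOrCreate()
docs_df = spark.read.text("s3://example-bucket/corpus/")  # hypothetical corpus

splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)

def index_partition(rows):
    # Each partition builds its own clients so the work runs on the executors
    embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v2:0")
    chunks = [c for row in rows for c in splitter.split_text(row.value)]
    if chunks:
        OpenSearchVectorSearch.from_texts(
            chunks, embeddings,
            opensearch_url="https://example-domain.us-east-1.es.amazonaws.com",
            index_name="rag-chunks",  # auth configuration omitted for brevity
        )
    return iter([])

docs_df.rdd.mapPartitions(index_partition).count()  # trigger the ingestion job
```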
LLM experimentation at scale using Amazon SageMaker Pipelines and MLflow
Large language models (LLMs) have achieved remarkable success in various natural language processing (NLP) tasks, but they may not always generalize well to specific domains or tasks. You may need to customize an LLM to adapt to your unique use case, improving its performance on your specific dataset or task. You can customize the model […]
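As one hedged example of the experiment tracking this pattern relies on, the snippet below logs fine-tuning parameters and a metric to MLflow, as a SageMaker Pipelines step might. The tracking server URI, parameter values, and metric are placeholders for illustration.

```python
# Minimal sketch: log an LLM customization run to an MLflow tracking server.
# The tracking URI and all values below are hypothetical placeholders.
import mlflow

mlflow.set_tracking_uri("arn:aws:sagemaker:us-east-1:111122223333:mlflow-tracking-server/example")
mlflow.set_experiment("llm-customization")

with mlflow.start_run(run_name="llama3-finetune-lr-2e-5"):
    mlflow.log_params({"base_model": "meta-llama/Meta-Llama-3-8B",
                       "learning_rate": 2e-5, "epochs": 1})
    mlflow.log_metric("eval_loss", 1.87)  # placeholder value
```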