AWS Machine Learning Blog
Category: Technical How-to
How Formula 1® uses generative AI to accelerate race-day issue resolution
In this post, we explain how F1 and AWS have developed a root cause analysis (RCA) assistant powered by Amazon Bedrock to reduce manual intervention and accelerate the resolution of recurrent operational issues during races from weeks to minutes. The RCA assistant enables the F1 team to spend more time on innovation and improving its services, ultimately delivering an exceptional experience for fans and partners. The successful collaboration between F1 and AWS showcases the transformative potential of generative AI in empowering teams to accomplish more in less time.
Using Amazon Rekognition to improve bicycle safety
To better protect themselves, many cyclists are starting to ride with cameras mounted to the front or back of their bicycle. In this blog post, I will demonstrate a machine learning solution that cyclists can use to better identify close calls. The architecture of the solution uses Amazon Rekognition to detect vehicles in recorded bike ride videos. It then analyzes the video to determine if any vehicles are passing too close to the cyclist, within the 3-foot safe distance required by law. The solution automatically generates video clips of these dangerous passing events, which can then be shared with authorities to help improve cyclist safety.
Use language embeddings for zero-shot classification and semantic search with Amazon Bedrock
In this post, we explore what language embeddings are and how they can be used to enhance your application. We show how, by using the properties of embeddings, we can implement a real-time zero-shot classifier and can add powerful features such as semantic search.
Fine-tune LLMs with synthetic data for context-based Q&A using Amazon Bedrock
In this post, we explore how to use Amazon Bedrock to generate synthetic training data to fine-tune an LLM. Additionally, we provide concrete evaluation results that showcase the power of synthetic data in fine-tuning when data is scarce.
Achieve ~2x speed-up in LLM inference with Medusa-1 on Amazon SageMaker AI
Researchers developed Medusa, a framework to speed up LLM inference by adding extra heads to predict multiple tokens simultaneously. This post demonstrates how to use Medusa-1, the first version of the framework, to speed up an LLM by fine-tuning it on Amazon SageMaker AI and confirms the speed up with deployment and a simple load test. Medusa-1 achieves an inference speedup of around two times without sacrificing model quality, with the exact improvement varying based on model size and data used. In this post, we demonstrate its effectiveness with a 1.8 times speedup observed on a sample dataset.
Amazon Q Business simplifies integration of enterprise knowledge bases at scale
In this post, we demonstrate how to build a knowledge base solution by integrating enterprise data with Amazon Q Business using Amazon S3. This approach helps organizations improve operational efficiency, reduce response times, and gain valuable insights from their historical data. The solution uses AWS security best practices to promote data protection while enabling teams to create a comprehensive knowledge base from various data sources.
Faster distributed graph neural network training with GraphStorm v0.4
GraphStorm is a low-code enterprise graph machine learning (ML) framework that provides ML practitioners a simple way of building, training, and deploying graph ML solutions on industry-scale graph data. In this post, we demonstrate how GraphBolt enhances GraphStorm’s performance in distributed settings. We provide a hands-on example of using GraphStorm with GraphBolt on SageMaker for distributed training. Lastly, we share how to use Amazon SageMaker Pipelines with GraphStorm.
Automate bulk image editing with Crop.photo and Amazon Rekognition
In this post, we explore how Crop.photo uses Amazon Rekognition to provide sophisticated image analysis, enabling automated and precise editing of large volumes of images. This integration streamlines the image editing process for clients, providing speed and accuracy, which is crucial in the fast-paced environments of ecommerce and sports.
Governing the ML lifecycle at scale, Part 4: Scaling MLOps with security and governance controls
This post provides detailed steps for setting up the key components of a multi-account ML platform. This includes configuring the ML Shared Services Account, which manages the central templates, model registry, and deployment pipelines; sharing the ML Admin and SageMaker Projects Portfolios from the central Service Catalog; and setting up the individual ML Development Accounts where data scientists can build and train models.
Fine-tune and host SDXL models cost-effectively with AWS Inferentia2
As technology continues to evolve, newer models are emerging, offering higher quality, increased flexibility, and faster image generation capabilities. One such groundbreaking model is Stable Diffusion XL (SDXL), released by StabilityAI, advancing the text-to-image generative AI technology to unprecedented heights. In this post, we demonstrate how to efficiently fine-tune the SDXL model using SageMaker Studio. We show how to then prepare the fine-tuned model to run on AWS Inferentia2 powered Amazon EC2 Inf2 instances, unlocking superior price performance for your inference workloads.