AWS Machine Learning Blog

Empower your generative AI application with a comprehensive custom observability solution

In this post, we set up a custom solution for observability and evaluation of Amazon Bedrock applications. Through code examples and step-by-step guidance, we demonstrate how you can integrate this solution into your Amazon Bedrock application to gain visibility, control, and continual improvement for your generative AI applications.

Automate Amazon Bedrock batch inference: Building a scalable and efficient pipeline

Although batch inference offers numerous benefits, it’s limited to 10 batch inference jobs submitted per model per Region. To address this limit and enhance your use of batch inference, we’ve developed a scalable solution using AWS Lambda and Amazon DynamoDB. This post guides you through implementing a queue management system that automatically monitors available job slots and submits new jobs as slots become available.
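
The queueing idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation from the post: the `BatchJobQueue` class, its method names, and the in-memory `pending` and `running` collections are all hypothetical stand-ins for the DynamoDB table and Lambda function, and the `submit` callback is a placeholder for whatever call actually starts a batch inference job.

```python
from collections import deque
from dataclasses import dataclass, field

# Limit quoted in the post: 10 batch inference jobs per model per Region.
MAX_CONCURRENT_JOBS = 10


@dataclass
class BatchJobQueue:
    """Hypothetical in-memory stand-in for a DynamoDB-backed job queue."""

    max_slots: int = MAX_CONCURRENT_JOBS
    pending: deque = field(default_factory=deque)  # jobs waiting for a slot
    running: set = field(default_factory=set)      # jobs occupying a slot

    def enqueue(self, job_id: str) -> None:
        """Add a job to the back of the waiting queue."""
        self.pending.append(job_id)

    def mark_finished(self, job_id: str) -> None:
        """Free the slot held by a completed (or failed) job."""
        self.running.discard(job_id)

    def available_slots(self) -> int:
        """Number of jobs that can still be started right now."""
        return self.max_slots - len(self.running)

    def submit_ready_jobs(self, submit) -> list:
        """Start queued jobs, oldest first, until the slots run out.

        `submit` is an assumed callback that starts one job by ID.
        """
        submitted = []
        while self.pending and self.available_slots() > 0:
            job_id = self.pending.popleft()
            submit(job_id)
            self.running.add(job_id)
            submitted.append(job_id)
        return submitted
```

In a deployed version, a scheduled function would presumably call something like `submit_ready_jobs` on each run, after refreshing the set of running jobs from the service, so that a waiting job starts as soon as a slot frees up.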

Build a video insights and summarization engine using generative AI with Amazon Bedrock

This post presents a solution where you can upload a recording of your meeting (a feature available in most modern digital communication services such as Amazon Chime) to a centralized video insights and summarization engine. This engine uses artificial intelligence (AI) and machine learning (ML) services and generative AI on AWS to extract transcripts, produce a summary, and assess the sentiment of the call. The solution notes the logged actions per individual and provides suggested actions for the uploader. All of this data is centralized and can be used to improve metrics in scenarios such as sales or call centers.

Automate document processing with Amazon Bedrock Prompt Flows (preview)

This post demonstrates how to build an intelligent document processing (IDP) pipeline for automatically extracting and processing data from documents using Amazon Bedrock Prompt Flows, a fully managed service that enables you to build generative AI workflows using Amazon Bedrock and other services in an intuitive visual builder. Amazon Bedrock Prompt Flows allows you to quickly update your pipelines as your business changes, scaling your document processing workflows to help meet evolving demands.

Governing the ML lifecycle at scale: Centralized observability with Amazon SageMaker and Amazon CloudWatch

This post is part of an ongoing series on governing the machine learning (ML) lifecycle at scale. To start from the beginning, refer to Governing the ML lifecycle at scale, Part 1: A framework for architecting ML workloads using Amazon SageMaker. A multi-account strategy is essential not only for improving governance but also for enhancing […]

Import data from Google Cloud Platform BigQuery for no-code machine learning with Amazon SageMaker Canvas

This post presents an architectural approach to extract data from different cloud environments, such as Google Cloud Platform (GCP) BigQuery, without the need for data movement. This minimizes the complexity and overhead associated with moving data between cloud environments, enabling organizations to access and utilize their disparate data assets for ML projects. We highlight the process of using Amazon Athena Federated Query to extract data from GCP BigQuery, using Amazon SageMaker Data Wrangler to perform data preparation, and then using the prepared data to build ML models within Amazon SageMaker Canvas, a no-code ML interface.

Customized model monitoring for near real-time batch inference with Amazon SageMaker

In this post, we present a framework to customize the use of Amazon SageMaker Model Monitor for handling multi-payload inference requests for near real-time inference scenarios. SageMaker Model Monitor monitors the quality of SageMaker ML models in production. Early and proactive detection of deviations in model quality enables you to take corrective actions, such as retraining models, auditing upstream systems, or fixing quality issues without having to monitor models manually or build additional tooling.

Super charge your LLMs with RAG at scale using AWS Glue for Apache Spark

In this post, we will explore building a reusable RAG data pipeline on LangChain—an open source framework for building applications based on LLMs—and integrating it with AWS Glue and Amazon OpenSearch Serverless. The end solution is a reference architecture for scalable RAG indexing and deployment.

From RAG to fabric: Lessons learned from building real-world RAGs at GenAIIC – Part 1

In this post, we cover the core concepts behind RAG architectures and discuss strategies for evaluating RAG performance, both quantitatively through metrics and qualitatively by analyzing individual outputs. We outline several practical tips for improving text retrieval, including using hybrid search techniques, enhancing context through data preprocessing, and rewriting queries for better relevance.