AWS Compute Blog

Category: Amazon Bedrock

Orchestrating large-scale document processing with AWS Step Functions and Amazon Bedrock batch inference

Organizations often have large volumes of documents containing valuable information that remains locked away and unsearchable. This solution addresses the need for a scalable, automated text extraction and knowledge base pipeline that transforms static document collections into intelligent, searchable repositories for generative AI applications.

Serverless strategies for streaming LLM responses

Modern generative AI applications often need to stream large language model (LLM) outputs to users in real-time. Instead of waiting for a complete response, streaming delivers partial results as they become available, which significantly improves the user experience for chat interfaces and long-running AI tasks. This post compares three serverless approaches to handle Amazon Bedrock LLM streaming on Amazon Web Services (AWS), which helps you choose the best fit for your application.

Effectively building AI agents on AWS Serverless

Imagine an AI assistant that doesn’t just respond to prompts – it reasons through goals, acts, and integrates with real-time systems. This is the promise of agentic AI. According to Gartner, by 2028 over 33% of enterprise applications will embed agentic capabilities – up from less than 1% today. While early generative AI efforts focused […]

Orchestrating document processing with AWS AppSync Events and Amazon Bedrock

Many organizations implement intelligent document processing pipelines in order to extract meaningful insights from an increasing volume of unstructured content (such as insurance claims, loan applications and more). Traditionally, these pipelines require significant engineering efforts, as the implementation often involves using several machine learning (ML) models and orchestrating complex workflows. As organizations integrate these pipelines […]

Optimizing ODCR usage through AI-powered capacity insights

Efficient resource management is crucial for organizations seeking to optimize cloud costs while making sure of seamless access to compute capacity. Amazon EC2 On-Demand Capacity Reservations (ODCRs) provide the flexibility to reserve compute capacity within a specific Availability Zone (AZ) for any duration. In this post, we demonstrate how Amazon Bedrock Agents can help organizations gain actionable insights into ODCR usage across their AWS environment.

Serverless calendar Q1 2025

Serverless ICYMI 2025 Q1

Welcome to the 28th edition of the AWS Serverless ICYMI (in case you missed it) quarterly recap. At the end of a quarter, we share the most recent product launches, feature enhancements, blog posts, videos, live streams, and other interesting things that you might have missed! In case you missed our last ICYMI, check out […]

Image of Transition from JSONPath to JSONata.

Simplifying developer experience with variables and JSONata in AWS Step Functions

This post is written by Uma Ramadoss, Principal Specialist SA, Serverless and Dhiraj Mahapatro, Principal Specialist SA, Amazon Bedrock AWS Step Functions is introducing variables and JSONata data transformations. Variables allow developers to assign data in one state and reference it in any subsequent steps, simplifying state payload management without the need to pass data […]

re:Invent Banner

The serverless attendee’s guide to AWS re:Invent 2024

AWS re:Invent 2024 offers an extensive selection of serverless and application integration content. AWS re:Invent Banner For detailed descriptions and schedule, visit the AWS re:Invent Session Catalog. Join AWS serverless experts and community members at the AWS Modern Apps and Open Source Zone in the AWS Expo Village. This serves as a hub for serverless […]

Architecture diagram showing AWS Lambda invoking Amazon Bedrock using the InvokeModel API call.

Designing Serverless Integration Patterns for Large Language Models (LLMs)

This post is written by Josh Hart, Principal Solutions Architect and Thomas Moore, Senior Solutions Architect This post explores best practice integration patterns for using large language models (LLMs) in serverless applications. These approaches optimize performance, resource utilization, and resilience when incorporating generative AI capabilities into your serverless architecture. Overview of serverless, LLMs and example […]

Calendar

Serverless ICYMI Q2 2024

Welcome to the 26th edition of the AWS Serverless ICYMI (in case you missed it) quarterly recap. Every quarter, we share all the most recent product launches, feature enhancements, blog posts, webinars, live streams, and other interesting things that you might have missed! In case you missed our last ICYMI, check out what happened last […]