Artificial Intelligence | AWS Compute Blog

Building responsive APIs with Amazon API Gateway response streaming

Today, AWS announced support for response streaming in Amazon API Gateway to significantly improve the responsiveness of your REST APIs by progressively streaming response payloads back to the client. With this new capability, you can use streamed responses to enhance user experience when building LLM-driven applications (such as AI agents and chatbots), improve time-to-first-byte (TTFB) performance for web and mobile applications, stream large files, and perform long-running operations while reporting incremental progress using protocols such as server-sent events (SSE).

Serverless generative AI architectural patterns – Part 1

This two-part series explores the different architectural patterns, best practices, code implementations, and design considerations essential for successfully integrating generative AI solutions into both new and existing applications. In this post, we focus on patterns applicable for architecting real-time generative AI applications.

Effectively building AI agents on AWS Serverless

Imagine an AI assistant that doesn’t just respond to prompts – it reasons through goals, acts, and integrates with real-time systems. This is the promise of agentic AI. According to Gartner, by 2028 over 33% of enterprise applications will embed agentic capabilities – up from less than 1% today. While early generative AI efforts focused […]

Orchestrating document processing with AWS AppSync Events and Amazon Bedrock

Many organizations implement intelligent document processing pipelines in order to extract meaningful insights from an increasing volume of unstructured content (such as insurance claims, loan applications and more). Traditionally, these pipelines require significant engineering efforts, as the implementation often involves using several machine learning (ML) models and orchestrating complex workflows. As organizations integrate these pipelines […]

Optimizing ODCR usage through AI-powered capacity insights

Efficient resource management is crucial for organizations seeking to optimize cloud costs while making sure of seamless access to compute capacity. Amazon EC2 On-Demand Capacity Reservations (ODCRs) provide the flexibility to reserve compute capacity within a specific Availability Zone (AZ) for any duration. In this post, we demonstrate how Amazon Bedrock Agents can help organizations gain actionable insights into ODCR usage across their AWS environment.

Serverless ICYMI 2025 Q1

Welcome to the 28th edition of the AWS Serverless ICYMI (in case you missed it) quarterly recap. At the end of a quarter, we share the most recent product launches, feature enhancements, blog posts, videos, live streams, and other interesting things that you might have missed! In case you missed our last ICYMI, check out […]

Image of Transition from JSONPath to JSONata.

Simplifying developer experience with variables and JSONata in AWS Step Functions

This post is written by Uma Ramadoss, Principal Specialist SA, Serverless and Dhiraj Mahapatro, Principal Specialist SA, Amazon Bedrock AWS Step Functions is introducing variables and JSONata data transformations. Variables allow developers to assign data in one state and reference it in any subsequent steps, simplifying state payload management without the need to pass data […]

The serverless attendee’s guide to AWS re:Invent 2024

AWS re:Invent 2024 offers an extensive selection of serverless and application integration content. AWS re:Invent Banner For detailed descriptions and schedule, visit the AWS re:Invent Session Catalog. Join AWS serverless experts and community members at the AWS Modern Apps and Open Source Zone in the AWS Expo Village. This serves as a hub for serverless […]

Architecture diagram showing AWS Lambda invoking Amazon Bedrock using the InvokeModel API call.

Designing Serverless Integration Patterns for Large Language Models (LLMs)

This post is written by Josh Hart, Principal Solutions Architect and Thomas Moore, Senior Solutions Architect This post explores best practice integration patterns for using large language models (LLMs) in serverless applications. These approaches optimize performance, resource utilization, and resilience when incorporating generative AI capabilities into your serverless architecture. Overview of serverless, LLMs and example […]

Serverless ICYMI Q2 2024

Welcome to the 26th edition of the AWS Serverless ICYMI (in case you missed it) quarterly recap. Every quarter, we share all the most recent product launches, feature enhancements, blog posts, webinars, live streams, and other interesting things that you might have missed! In case you missed our last ICYMI, check out what happened last […]

AWS Compute Blog

Category: Artificial Intelligence