AWS Machine Learning Blog

Category: Generative AI

Use Amazon Bedrock Agents for code scanning, optimization, and remediation

For enterprises in the realm of cloud computing and software development, providing secure code repositories is essential. As sophisticated cybersecurity threats become more prevalent, organizations must adopt proactive measures to protect their assets. Amazon Bedrock offers a powerful solution by automating the process of scanning repositories for vulnerabilities and remediating them. This post explores how you can use Amazon Bedrock to enhance the security of your repositories and maintain compliance with organizational and regulatory standards.

Deploy Meta Llama 3.1-8B on AWS Inferentia using Amazon EKS and vLLM

In this post, we walk through the steps to deploy the Meta Llama 3.1-8B model on Inferentia 2 instances using Amazon EKS. This solution combines the exceptional performance and cost-effectiveness of Inferentia 2 chips with the robust and flexible landscape of Amazon EKS. Inferentia 2 chips deliver high throughput and low latency inference, ideal for LLMs.

Serving LLMs using vLLM and Amazon EC2 instances with AWS AI chips

The use of large language models (LLMs) and generative AI has exploded over the last year. With the release of powerful publicly available foundation models, tools for training, fine tuning and hosting your own LLM have also become democratized. Using vLLM on AWS Trainium and Inferentia makes it possible to host LLMs for high performance […]

Illustration of Semantic Cache

Build a read-through semantic cache with Amazon OpenSearch Serverless and Amazon Bedrock

This post presents a strategy for optimizing LLM-based applications. Given the increasing need for efficient and cost-effective AI solutions, we present a serverless read-through caching blueprint that uses repeated data patterns. With this cache, developers can effectively save and access similar prompts, thereby enhancing their systems’ efficiency and response times.

Accelerating Mixtral MoE fine-tuning on Amazon SageMaker with QLoRA

In this post, we demonstrate how you can address the challenges of model customization being complex, time-consuming, and often expensive by using fully managed environment with Amazon SageMaker Training jobs to fine-tune the Mixtral 8x7B model using PyTorch Fully Sharded Data Parallel (FSDP) and Quantized Low Rank Adaptation (QLoRA).

Amazon SageMaker Inference now supports G6e instances

G6e instances on SageMaker unlock the ability to deploy a wide variety of open source models cost-effectively. With superior memory capacity, enhanced performance, and cost-effectiveness, these instances represent a compelling solution for organizations looking to deploy and scale their AI applications. The ability to handle larger models, support longer context lengths, and maintain high throughput makes G6e instances particularly valuable for modern AI applications.

Orchestrate generative AI workflows with Amazon Bedrock and AWS Step Functions

This post discusses how to use AWS Step Functions to efficiently coordinate multi-step generative AI workflows, such as parallelizing API calls to Amazon Bedrock to quickly gather answers to lists of submitted questions. We also touch on the usage of Retrieval Augmented Generation (RAG) to optimize outputs and provide an extra layer of precision, as well as other possible integrations through Step Functions.

Build generative AI applications on Amazon Bedrock with the AWS SDK for Python (Boto3)

In this post, we demonstrate how to use Amazon Bedrock with the AWS SDK for Python (Boto3) to programmatically incorporate FMs. We explore invoking a specific FM and processing the generated text, showcasing the potential for developers to use these models in their applications for a variety of use cases

Amazon Bedrock Flows is now generally available with enhanced safety and traceability

Today, we are excited to announce the general availability of Amazon Bedrock Flows (previously known as Prompt Flows). With Bedrock Flows, you can quickly build and execute complex generative AI workflows without writing code. Bedrock Flows makes it easier for developers and businesses to harness the power of generative AI, enabling you to create more sophisticated and efficient AI-driven solutions for your customers.