AWS Machine Learning Blog
Category: Generative AI
Use LangChain with PySpark to process documents at massive scale with Amazon SageMaker Studio and Amazon EMR Serverless
In this post, we explore how to build a scalable and efficient Retrieval Augmented Generation (RAG) system using the new EMR Serverless integration, Spark’s distributed processing, and an Amazon OpenSearch Service vector database powered by the LangChain orchestration framework. This solution enables you to process massive volumes of textual data, generate relevant embeddings, and store them in a powerful vector database for seamless retrieval and generation.
Best practices for prompt engineering with Meta Llama 3 for Text-to-SQL use cases
In this post, we explore a solution that uses the vector engine ChromaDB and Meta Llama 3, a publicly available foundation model hosted on SageMaker JumpStart, for a Text-to-SQL use case. We shared a brief history of Meta Llama 3, best practices for prompt engineering with Meta Llama 3 models, and an architecture pattern using few-shot prompting and RAG to extract the relevant schemas stored as vectors in ChromaDB.
Implementing advanced prompt engineering with Amazon Bedrock
In this post, we provide insights and practical examples to help balance and optimize the prompt engineering workflow. We focus on advanced prompt techniques and best practices for the models provided in Amazon Bedrock, a fully managed service that offers a choice of high-performing foundation models from leading AI companies such as Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon through a single API. With these prompting techniques, developers and researchers can harness the full capabilities of Amazon Bedrock, providing clear and concise communication while mitigating potential risks or undesirable outputs.
Accelerate Generative AI Inference with NVIDIA NIM Microservices on Amazon SageMaker
In this post, we provide a walkthrough of how customers can use generative artificial intelligence (AI) models and LLMs using NVIDIA NIM integration with SageMaker. We demonstrate how this integration works and how you can deploy these state-of-the-art models on SageMaker, optimizing their performance and cost.
Connect the Amazon Q Business generative AI coding companion to your GitHub repositories with Amazon Q GitHub (Cloud) connector
In this post, we show you how to perform natural language queries over the indexed GitHub (Cloud) data using the AI-powered chat interface provided by Amazon Q Business. We also cover how Amazon Q Business applies access control lists (ACLs) associated with the indexed documents to provide permissions-filtered responses.
Build an end-to-end RAG solution using Knowledge Bases for Amazon Bedrock and the AWS CDK
In this post, we demonstrate how to seamlessly automate the deployment of an end-to-end RAG solution using Knowledge Bases for Amazon Bedrock and the AWS Cloud Development Kit (AWS CDK), enabling organizations to quickly set up a powerful question answering system.
Index website contents using the Amazon Q Web Crawler connector for Amazon Q Business
In this post, we demonstrate how to create an Amazon Q Business application and index website contents using the Amazon Q Web Crawler connector for Amazon Q Business. We use two data sources (websites) here. The first data source is an employee onboarding guide from a fictitious company, which requires basic authentication. We demonstrate how to set up authentication for the Web Crawler. The second data source is the official documentation for Amazon Q Business. For this data source, we demonstrate how to apply advanced settings to instruct the Web Crawler to crawl only pages and links related to Amazon Q Business.
Getting started with cross-region inference in Amazon Bedrock
Today, we are happy to announce the general availability of cross-region inference, a powerful feature allowing automatic cross-region inference routing for requests coming to Amazon Bedrock. This offers developers using on-demand inference mode, a seamless solution for managing optimal availability, performance, and resiliency while managing incoming traffic spikes of applications powered by Amazon Bedrock. By opting in, developers no longer have to spend time and effort predicting demand fluctuations.
Secure RAG applications using prompt engineering on Amazon Bedrock
In this post, we discuss existing prompt-level threats and outline several security guardrails for mitigating prompt-level threats. For our example, we work with Anthropic Claude on Amazon Bedrock, implementing prompt templates that allow us to enforce guardrails against common security threats such as prompt injection. These templates are compatible with and can be modified for other LLMs.
Get the most from Amazon Titan Text Premier
In this post, we introduce the new Amazon Titan Text Premier model, specifically optimized for enterprise use cases, such as building Retrieval Augmented Generation (RAG) and agent-based applications. Such integrations enable advanced applications like building interactive AI assistants that use enterprise APIs and interact with your propriety documents.