AWS Machine Learning Blog

Introducing Amazon Kendra GenAI Index – Enhanced semantic search and retrieval capabilities

Amazon has introduced the Amazon Kendra GenAI Index, a new offering designed to enhance semantic search and retrieval capabilities for enterprise AI applications. This index is optimized for Retrieval Augmented Generation (RAG) and intelligent search, allowing businesses to build more effective digital assistants and search experiences.

Building Generative AI and ML solutions faster with AI apps from AWS partners using Amazon SageMaker

Today, we’re excited to announce that AI apps from AWS Partners are now available in SageMaker. You can now find, deploy, and use these AI apps privately and securely, all without leaving SageMaker AI, so you can develop performant AI models faster.

Query structured data from Amazon Q Business using Amazon QuickSight integration

In this post, we show how Amazon Q Business integrates with QuickSight to enable users to query both structured and unstructured data in a unified way. The integration allows users to connect to over 20 structured data sources like Amazon Redshift and PostgreSQL, while getting real-time answers with visualizations. Amazon Q Business combines information from structured sources through QuickSight with unstructured content to provide comprehensive answers to user queries.

Elevate customer experience by using the Amazon Q Business custom plugin for New Relic AI

The New Relic AI custom plugin for Amazon Q Business creates a unified solution that combines New Relic AI’s observability insights and recommendations with Amazon Q Business’s Retrieval Augmented Generation (RAG) capabilities and a natural language interface for ease of use. This post explores the use case, how this custom plugin works, how it can be enabled, and how it can help elevate customers’ digital experiences.

Amazon SageMaker launches the updated inference optimization toolkit for generative AI

Today, Amazon SageMaker is excited to announce updates to the inference optimization toolkit, providing new functionality and enhancements to help you optimize generative AI models even faster. In this post, we discuss these new features of the toolkit in more detail.

Syngenta develops a generative AI assistant to support sales representatives using Amazon Bedrock Agents

In this post, we explore how Syngenta collaborated with AWS to develop Cropwise AI, a generative AI assistant powered by Amazon Bedrock Agents that helps sales representatives make better seed product recommendations to farmers across North America. The solution transforms the seed selection process by simplifying complex data into natural conversations, providing quick access to detailed seed product information, and enabling personalized recommendations at scale through a mobile app interface.

Speed up your AI inference workloads with new NVIDIA-powered capabilities in Amazon SageMaker

At re:Invent 2024, we are excited to announce new capabilities to speed up your AI inference workloads with NVIDIA accelerated computing and software offerings on Amazon SageMaker. In this post, we will explore how you can use these new capabilities to enhance your AI inference on Amazon SageMaker. We’ll walk through the process of deploying NVIDIA NIM microservices from AWS Marketplace for SageMaker Inference. We’ll then dive into NVIDIA’s model offerings on SageMaker JumpStart, showcasing how to access and deploy the Nemotron-4 model directly in the JumpStart interface. This will include step-by-step instructions on how to find the Nemotron-4 model in the JumpStart catalog, select it for your use case, and deploy it with a few clicks.

Unlock cost savings with the new scale down to zero feature in SageMaker Inference

Today at AWS re:Invent 2024, we are excited to announce a new feature for Amazon SageMaker inference endpoints: the ability to scale them down to zero instances. This long-awaited capability is a game changer for our customers using the power of AI and machine learning (ML) inference in the cloud.

Supercharge your auto scaling for generative AI inference – Introducing Container Caching in SageMaker Inference

Today at AWS re:Invent 2024, we are excited to announce the new Container Caching capability in Amazon SageMaker, which significantly reduces the time required to scale generative AI models for inference. This innovation allows you to scale your models faster, with up to a 56% reduction in latency when scaling a new model copy and up to a 30% reduction when adding a model copy on a new instance. In this post, we explore the new Container Caching feature for SageMaker inference, addressing the challenges of deploying and scaling large language models (LLMs).

Introducing Fast Model Loader in SageMaker Inference: Accelerate autoscaling for your Large Language Models (LLMs) – part 1

Today at AWS re:Invent 2024, we are excited to announce a new capability in Amazon SageMaker Inference that significantly reduces the time required to deploy and scale LLMs for inference using LMI: Fast Model Loader. In this post, we delve into the technical details of Fast Model Loader, explore its integration with existing SageMaker workflows, discuss how you can get started with this powerful new feature, and share customer success stories.