AWS Machine Learning Blog

Build generative AI applications quickly with Amazon Bedrock IDE in Amazon SageMaker Unified Studio

Building generative AI applications presents significant challenges for organizations: they require specialized ML expertise, complex infrastructure management, and careful orchestration of multiple services. To address these challenges, we introduce Amazon Bedrock IDE, an integrated environment for developing and customizing generative AI applications. Formerly known as Amazon Bedrock Studio, Amazon Bedrock IDE is now incorporated into Amazon SageMaker Unified Studio (currently in preview). SageMaker Unified Studio combines various AWS services, including Amazon Bedrock, Amazon SageMaker, Amazon Redshift, AWS Glue, Amazon Athena, and Amazon Managed Workflows for Apache Airflow (MWAA), into a comprehensive data and AI development platform. In this blog post, we’ll focus on Amazon Bedrock IDE and its generative AI capabilities within the Amazon SageMaker Unified Studio environment.

Consider a global retail site operating across multiple regions and countries. Its sales analysts face a daily challenge: they need to make data-driven decisions but are overwhelmed by the volume of available information. They have structured data such as sales transactions and revenue metrics stored in databases, alongside unstructured data such as customer reviews and marketing reports collected from various channels. Without specialized structured query language (SQL) knowledge or Retrieval Augmented Generation (RAG) expertise, these analysts struggle to combine insights effectively from both sources.

In this post, we’ll show how anyone in your company can use Amazon Bedrock IDE to quickly create a generative AI chat agent application that analyzes sales performance data. Through simple conversations, business teams can use the chat agent to extract valuable insights from both structured and unstructured data sources without writing code or managing complex data pipelines. The following diagram illustrates the conceptual architecture of an AI assistant with Amazon Bedrock IDE.

SageMaker Unified Studio simple architecture diagram

Solution overview

The AI chat agent application combines structured and unstructured data analysis through Amazon Bedrock IDE:

  • For structured data: connects to sales records in Amazon Athena, translating natural language into SQL queries
  • For unstructured data: uses Amazon Titan Text Embeddings and Amazon OpenSearch Serverless to enable semantic search across customer reviews and marketing reports

The Amazon Bedrock IDE interface seamlessly combines results from both sources, delivering comprehensive insights without requiring users to understand the underlying data structures or query languages. The following figure illustrates the workflow from initial user interaction to final response. For more details on the user interaction flow, check out our associated GitHub repository.

Solution architecture

Bedrock IDE architecture diagram

The architecture in the preceding figure shows how Amazon Bedrock IDE orchestrates the data flow. When users pose questions through the natural language interface, the chat agent determines whether to query the structured data in Amazon Athena through the Amazon Bedrock IDE function, search the Amazon Bedrock knowledge base, or combine both sources for comprehensive insights. This approach enables sales, marketing, product, and supply chain teams to make data-driven decisions efficiently, regardless of their technical expertise. For example, by the end of this tutorial, you will be able to query the data with prompts such as “Can you return our five top selling products this quarter and the principal customer complaints for each?” or “Were there any supply chain issues that could have affected our North American market for clothing sales?”

In the following sections, we’ll guide you through setting up your SageMaker Unified Studio project, creating your knowledge base, building the natural language query interface, and testing the solution.

SageMaker Unified Studio setup

SageMaker Unified Studio is a browser-based web application where you can use all your data and tools for analytics and AI. SageMaker Unified Studio can authenticate you with your AWS Identity and Access Management (IAM) credentials, credentials from your identity provider through the AWS IAM Identity Center, or with your SAML credentials.

You can obtain the SageMaker Unified Studio URL for your domains by accessing the AWS Management Console for Amazon DataZone. Follow the steps in the Administrator Guide to set up your SageMaker Unified Studio.

Building a generative AI application

SageMaker Unified Studio offers tools to discover and build with generative AI. To get started, you need to build a project.

  1. Open SageMaker Unified Studio and choose Generative AI playground at the top of the page.

SageMaker Unified Studio simple landing page

  2. Here, you can explore, experiment with, and compare various foundation models (FMs) through a chat interface.

Bedrock IDE - Generative AI playground

Similarly, you can explore image and video models with the Image & video playground.

  3. To begin creating your chat agent, choose Build chat agent in the chat playground window. You will now create a new project before building your app. Choose Create project.

Build chat agent

  4. Enter a project name. Next, select Generative AI application development from the available profiles. This profile includes all the necessary elements for working with Amazon Bedrock components in your generative AI application development. Choose Continue.

Bedrock IDE - Create project view

  5. On the next screen, leave all settings at their default values. Choose Continue to move to the next screen and choose the Create Project button to initiate the project creation process. The system will take a few minutes to set up your project.

Bedrock IDE - Create project view confirmation

After you’ve created your project, you can begin building your generative AI application.

Prerequisites

Before creating your application in Amazon Bedrock IDE, you’ll need to set up a few resources in your AWS account. This will provision the backend infrastructure and services that the sales analytics application will rely on. This includes setting up Amazon API Gateway, AWS Lambda functions, and Amazon Athena to enable querying the structured sales data.

  1. Deploy the required AWS resources:
    1. Launch the AWS CloudFormation stack in your preferred AWS Region:
    2. After the stack is deployed, note down the API Gateway URL value from the CloudFormation outputs tab: TextToSqlEngineAPIGatewayURL.
    3. Navigate to the AWS Secrets Manager console and find the secret <StackName>-api-keys. Choose Retrieve secret and copy the apiKey value from the plaintext string {"clientId":"default","allowedOperations":["query"],"apiKey":"xxxxxxxx"}.

You’ll need these values when setting up your Amazon Bedrock IDE function later.

  2. Download all three sample data files. These files contain synthetic data generated by a generative AI model, including customer reviews, customer survey responses, and world news that you’ll use to build your knowledge base:
  3. Download the API configuration: openapi_schema.json. You’ll use this file when setting up your function to query sales data.

That’s it! With these resources ready, you can create your sales analytics application. Each subsequent section will guide you through exactly when and how to use these files.
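If you prefer to collect the two values from the first step with a script instead of the console, the following is a minimal sketch under stated assumptions. It assumes boto3 is installed, that your credentials can read the stack and the secret, and that the secret is named <StackName>-api-keys as described earlier. The helper names (get_stack_output, extract_api_key, fetch_values) are ours, not part of the sample.

```python
import json


def get_stack_output(describe_stacks_response: dict, output_key: str) -> str:
    """Return one output value from a CloudFormation describe_stacks response."""
    outputs = describe_stacks_response["Stacks"][0].get("Outputs", [])
    for output in outputs:
        if output["OutputKey"] == output_key:
            return output["OutputValue"]
    raise KeyError(f"Stack output {output_key!r} not found")


def extract_api_key(secret_string: str) -> str:
    """Parse the secret's plaintext JSON string and return its apiKey field."""
    return json.loads(secret_string)["apiKey"]


def fetch_values(stack_name: str) -> tuple[str, str]:
    """Fetch the API URL and API key from AWS (needs boto3 and credentials)."""
    import boto3  # imported lazily so the two helpers above work without it

    cfn = boto3.client("cloudformation")
    secrets = boto3.client("secretsmanager")
    api_url = get_stack_output(
        cfn.describe_stacks(StackName=stack_name),
        "TextToSqlEngineAPIGatewayURL",
    )
    secret = secrets.get_secret_value(SecretId=f"{stack_name}-api-keys")
    return api_url, extract_api_key(secret["SecretString"])


if __name__ == "__main__":
    # The plaintext shape shown in the Secrets Manager step above.
    sample = '{"clientId":"default","allowedOperations":["query"],"apiKey":"xxxxxxxx"}'
    print(extract_api_key(sample))
```

The two parsing helpers are kept separate from the AWS calls so you can check them against sample data before pointing fetch_values at a real stack.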

Instructions configuration for the chat agent

Go to the Amazon Bedrock IDE chat agent application. Select a model from the dropdown (you can change it later, but make sure that it supports data and functions). In the chat agent instructions field, enter:

You are a Sales Analytics agent with access to sales data in the "sales" database, table "sales_records". Your tasks include analyzing metrics, providing sales insights, and answering data questions.
Table Schema:
- region, country: Location data
- item_type: Product category
- sales_channel: Online/Offline
- order_priority: H/M/L/C
- order_date, ship_date: Timing
- order_id: Unique identifier
- units_sold: Quantity
- unit_price, unit_cost: Price metrics
- total_revenue, total_cost, total_profit: Financial metrics.
Use Amazon Athena SQL queries to provide insights. Format responses with:
1. SQL query used
2. Business interpretation
3. Key insights/recommendations
You can also access sales-repo which contains details on product categories, customer reviews, etc.
Error Handling:
- If the user's query cannot be translated into a valid SQL query, or the SQL is invalid or fails to execute, provide a clear and informative error message.

This instruction will guide the AI application to act as a sales analytics agent, providing structured responses based on the given sales data schema in addition to accessing the product reviews and other sales-related data.
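If your table changes over time, you might prefer to generate the schema portion of the instructions rather than edit it by hand. The following is a small sketch of that idea. The SCHEMA dictionary copies the column descriptions from the instructions above, and the function names are hypothetical helpers of ours, not an Amazon Bedrock IDE feature.

```python
# Column descriptions copied from the table schema in the instructions above.
SCHEMA = {
    "region, country": "Location data",
    "item_type": "Product category",
    "sales_channel": "Online/Offline",
    "order_priority": "H/M/L/C",
    "order_date, ship_date": "Timing",
    "order_id": "Unique identifier",
    "units_sold": "Quantity",
    "unit_price, unit_cost": "Price metrics",
    "total_revenue, total_cost, total_profit": "Financial metrics",
}


def build_schema_section(schema: dict[str, str]) -> str:
    """Render the 'Table Schema' lines in the format used by the instructions."""
    lines = ["Table Schema:"]
    lines += [f"- {columns}: {meaning}" for columns, meaning in schema.items()]
    return "\n".join(lines)


def column_names(schema: dict[str, str]) -> list[str]:
    """Flatten the grouped keys into individual column names."""
    names = []
    for columns in schema:
        names.extend(name.strip() for name in columns.split(","))
    return names


if __name__ == "__main__":
    print(build_schema_section(SCHEMA))
    print(len(column_names(SCHEMA)), "columns")
```

You would still paste the rendered text into the chat agent instructions field yourself.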

Chat agent application building view

For this application, you will create two main components: a knowledge base to handle unstructured data, and a function that uses Amazon Athena to query the structured data. These components will work together to process and retrieve information for your generative AI application.

Creating a knowledge base

Knowledge bases enable your application to analyze unstructured data like customer reviews and news stories.

  1. Select the Data section on the current chat agent screen.
  2. Choose Create new Knowledge Base and enter a name for your new knowledge base. You also need to enter a brief description for the chat agent to understand the purpose of this Knowledge Base:

This contains product-specific reviews from users, user feedback gathered via survey, and recent industry and economic news

  3. You have two options for configuring your knowledge base data sources: you can either upload local files or configure a web crawler. The web crawler automatically extracts content from public web pages that you have permission to access. When you add website URLs, it crawls those sites and creates a knowledge base from all the defined URLs, which lets you gather and structure information from multiple web sources efficiently.

For this post, you’ll upload the files containing unstructured data that we mentioned previously (product-reviews.txt, survey-response.txt, and world-news.txt).

  4. Choose Click to upload, and upload the three files. Keep the default parsing settings.
  5. For the embeddings model, select Amazon Titan Text Embeddings V2 or any other embeddings model.
  6. Select OpenSearch Serverless as your vector store.
  7. After you’ve made these selections, choose Create to create your knowledge base.
  8. After the knowledge base creation is finished, select it for your application under the Select Knowledge Base dropdown.

This unstructured data will be used to enhance the knowledge base of your generative AI application, allowing it to process and provide insights based on customer reviews, survey responses, and current news relevant to your business sector.
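To build intuition for what the knowledge base does with these files, the following toy sketch ranks text chunks against a question by cosine similarity. It is not how Amazon Bedrock implements retrieval. The embed function is a stand-in that counts words, whereas a real embeddings model such as Amazon Titan Text Embeddings produces dense vectors that capture meaning, and the sample chunks are made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (a stand-in for a real model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k chunks most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(
        chunks, key=lambda chunk: cosine(query_vector, embed(chunk)), reverse=True
    )
    return ranked[:top_k]


# Made-up snippets standing in for chunks of the uploaded files.
CHUNKS = [
    "The printer paper quality was inconsistent and the adhesive on the notes failed.",
    "Checkout in the mobile app kept logging me out before payment.",
    "A drought reduced vegetable production and disrupted supply chains.",
]

if __name__ == "__main__":
    print(search("paper quality problems", CHUNKS))
```

In the real application, the vector store (OpenSearch Serverless) holds the chunk vectors and performs this nearest-neighbor lookup at scale, and the retrieved chunks are passed to the model as context.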

Create Knowledge Base

To verify that the unstructured data connection is working properly, submit this example prompt into the application. Note that generative AI systems are nondeterministic, so responses will not be the same every time.

Prompt:

What is the customer sentiment surrounding our Office Supplies products?

Expected response:

Based on the customer reviews and feedback, the sentiment surrounding our Office Supplies products is mixed. While there are many positive reviews praising the product selection, store environment, and helpful staff, there are also a significant number of negative reviews citing issues like:

Poor product quality (e.g. flimsy materials, inconsistent paper quality, adhesive issues)

Frustrating in-store experiences (e.g. long wait times, rude staff, messy/disorganized stores)

Problems with online ordering and apps (e.g. filters not working properly, payment issues, logging out repeatedly)

To improve customer sentiment, we should focus on addressing these common pain points through better quality control, more efficient inventory management and staffing, and improving the online/app user experience.

Creating a function

In this section, you will create a function that calls Amazon API Gateway to query the database. API Gateway forwards each request to the Lambda function, which retrieves data from Amazon Simple Storage Service (Amazon S3) and processes SQL queries using Amazon Athena. The AWS infrastructure has already been deployed as part of the CloudFormation template. The structured dataset includes order information for products spanning from 2010 to 2017. This historical data will allow the function to analyze sales trends, product performance, and other relevant metrics over this period. The application will use this function to integrate structured data analysis capabilities, enabling it to provide insights based on concrete sales data alongside the unstructured data from reviews and news that are already incorporated.

  1. In your Amazon Bedrock IDE Chat agent application, expand the Functions section on the screen. Choose Create New Function.
  2. Enter a name for the function and provide a description.
  3. For the function schema, select Import JSON/YAML. Import the API schema from the openapi_schema.json file that you downloaded earlier.
  4. Important: After importing, you need to modify the API endpoint URL in the schema. Replace it with the actual value from the CloudFormation stack output TextToSqlEngineAPIGatewayURL. This step makes sure that your function is correctly linked to the appropriate API endpoint for your application.

Create a function

  5. For the Authentication method, select API Keys (Max. 2 Keys) and enter the following details:
    • Key sent in: Header
    • Key name: x-api-key
    • Key value: enter the AWS Secrets Manager apiKey value.
  6. Choose Create and wait until the function creation is complete.
  7. After the function creation is finished, select it for your application under the Functions dropdown.

Create a function confirmation
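As an alternative to editing the endpoint URL in the schema editor after import, you could patch the file before importing it. The following sketch assumes that openapi_schema.json follows the OpenAPI 3 layout, with a top-level servers list whose first entry holds the endpoint URL. We haven't reproduced the file here, so check that assumption against your copy. The function names are ours.

```python
import json
from pathlib import Path


def set_server_url(schema: dict, api_url: str) -> dict:
    """Return a copy of an OpenAPI 3 schema whose servers list points at api_url."""
    patched = dict(schema)  # shallow copy; the caller's dict is left unchanged
    patched["servers"] = [{"url": api_url}]
    return patched


def patch_schema_file(path: str, api_url: str) -> None:
    """Rewrite the schema file in place with the new server URL."""
    schema_file = Path(path)
    schema = json.loads(schema_file.read_text(encoding="utf-8"))
    patched = set_server_url(schema, api_url)
    schema_file.write_text(json.dumps(patched, indent=2), encoding="utf-8")


# Usage (the URL is the TextToSqlEngineAPIGatewayURL stack output):
#   patch_schema_file("openapi_schema.json", "<TextToSqlEngineAPIGatewayURL value>")
```

Either way, confirm in the function editor that the URL shown matches the CloudFormation output before you choose Create.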

To verify that the structured data connection is working properly, submit the following example query into the application. Note that generative AI systems are nondeterministic, so responses will not be the same every time.

Prompt:

List all the regions that we do business in.

Expected response:

Based on the query to get distinct region values from the sales_records table, the regions where we do business are:
Europe
Australia and Oceania
North America
Central America and the Caribbean
Sub-Saharan Africa
Middle East and North Africa
Asia
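If the function doesn't respond as expected, it can help to test the API outside Amazon Bedrock IDE. The following sketch builds an HTTP request that sends the API key in the x-api-key header, matching the authentication settings above. The POST method, the /query path, and the payload field named query are placeholders that we assume for illustration. Take the real path, method, and request body from openapi_schema.json.

```python
import json
import urllib.request


def build_query_request(
    api_url: str, api_key: str, payload: dict
) -> urllib.request.Request:
    """Build (but do not send) a JSON request authenticated with x-api-key."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=body,
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",  # assumed; confirm against the schema
    )


if __name__ == "__main__":
    request = build_query_request(
        "https://example.invalid/prod/query",  # placeholder URL and path
        "xxxxxxxx",  # the apiKey value from Secrets Manager
        {"query": "SELECT DISTINCT region FROM sales_records"},  # hypothetical body
    )
    print(request.get_method(), request.full_url)
    # To send it: urllib.request.urlopen(request)
```

A 403 response typically points to a missing or incorrect API key, which is worth ruling out before you debug the chat agent itself.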

Sharing your application

After you’ve built your application, you can share it with other users in your organization through SageMaker Unified Studio.

  1. In the application interface, locate and choose Share in the top right corner.
  2. In the sharing dialog, search for users by their alias and choose Invite to include them in the sharing list.
  3. After adding all desired users, copy the application URL from the sharing dialog and send the URL to the added users through your preferred communication channel.

Note: if you turn link sharing on, anyone with the link will be able to subscribe and use the app. If you add their names specifically, only those users can see the app, and it will appear under the “Shared generative AI assets” section for them.

Users must have valid SageMaker Unified Studio access credentials to use the shared application. Contact your AWS administrator if users encounter access issues.

Application sharing view

Examples

The following examples demonstrate how a global retail site uses this solution to transform their sales analytics process and extract valuable insights. Let’s explore three types of queries that demonstrate the power of this approach:

  • Analyzing structured data to understand sales performance
  • Analyzing unstructured customer feedback to extract insights
  • Combining both data sources for comprehensive business intelligence

In the following examples, we’ll show how sales analysts can extract valuable insights through basic conversational queries.

Keep in mind that generative AI systems are nondeterministic, so responses will not be the same every time. Generative AI systems might also hallucinate information, and care should be taken to evaluate responses for correctness. Additionally, the structured data source is randomly generated on download, so our results will not necessarily agree. You can access the CSV data source in Amazon S3 to evaluate the results manually. Our examples used the Anthropic Claude 3 Sonnet model, but feel free to experiment with others.

Structured data examples

The feature columns that can be queried are the following: Region, Country, Item Type, Sales Channel, Order Priority, Order Date, Order ID, Ship Date, Units Sold, Unit Price, Unit Cost, Total Revenue, Total Cost, and Total Profit. The data spans seven regions and over a hundred countries, with the following item types represented: Baby Food, Beverages, Cereal, Clothes, Cosmetics, Fruits, Household, Meat, Office Supplies, Personal Care, Snacks, and Vegetables.

Prompt 1:

What region accounts for our highest revenue, and how much revenue is that?

Expected Response 1:

The region that accounts for our highest revenue is "Sub-Saharan Africa" with a total revenue of $356,724,250.12.

Prompt 2:

Which 3 item types account for our most units sold?

Expected Response 2:

The 3 item types that account for our most units sold are:
Beverages with 504,905 units
Personal Care with 475,724 units
Vegetables with 471,276 units
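The chat agent answers prompts like these by generating SQL. The following sketch shows the kind of aggregate queries involved, run locally against an in-memory SQLite table so that you can try them without AWS. The rows are made up and are not the post's dataset, the table includes only a few of the columns, and the SQL that the agent actually generates for Athena may differ.

```python
import sqlite3

# Made-up rows for illustration only: (region, item_type, channel, units, revenue).
ROWS = [
    ("Europe", "Beverages", "Online", 100, 4745.0),
    ("Europe", "Cereal", "Offline", 50, 10285.0),
    ("Asia", "Beverages", "Offline", 300, 14235.0),
    ("Asia", "Clothes", "Online", 20, 2185.5),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_records ("
    "region TEXT, item_type TEXT, sales_channel TEXT, "
    "units_sold INTEGER, total_revenue REAL)"
)
conn.executemany("INSERT INTO sales_records VALUES (?, ?, ?, ?, ?)", ROWS)

# "What region accounts for our highest revenue, and how much revenue is that?"
TOP_REGION_SQL = """
SELECT region, SUM(total_revenue) AS revenue
FROM sales_records
GROUP BY region
ORDER BY revenue DESC
LIMIT 1
"""

# "Which 3 item types account for our most units sold?"
TOP_ITEMS_SQL = """
SELECT item_type, SUM(units_sold) AS units
FROM sales_records
GROUP BY item_type
ORDER BY units DESC
LIMIT 3
"""

top_region = conn.execute(TOP_REGION_SQL).fetchone()
top_items = conn.execute(TOP_ITEMS_SQL).fetchall()
print(top_region)
print(top_items)
```

Running the same style of query against the CSV data source in Amazon S3 is one way to check the agent's answers manually.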

Unstructured data examples

The following examples demonstrate a few natural language prompts that you can use to query unstructured data. The goal of these prompts is to search textual data to answer questions and identify common themes.

The data that can be queried includes product-specific reviews from users, survey feedback on the online and offline shopping experiences, and recent industry and economic news.

Prompt 1:

Search our customer surveys to see if customers seem to prefer our online or offline shopping experience

Expected Response 1:

Based on the customer survey results from the knowledge base, there seems to be a preference for the in-store/offline shopping experience over the online experience (website and mobile app). 
The in-store reviews frequently mention positive aspects like helpful staff, clean stores, good product selection and organization. In contrast, the online reviews cite many issues with the website and mobile app like slow loading times, crashes, poor search functionality, and difficulties during checkout. 
While there are some positive online reviews as well, the negative experiences seem more prevalent for the online channels compared to the offline in-store experience.

Prompt 2:

Were there any major world events in 2016 affecting the sale of Vegetables?

Expected Response 2:

Based on the search results from the knowledge base, there were a few major events in 2016 that may have impacted vegetable sales:
In July 2016, an extended drought in California's Central Valley resulted in reduced vegetable production, affecting supply chains for major retailers across North America.
In April 2016, Morocco's innovative desert greenhouse project began operations, introducing new competition in the Mediterranean vegetable market and affecting prices in Southern Europe.
In June 2016, India's successful implementation of cold chain infrastructure reduced vegetable spoilage by 35%, stabilizing prices across South Asian markets.
So the drought in California likely reduced vegetable supply and increased prices in North America, while the new greenhouse operations in Morocco introduced more competition in Europe. However, improved cold chain logistics in India helped stabilize vegetable prices in that region.

Combined data examples

The following are examples of a few natural language prompts that you can use to query both structured and unstructured data sources and seamlessly combine the results into valuable business insights.

Prompt 1:

Do customers seem to prefer our online or offline shopping experience?
Gather data from both customer surveys and number of units sold on different sales channels (online and offline),
and try to identify a trend to see if the data corroborates these reviews.

Expected Response 1:

Based on the customer reviews from the knowledge base, there seems to be a preference for the in-store/offline shopping experience over online shopping. Common complaints for online included slow websites, technical glitches, poor search functionality, and issues during checkout. 
In contrast, the in-store experience received more positive reviews praising helpful staff, clean stores, and good product selection.
The overall sales data shows that the Offline sales channel had higher total revenue ($698.9M) and units sold (2.69M) compared to Online ($628.4M revenue, 2.36M units). 
This data corroborates the customer review sentiments, indicating customers may prefer the offline/in-store experience for our products and services.
However, it is important to note that both channels received a mix of positive and negative reviews. To improve the overall customer experience, we should analyze the specific issues raised in the reviews and look for ways to enhance both the online and offline channels based on that feedback.
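To put the channel figures in this response into proportion, you can compute each channel's share. The following sketch uses only the numbers quoted in the expected response above. Your own figures will differ because the dataset is randomly generated.

```python
# Figures quoted in the expected response above
# (revenue in millions of dollars, units in millions).
offline = {"revenue": 698.9, "units": 2.69}
online = {"revenue": 628.4, "units": 2.36}


def share(part: float, other: float) -> float:
    """Fraction of the two-channel total contributed by `part`."""
    return part / (part + other)


offline_revenue_share = share(offline["revenue"], online["revenue"])
offline_units_share = share(offline["units"], online["units"])

print(f"Offline share of revenue: {offline_revenue_share:.1%}")
print(f"Offline share of units:   {offline_units_share:.1%}")
```

With these figures, the offline channel accounts for a little over half of both revenue and units, so the sales data is consistent with the survey sentiment but shows only a modest gap between channels.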

Prompt 2:

Can you identify any major events that may have affected the sales revenue of Baby Food?
What about any product reviews that may reflect the trends in revenue?
Use order dates and news article publishing dates as you look for trends.

Expected Response 2:

Based on analyzing the sales revenue data for the Baby Food item type and searching relevant unstructured data sources, here are some major events that appear to have impacted sales revenue:
In November 2015, Japan introduced stricter quality control standards for imported baby foods (source 5). This appears to have disrupted supply chains in Asia and may have driven the spike in revenue that month as companies rushed to get products into Japan before the new standards took effect.
In August 2016, Mexico reported a breakthrough in avocado cultivation that increased yields by 25% (source 3). This improved supply of a key baby food ingredient and may have contributed to the high revenue in late 2016 by lowering costs for manufacturers.
In April 2014, Australia had a wheat shortage due to drought conditions, impacting costs for grain-based baby food products (source 2). This aligns with the low revenue on 4/26/2014 as manufacturers likely passed along higher costs to consumers.
The unstructured data sources provided helpful context around supply chain disruptions, ingredient shortages and surpluses, major agricultural events, and changes in trade policies - all of which appear to have impacted baby food sales based on the timing of these events correlating with fluctuations in revenue in the structured data.

Clean-up

To clean up the resources deployed in these instructions, first delete the CloudFormation stack. You can then remove resources from your Amazon Bedrock IDE project and delete domains by following the Amazon SageMaker Unified Studio documentation.
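If you'd rather delete the stack from a script than from the console, the following is a minimal sketch using boto3. It assumes boto3 is installed and that your credentials are allowed to delete the stack. The function name and the stack name in the usage comment are placeholders.

```python
def delete_stack_and_wait(cfn_client, stack_name: str) -> None:
    """Delete a CloudFormation stack and block until deletion completes."""
    cfn_client.delete_stack(StackName=stack_name)
    waiter = cfn_client.get_waiter("stack_delete_complete")
    waiter.wait(StackName=stack_name)


# Usage (requires boto3 and AWS credentials; the stack name is a placeholder):
#   import boto3
#   delete_stack_and_wait(boto3.client("cloudformation"), "your-stack-name")
```

The client is passed in as an argument so that the function can be exercised with a stub before you run it against a real account.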

Conclusion

In this post, we demonstrated how Amazon Bedrock IDE transforms generative AI application development from a complex technical endeavor into a straightforward point-and-click experience. While traditional approaches require specialized ML expertise and significant development time, Amazon Bedrock IDE enables users of various skill levels to create production-ready AI applications in hours instead of weeks.

The key benefits are clear: anyone can now build sophisticated generative AI applications without coding expertise, achieve faster time-to-value through pre-built components, and maintain enterprise governance through centralized management, all while having secure access to their organization’s data through a unified, simple-to-use interface. This same approach can be applied beyond sales analytics to other scenarios where teams need to quickly build AI applications that combine enterprise data with large language models, making generative AI accessible across your organization.

Ready to transform your organization’s AI capabilities? Start building your first generative AI application today by following our step-by-step guide or visit Amazon Bedrock IDE to explore more solutions for your business needs.


About the Authors

Ameer Hakme is an AWS Solutions Architect based in Pennsylvania. He collaborates with Independent Software Vendors (ISVs) in the Northeast region, assisting them in designing and building scalable and modern platforms on the AWS Cloud. An expert in AI/ML and generative AI, Ameer helps customers unlock the potential of these cutting-edge technologies. In his leisure time, he enjoys riding his motorcycle and spending quality time with his family.

Adam Gamba is a Solutions Architect and Aspiring Analytics & AI/ML Specialist at AWS. With his background in computer science, he is very interested in using technology to build solutions to real-world problems. Originally from New Jersey, but now based in Arlington, Virginia, Adam enjoys rock climbing, playing piano, cooking, and attending local museums and concerts.

Bhaskar Ravat is a Senior Solutions Architect at AWS based in New York, with a deep interest in the transformative potential of AI. His passion lies in exploring how AI can impact both everyday life and the broader human experience. You can find him reading four books at a time when not helping or building solutions for customers.

Kosti Vasilakakis is a Principal Product Manager at AWS. He is an ex-data-scientist, turned PM, now leading Amazon Bedrock IDE to help enterprises build high-quality Gen AI applications faster. Kosti remains in awe of the rapid advancements in AI, and is excited to be working on its democratization. Outside of work, you’ll find him coding personal productivity automations, playing tennis, and spending time in the wilderness with his family.