Amazon SageMaker | AWS Big Data Blog

Foundational blocks of Amazon SageMaker Unified Studio: An admin’s guide to implement unified access to all your data, analytics, and AI

In this post, we discuss the foundational building blocks of SageMaker Unified Studio and how, by abstracting complex technical implementations behind user-friendly interfaces, organizations can maintain standardized governance while enabling efficient resource management across business units. This approach keeps infrastructure deployment consistent while preserving the flexibility needed for diverse business requirements.

Use DeepSeek with Amazon OpenSearch Service vector database and Amazon SageMaker

OpenSearch Service provides rich capabilities for RAG use cases, as well as vector embedding-powered semantic search. You can use the flexible connector framework and search flow pipelines in OpenSearch to connect to models hosted by DeepSeek, Cohere, and OpenAI, as well as models hosted on Amazon Bedrock and SageMaker. In this post, we build a connection to DeepSeek’s text generation model, supporting a RAG workflow to generate text responses to user queries.

How EUROGATE established a data mesh architecture using Amazon DataZone

In this post, we show you how EUROGATE uses AWS services, including Amazon DataZone, to make data discoverable by data consumers across different business units so that they can innovate faster. Two use cases illustrate how this can be applied for business intelligence (BI) and data science applications, using AWS services such as Amazon Redshift and Amazon SageMaker.

Introducing a new unified data connection experience with Amazon SageMaker Lakehouse unified data connectivity

With Amazon SageMaker Lakehouse unified data connectivity, you can confidently connect, explore, and unlock the full value of your data across AWS services and achieve your business objectives with agility. This post demonstrates how SageMaker Lakehouse unified data connectivity simplifies data integration workloads by streamlining how connections to various data sources are established and managed.

An integrated experience for all your data and AI with Amazon SageMaker Unified Studio (preview)

Amazon SageMaker Unified Studio, in preview, is an integrated development environment (IDE) for data, analytics, and AI. Discover your data and put it to work using familiar AWS tools to complete end-to-end development workflows, including data analysis, data processing, model training, generative AI app building, and more, in a single governed environment. This post demonstrates how SageMaker Unified Studio unifies your analytic workloads.

Simplify data access for your enterprise using Amazon SageMaker Lakehouse

Amazon SageMaker Lakehouse offers a unified solution for enterprise data access, combining data from warehouses and lakes. This post demonstrates how SageMaker Lakehouse integrates scattered data sources, enabling secure enterprise-wide access, and allowing teams to use their preferred tools for predicting and analyzing customer churn. The solution involves multiple data sources, including Amazon S3, Amazon Redshift, and AWS Glue Data Catalog, with AWS Lake Formation managing permissions.

Author visual ETL flows on Amazon SageMaker Unified Studio (preview)

Amazon SageMaker Unified Studio (preview) provides an integrated data and AI development environment within Amazon SageMaker. This post shows how you can build a low-code and no-code (LCNC) visual ETL flow that enables seamless data ingestion and transformation across multiple data sources.

Simplify data integration with AWS Glue and zero-ETL to Amazon SageMaker Lakehouse

AWS has introduced zero-ETL integration support from external applications to AWS Glue, simplifying data integration for organizations. This new feature allows for seamless replication of data from popular platforms like Salesforce, ServiceNow, and Zendesk into Amazon SageMaker Lakehouse and Amazon Redshift. This blog post demonstrates a use case involving ServiceNow data integration, outlining the process of setting up a connector, creating a zero-ETL integration, and verifying both initial data load and change data capture (CDC). It also highlights the advantages of using Apache Iceberg for data versioning and time travel capabilities within zero-ETL integrations.

Catalog and govern Amazon Athena federated queries with Amazon SageMaker Lakehouse

In this post, we show how to connect to, govern, and run federated queries on data stored in Redshift, DynamoDB (Preview), and Snowflake (Preview). To query our data, we use Athena, which is seamlessly integrated with SageMaker Unified Studio. We use SageMaker Lakehouse to present data to end users as federated catalogs, a new type of catalog object. Finally, we demonstrate how to use column-level security permissions in AWS Lake Formation to give analysts access to the data they need while restricting access to sensitive information.

The next generation of Amazon SageMaker: The center for all your data, analytics, and AI

This week on the keynote stages at AWS re:Invent 2024, you heard Matt Garman, CEO, AWS, and Swami Sivasubramanian, VP of AI and Data, AWS, speak about the next generation of Amazon SageMaker, the center for all of your data, analytics, and AI. This update addresses the evolving relationship between analytics and AI workloads, aiming to streamline how customers work with their data. It helps organizations collaborate more effectively, reduce data silos, and accelerate the development of AI-powered applications while maintaining robust governance and security measures.
