Intermediate (200) | AWS Big Data Blog

Build a multi-tenant healthcare system with Amazon OpenSearch Service

In this post, we address common multi-tenancy challenges and provide actionable solutions for security, tenant isolation, workload management, and cost optimization across diverse healthcare tenants.

Develop and deploy a generative AI application using Amazon SageMaker Unified Studio

In this post, we demonstrate how to use Amazon Bedrock Flows in SageMaker Unified Studio to build a sophisticated generative AI application for financial analysis and investment decision-making.

Build an analytics pipeline that is resilient to Avro schema changes using Amazon Athena

This post demonstrates how to build a solution by combining Amazon Simple Storage Service (Amazon S3) for data storage, AWS Glue Data Catalog for schema management, and Amazon Athena for one-time querying. We’ll focus specifically on handling Avro-formatted data in partitioned S3 buckets, where schemas can change frequently, while keeping queries consistent across all data regardless of schema version.

Integrating Amazon OpenSearch Ingestion with Amazon RDS and Amazon Aurora

We are happy to announce the general availability of the integration of Amazon OpenSearch Service with Amazon Relational Database Service (Amazon RDS) and Amazon Aurora. This new integration eliminates complex data pipelines and enables near real-time data synchronization between your relational databases and Amazon OpenSearch Service. It supports Amazon Aurora (including Amazon Aurora MySQL-Compatible Edition and Amazon Aurora PostgreSQL-Compatible Edition) and Amazon RDS databases (including Amazon RDS for MySQL and Amazon RDS for PostgreSQL), unlocking advanced search capabilities such as hybrid search, ranked results, and faceted search on transactional databases.

Scale your AWS Glue for Apache Spark jobs with R type, G.12X, and G.16X workers

This post demonstrates how AWS Glue R type, G.12X, and G.16X workers help you scale up your AWS Glue for Apache Spark jobs.

Orchestrate data processing jobs, querybooks, and notebooks using visual workflow experience in Amazon SageMaker

Today, we are excited to launch a new visual workflow builder in SageMaker Unified Studio. With the new visual workflow experience, you don’t need to code Python DAGs manually. Instead, you can visually define the orchestration workflow in SageMaker Unified Studio, and the visual definition is automatically converted to a Python DAG definition that is supported in Airflow. This post demonstrates the new visual workflow experience in SageMaker Unified Studio.

Near real-time baggage operational insights for airlines using Amazon Kinesis Data Streams

This post explores a framework developed by IBM to modernize baggage analytics using AWS managed services such as Amazon Kinesis Data Streams and DynamoDB Streams within a serverless architecture. The solution enables near real-time baggage operational insights for airlines, delivering cost savings, enhanced scalability, and improved performance while providing better security and operational efficiency to meet evolving airline needs.

Overcome your Kafka Connect challenges with Amazon Data Firehose

We’re happy to announce a new feature in the Amazon Data Firehose integration with Amazon MSK. You can now configure the Firehose stream to begin reading from your MSK topic at either the earliest position on the Kafka topic or a custom timestamp. In this post, part of a series, we focus on managed data delivery from Kafka to your data lake.

Build conversational AI search with Amazon OpenSearch Service

Amazon OpenSearch Service is a versatile search and analytics tool. In this post, we explore conversational search, its architecture, and various ways to implement it.

Enhance data ingestion performance in Amazon Redshift with concurrent inserts

Amazon Redshift employs columnar storage for database tables, reducing overall disk I/O requirements. This storage method significantly improves analytic query performance by minimizing the amount of data read during queries. This post showcases the key improvements in Amazon Redshift concurrent data ingestion operations.
