AWS Big Data Blog

Category: Amazon Athena

Introducing a new unified data connection experience with Amazon SageMaker Lakehouse unified data connectivity

With Amazon SageMaker Lakehouse unified data connectivity, you can confidently connect, explore, and unlock the full value of your data across AWS services and achieve your business objectives with agility. This post demonstrates how SageMaker Lakehouse unified data connectivity helps your data integration workload by streamlining the establishment and management of connections for various data sources.

Building end-to-end data lineage for one-time and complex queries using Amazon Athena, Amazon Redshift, Amazon Neptune and dbt

In this post, we use dbt for data modeling on both Amazon Athena and Amazon Redshift. dbt on Athena supports real-time queries, while dbt on Amazon Redshift handles complex queries, unifying the development language and significantly reducing the technical learning curve. Using a single dbt modeling language not only simplifies the development process but also automatically generates consistent data lineage information. This approach offers robust adaptability, easily accommodating changes in data structures.

Catalog and govern Amazon Athena federated queries with Amazon SageMaker Lakehouse

In this post, we show how to connect to, govern, and run federated queries on data stored in Redshift, DynamoDB (Preview), and Snowflake (Preview). To query our data, we use Athena, which is seamlessly integrated with SageMaker Unified Studio. We use SageMaker Lakehouse to present data to end-users as federated catalogs, a new type of catalog object. Finally, we demonstrate how to use column-level security permissions in AWS Lake Formation to give analysts access to the data they need while restricting access to sensitive information.

How ANZ Institutional Division built a federated data platform to enable their domain teams to build data products to support business outcomes

ANZ Institutional Division has transformed its data management approach by implementing a federated data platform based on data mesh principles. This shift aims to unlock untapped data potential, improve operational efficiency, and increase agility. The new strategy empowers domain teams to create and manage their own data products, treating data as a valuable asset rather than a byproduct. This post explores how the shift to a data product mindset is being implemented, the challenges faced, and the early wins that are shaping the future of data management in the Institutional Division.

From data lakes to insights: dbt adapter for Amazon Athena now supported in dbt Cloud

We are excited to announce that the dbt adapter for Amazon Athena is now officially supported in dbt Cloud. This integration enables data teams to efficiently transform and manage data using Athena with dbt Cloud’s robust features, enhancing the overall data workflow experience. In this post, we discuss the advantages of dbt Cloud over dbt Core, common use cases, and how to get started with Amazon Athena using the dbt adapter.

Streamline AI-driven analytics with governance: Integrating Tableau with Amazon DataZone

Amazon DataZone recently announced the expansion of data analysis and visualization options for your project-subscribed data within Amazon DataZone using the Amazon Athena JDBC driver. In this post, you learn how the recent enhancements in Amazon DataZone facilitate a seamless connection with Tableau. By integrating Tableau with the comprehensive data governance capabilities of Amazon DataZone, we’re empowering data consumers to quickly and seamlessly explore and analyze their governed data.

Expanding data analysis and visualization options: Amazon DataZone now integrates with Tableau, Power BI, and more

Amazon DataZone has launched authentication support through the Amazon Athena JDBC driver, allowing data users to seamlessly query their subscribed data lake assets via popular business intelligence (BI) and analytics tools like Tableau, Power BI, Excel, SQL Workbench, DBeaver, and more. This integration empowers data users to access and analyze governed data within Amazon DataZone using familiar tools, boosting both productivity and flexibility.

Analyze Amazon EMR on Amazon EC2 cluster usage with Amazon Athena and Amazon QuickSight

In this post, we guide you through deploying a comprehensive solution in your Amazon Web Services (AWS) environment to analyze Amazon EMR on EC2 cluster usage. By using this solution, you will gain a deep understanding of resource consumption and associated costs of individual applications running on your EMR cluster.

How AppsFlyer modernized their interactive workload by moving to Amazon Athena and saved 80% of costs

AppsFlyer develops a leading privacy-focused measurement solution that enables marketers to gauge the effectiveness of their marketing activities and integrates those activities with the broader marketing world, managing 100 billion events every day. This post explores how AppsFlyer modernized their Audiences Segmentation product by using Amazon Athena.