AWS Big Data Blog
Streamline AI-driven analytics with governance: Integrating Tableau with Amazon DataZone
Amazon DataZone is a data management service that makes it faster and easier for customers to catalog, discover, share, and govern data stored across AWS, on premises, and from third-party sources. Amazon DataZone recently announced the expansion of data analysis and visualization options for your project-subscribed data within Amazon DataZone using the Amazon Athena JDBC driver.
Collaborating closely with our partners, we have tested and validated Amazon DataZone authentication via the Athena JDBC connection, providing an intuitive and secure connection experience for users. With this integration, you can now seamlessly query your governed data lake assets in Amazon DataZone using popular business intelligence (BI) and analytics tools, including partner solutions like Tableau.
Ali Tore, Senior Vice President of Advanced Analytics at Salesforce, highlights the value of this integration:
“We’re excited to partner with Amazon to bring Tableau’s powerful data exploration and AI-driven analytics capabilities to customers managing data across organizational boundaries with Amazon DataZone. This integration enables our customers to seamlessly explore data with AI in Tableau, build visualizations, and uncover insights hidden in their governed data, all while leveraging Amazon DataZone to catalog, discover, share, and govern data across AWS, on premises, and from third-party sources—enhancing both governance and decision-making.”
With this launch, Amazon DataZone strengthens its commitment to empowering enterprise customers with secure, governed access to data across the tools and platforms they rely on. For example, Guardant Health uses Amazon DataZone to democratize data access across its organization, enabling diverse teams to efficiently access, query, and analyze data tailored to their specific needs.
Rajesh Kucharlapati, Senior Director of Data, CRM, and Analytics at Guardant Health, says:
“By harmonizing data across multiple business domains, we foster a culture of data sharing. Using Amazon DataZone lets us avoid building and maintaining an in-house platform, allowing our developers to focus on tailored solutions. Leveraging AWS’s managed service was crucial for us to access business insights faster, apply standardized data definitions, and tap into generative AI potential. We also needed an easy connection process for widely-used analytics tools like Tableau, DBeaver, and Domino, directly within Amazon DataZone projects. This new JDBC connectivity feature enables our governed data to flow seamlessly into these tools, supporting productivity across our teams.”
Use case
Amazon DataZone addresses your data sharing challenges and optimizes data availability. Here’s how:
- Data product creation – As a data producer, you can create and catalog data products while enforcing governance, making your data findable, accessible, interoperable, and reusable (FAIR).
- Streamlined access – As a data consumer, you can easily locate and subscribe to data from multiple sources within a single project. You can analyze this data using a variety of tools, including built-in AWS options such as Amazon Athena, Amazon Redshift, and Amazon SageMaker.
- Integration with partner tools – Support for partner analytics tools gives you greater flexibility and efficiency in your workflows. You can now use your tool of choice, including Tableau, to quickly derive business insights from your data while using standardized definitions and decentralized ownership. Refer to the detailed blog post to learn how to connect from other tools.
Prerequisites
To get started, complete these steps:
- Download and install the latest Athena JDBC driver for Tableau.
- Copy the JDBC connection string from the Amazon DataZone portal into the JDBC connection configuration to establish a connection from Tableau. This will direct you to authenticate using single sign-on with your corporate credentials.
When you’re connected, you can query, visualize, and share data—governed by Amazon DataZone—within Tableau.
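Before pasting the connection string into Tableau, it can help to confirm that it was copied intact. The following is a minimal sketch, assuming the string follows the Athena JDBC 3.x convention of a `jdbc:athena://` prefix followed by semicolon-separated `key=value` parameters. The function name `parse_jdbc_url` and the example URL (including `ExampleParam`) are hypothetical; the actual parameters are whatever the Amazon DataZone portal generates for your project.

```python
def parse_jdbc_url(url: str) -> dict:
    """Split an Athena-style JDBC URL into its key=value parameters.

    Assumes the form jdbc:athena://Key1=value1;Key2=value2;...
    """
    prefix = "jdbc:athena://"
    if not url.startswith(prefix):
        raise ValueError(f"expected a URL starting with {prefix!r}")

    params = {}
    for part in url[len(prefix):].split(";"):
        if not part.strip():
            continue  # tolerate a trailing semicolon or empty segment
        key, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"malformed parameter (no '='): {part!r}")
        params[key.strip()] = value.strip()
    return params


# Hypothetical URL for illustration only; copy the real one from the portal.
example_url = "jdbc:athena://Region=us-east-1;ExampleParam=example-value;"
for name, value in parse_jdbc_url(example_url).items():
    print(f"{name} = {value}")
```

A truncated or partially copied string typically fails the prefix check or produces a malformed parameter, which is easier to diagnose here than from a failed connection attempt in Tableau.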
The following diagram shows the high-level architecture of the Tableau integration.
Solution walkthrough: Configure Tableau to access project-subscribed data assets
To configure Tableau to access project-subscribed data assets, follow these detailed steps:
- Download the latest Athena driver. If Tableau has the Athena driver preinstalled, it could be the older (v2) version. To work with Amazon DataZone, you need the latest (v3) driver, which includes the necessary authentication features. To download it, visit Athena JDBC 3.x driver.
- Install the driver. Copy the JDBC driver file to the appropriate folder for your operating system:
- For macOS:
~/Library/Tableau/Drivers
- For Windows:
C:\Program Files\Tableau\Drivers
- On the Amazon DataZone console, select your project, as shown in the following screenshot of DataZone Console.
- To capture the JDBC connection parameters, follow these steps:
- On the project page, review the connection options under ANALYTICS TOOLS. Choose Connect with JDBC.
- In the JDBC parameters dialog box, select Using IDC auth and copy the JDBC URL. Alternatively, you can select Using IAM auth to connect to your Amazon DataZone project as an AWS Identity and Access Management (IAM) role (for example, from a server), provided that the role has been added as a member of that project. The following screenshot shows the dialog box.
- To configure Tableau Desktop for the connection, follow these steps:
- On the To a Server connection menu, select Other Databases (JDBC).
- Paste the copied JDBC URL into the URL field, leaving the other fields (Dialect, Username, Password) unchanged.
- To sign in with single sign-on, choose Sign in, as shown in the following screenshot. You’ll be redirected to authenticate with AWS IAM Identity Center. Use the credentials for your AWS single sign-on account.
- After you’re signed in, you’ll be prompted to authorize the DataZoneAuthPlugin. Choose Allow access to authorize access to Amazon DataZone from Tableau, as shown in the following screenshot.
- After the connection is established, a success message will appear, as shown in the following screenshot.
You can now view your project’s subscribed data directly within Tableau and build dashboards.
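The driver installation step in the walkthrough can also be scripted, which is convenient when several workstations need the same setup. The following is a minimal sketch, assuming the driver is a single JAR file that you have already downloaded. The helper names `tableau_drivers_dir` and `install_driver` are hypothetical, and the folder paths are the ones listed in the steps above.

```python
import platform
import shutil
from pathlib import Path
from typing import Optional


def tableau_drivers_dir(system: Optional[str] = None) -> Path:
    """Return the Tableau driver folder for macOS or Windows."""
    system = system or platform.system()
    if system == "Darwin":  # macOS
        return Path.home() / "Library" / "Tableau" / "Drivers"
    if system == "Windows":
        return Path(r"C:\Program Files\Tableau\Drivers")
    raise RuntimeError(f"no Tableau driver folder known for {system!r}")


def install_driver(jar_path, dest_dir: Optional[Path] = None) -> Path:
    """Copy the downloaded driver JAR into the Tableau driver folder.

    dest_dir defaults to the folder for the current operating system.
    """
    jar_path = Path(jar_path)
    if jar_path.suffix.lower() != ".jar" or not jar_path.is_file():
        raise FileNotFoundError(f"not a JAR file: {jar_path}")
    dest_dir = Path(dest_dir) if dest_dir else tableau_drivers_dir()
    dest_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(jar_path, dest_dir))


# Example usage (the file name is a placeholder for the JAR you downloaded):
# install_driver(Path.home() / "Downloads" / "athena-jdbc-driver.jar")
```

On Windows, writing to `C:\Program Files` usually requires an elevated prompt. Restart Tableau after copying the file so that it picks up the new driver.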
Conclusion
Amazon DataZone continues to expand its offerings, providing you with more flexibility in how you access, analyze, and visualize your subscribed data. With support for the Athena JDBC driver, you can now use a wide range of popular BI and analytics tools including Tableau, making governed data within Amazon DataZone more accessible than ever before.
In this post, you learned how the recent enhancements in Amazon DataZone facilitate a seamless connection with Tableau. By integrating Tableau with the comprehensive data governance capabilities of Amazon DataZone, we’re empowering data consumers to quickly and seamlessly explore and analyze their governed data. This integration helps organizations break down silos, foster collaboration, and make informed decisions, all while maintaining the security and control needed in today’s complex, distributed data landscape.
The feature is supported in all AWS commercial Regions where Amazon DataZone is currently available. Check out the video below and the detailed blog post to learn how to connect Amazon DataZone to external analytics tools via JDBC. Get started with our technical documentation.
Related blog posts
- Expanding data analysis and visualization options: Amazon DataZone now integrates with Tableau, Power BI, and more
- AI-Driven Analytics on AWS Using Tableau and Amazon SageMaker
About the Authors
Ramesh H Singh is a Senior Product Manager Technical (External Services) at AWS in Seattle, Washington, currently with the Amazon DataZone team. He is passionate about building high-performance ML/AI and analytics products that enable enterprise customers to achieve their critical goals using cutting-edge technology. Connect with him on LinkedIn.
Adiascar Cisneros is a Tableau Senior Product Manager based in Atlanta, GA. He focuses on the integration of the Tableau Platform with AWS services to amplify the value users get from our products and accelerate their journey to valuable, actionable insights. His background includes analytics, infrastructure, network security, and migrations. Follow him on LinkedIn.
Joel Farvault is Principal Specialist SA Analytics for AWS with 25 years’ experience working on enterprise architecture, data governance and analytics, mainly in the financial services industry. Joel has led data transformation projects on fraud analytics, claims automation, and Master Data Management. He leverages his experience to advise customers on their data strategy and technology foundations.
Yogesh Dhimate is a Sr. Partner Solutions Architect at AWS, leading the technology partnership with Tableau. Prior to joining AWS, Yogesh worked with leading companies, including Salesforce, driving their industry solution initiatives. With over 20 years of experience in product management and solutions architecture, Yogesh brings a unique perspective on cloud computing and artificial intelligence.
Ariana Rahgozar is a Senior Solutions Architect at AWS, helping customers design and implement technical solutions as part of their cloud journey.