AWS Big Data Blog
Integrate Amazon Redshift native IdP federation with Microsoft Azure AD and Power BI
June 2023: This post was reviewed and updated for accuracy.
Amazon Redshift accelerates your time to insights with fast, easy, and secure cloud data warehousing at scale. Tens of thousands of customers rely on Amazon Redshift to analyze exabytes of data and run complex analytical queries.
As enterprise customers look to build their data warehouse on Amazon Redshift, they have many integration needs with the business intelligence (BI) tools they’re using. For customers who want to integrate Amazon Redshift with their existing identity provider (IdP) such as Microsoft Azure Active Directory (Azure AD) using BI tools and services such as Power BI Desktop and Power BI service, we have introduced a native IdP for Amazon Redshift to help you implement authentication and authorization for these tools in a seamless way.
Amazon Redshift native IdP simplifies the administration process of managing identities and permission. This feature provides native integration with Microsoft Azure AD, which you can use for authentication and authorization with tools like Power BI. It uses your existing IdP to simplify authentication and managing permissions. It does this by making it possible to share identity metadata to Amazon Redshift from your IdP. In this approach, an external IdP (such as Azure AD) issues an access token, which is passed to Amazon Redshift via a client, and then Amazon Redshift performs the token validation and claim extraction natively.
By leveraging Amazon Redshift native IdP authentication, users can establish a robust and secure end-to-end experience within the platform. This authentication method simplifies administration tasks while providing the necessary tools to implement fine-grained row-level security in Amazon Redshift. Additionally, Redshift Native IdP can be integrated with dynamic data masking (DDM) making sensitive data protection becomes more streamlined and efficient, further enhancing the security measures within the Amazon Redshift data warehouse.
This post shows a step-by-step implementation of the Amazon Redshift native IdP setup with Azure AD, which demonstrates how to manage users and groups with an organizational directory, and how to federate into Amazon Redshift. You don’t need to create AWS Identity and Access Management (IAM) roles, policies, separate database users, or groups in Amazon Redshift with this setup.
Solution overview
Using an Amazon Redshift native IdP has the following benefits:
- You can manage users and groups from a centralized IdP
- Enables your users to be automatically signed in to Amazon Redshift with their Azure AD accounts
- You can automatically create Amazon Redshift roles with a namespace that represents external groups (such as Azure AD groups)
- External user group membership is natively mirrored with Amazon Redshift roles and users
The general configuration steps of the Amazon Redshift native IdP approach are as follows:
- Register an application in Azure AD and set up groups.
- Collect Azure AD information for the Amazon Redshift IdP.
- Set up the IdP on Amazon Redshift.
- Set up Amazon Redshift permissions to external identities.
- Configure the client connection.
The following diagram illustrates the resulting solution.
To get authorized, the Power BI client sends an authentication request to the Azure enterprise application using Azure AD credentials. After verification, Azure sends a JSON web token (OAuth token) to the Power BI application. The Power BI application forwards the connection string with the OAuth token to Amazon Redshift. Amazon Redshift parses and validates the token, and requests group information from Azure AD. Upon reception, Amazon Redshift automatically creates the user and roles, and does the respective mapping.
Prerequisites
You need the following prerequisites to set up this solution:
- A Microsoft Azure account that has an active subscription. You need to an admin role to set up the application on Azure AD.
- Power BI Desktop version 2.102.683.0 64-bit and above downloaded and installed. In this example, we have use a Windows environment.
- The latest version of the Microsoft Enterprise/Standard Gateway installed.
- An AWS account with an Amazon Redshift cluster. In this post, we connect Power BI Desktop and service with a publicly accessible Amazon Redshift cluster.
Register an application in Azure AD and set up groups
To set up the Azure application and group permission, complete the following steps:
- Sign in to the Azure portal with your Microsoft account.
- Navigate to the Azure Active Directory application.
- Under Manage, choose App registrations and New registration.
- For Name, enter an application name (for example,
nativeoauthsetup
). - Keep the default settings for the rest of the fields.
- Choose Register to complete the initial application registration.
- On the newly created application Overview page, locate the client ID and tenant ID and note down these IDs in order to register the IdP in Amazon Redshift later.
- Under Manage in the navigation pane, choose API permissions.
- Choose Add a permission.
- Choose Microsoft Graph and then choose Application permissions.
- Search for
directory
and select theDirectory.Read.All
permission. - Choose Add permissions.
- Choose Grant admin consent.
- In the popup box, choose Yes to grant the admin consent.
The status of the permission shows Granted for with a green check mark.
- Under Manage in the navigation pane, choose Certificates & secrets.
- Choose Client secrets and choose New client secret.
- Enter a Description, select an expiration for the secret or specify a custom lifetime. We are keeping Microsoft recommended default expiration value of 6 months. Choose Add.
- Copy the secret value.
It would only be present one time and after that you cannot read it.
- On the Azure AD home page, under Manage in the navigation pane, choose Groups.
- Choose New group.
- In the New Group section, provide the required information.
- Choose No members selected and then search for the members.
- Select your members and choose Select. For this example, you can search your username and click select.
You can see the number of members in the Members section.
Collect Azure AD Information for Amazon Redshift IdP
Before we collect the Azure AD information, we need to identify the access token version from the application which you have created earlier. In the navigation pane, under Manage, choose Manifest section, then view the accessTokenAcceptedVersion
parameter: null
and 1
indicate v1.0 tokens, and 2
indicates v2.0 tokens.
To configure your IdP in Amazon Redshift, collect the following parameters from Azure AD. If don’t have these parameters, contact your Azure admin.
- issuer – This is known as <Microsoft_Azure_issuer_value>. If you’re using the v1.0 token, use
https://sts.windows.net/<Microsoft_Azure_tenantid_value>/
. Currently, Power BI only uses v1.0 token. If you’re using the v2.0 token, usehttps://login.microsoftonline.com/<Microsoft_Azure_tenantid_value>/v2.0
. To find your Microsoft Azure tenant ID, complete the following steps:- Sign in to the Azure portal with your Microsoft account.
- Under Manage, choose App registrations.
- Choose the Amazon Redshift application you created earlier.
- Click on the Overview (left panel) page and under Essentials, note down the values.
- client_id – This is known as <Microsoft_Azure_clientid_value> in the following sections. An example of a client ID is
5ab12345-1234-1a12-123a-11abc1a12ab1
). To get your client ID value, locate the Amazon Redshift application you created earlier on the Azure portal; it’s listed in the Essentials section. - client_secret – This is known as <Microsoft_Azure_client_secret_value> in the following sections. An example of a client secret value is
KiG7Q~FEDnE.VsWS1IIl7LV1R2BtA4qVv2ixB
). To create your client secret value, refer to the steps in the previous section. - audience – This is known as <Microsoft_Azure_token_audience_value> in the following sections. With Power BI Desktop, you need to set the audience value as
https://analysis.windows.net/powerbi/connector/AmazonRedshift
.
Set up the IdP on Amazon Redshift
To set up the IdP on Amazon Redshift, complete the following steps:
- Log in to Amazon Redshift with a superuser user name and password using query editor v2 or any SQL client.
- Run the following SQL:
In our example, we use the v1.0 token issuer because as of this writing, Power BI only uses the v1.0 token:
- To alter the IdP, use the following command (this new set of parameter values completely replaces the current values):
- To view a single registered IdP in the cluster, use the following code:
- To view all registered IdPs in the cluster, use the following code:
- To drop the IdP, use the following command:
Set up Amazon Redshift permissions to external identities
The users, roles, and role assignments are automatically created in your Amazon Redshift cluster during the first login using your native IdP unless they were manually created earlier.
Create and assign permission to Amazon Redshift roles
In this step, we create a role in the Amazon Redshift cluster based on the groups that you created on the Azure AD portal.
The role name in the Amazon Redshift cluster looks like <namespace>:<azure_ad_group_name>
, where the namespace is the one we provided in the IdP creation command and the group name is the one we specified when we were setting up the Azure application. In our example, it’s aad:rsgroup
.
Run the following command in the Amazon Redshift cluster:
For example:
To grant permission to the Amazon Redshift role, enter the following command:
Then grant relevant permission to the role as per your requirement. For example:
Create and assign permission to an Amazon Redshift user
This step is only required if you want to grant permission to an Amazon Redshift user instead of roles. We create an Amazon Redshift user that maps to a Azure AD user and then grant permission to it. If you don’t want to explicitly assign permission to an Amazon Redshift user, you can skip this step.
To create the user, use the following syntax:
For example:
We use the following syntax to grant permission to the Amazon Redshift user:
For example:
Configure your client connection using an Amazon Redshift native IdP
In this section, we provide instructions to set up your client connection for either Power BI Desktop or the Power BI service.
Connect Power BI Desktop
In this example, we use Power BI Desktop to connect with Amazon Redshift using a native IdP. Use Power BI Desktop version: 2.102.683.0 64-bit and above.
- In your Power BI Desktop, choose Get data.
- Search for the Amazon Redshift connector, then choose it and choose Connect.
- For Server, enter your Amazon Redshift cluster’s endpoint. For example,
test-cluster.ct4abcufthff.us-east-1.redshift.amazonaws.com
. - For Database, enter your database name. In this example, we use dev.
- Choose OK.
- Choose Microsoft Account.
- Choose Sign in.
- Enter your Microsoft Account credentials.
When you’re connected, you can see the message You are currently signed in.
Congratulations! You are signed in using the Amazon Redshift native IdP with Power BI Desktop. Now you can browse your data.
After that, you can create your own Power BI report on the desktop version and publish it to your Microsoft account. For this example, we created and published a report named RedshiftOAuthReport
, which I refer to later in this post.
Connect Power BI service
Now, let’s connect a Power BI gateway with Amazon Redshift using a native IdP. Before proceeding with below setup, please make sure you have downloaded and installed the latest version of the Microsoft Enterprise/Standard Gateway.
- Open the Power BI web application and sign in if necessary.
You can see the RedshiftOAuthReport
report that we created earlier.
- In the navigation pane, under Datasets, choose the menu icon (three dots) next to the report name and then choose Settings.
- Enable Gateway connection on the settings page.
- Click on the arrow on right side and select Manually add to gateway.
- In the Data Source Settings section, enter the appropriate values:
- For Data Source Name, enter a name.
- For Data Source Type, choose Amazon Redshift.
- For Server, enter your Amazon Redshift cluster’s endpoint.
- For Database, enter your database name (for this post, we use
dev
). - For Authentication Method, choose OAuth2.
- Choose Edit credentials.
- In the pop-up box, choose Sign in.
- Enter your Microsoft account credentials and follow the authentication process.
- After the authentication, choose Add on the Data Source Settings page.
- Make sure that Gateway connection is enabled. If not, enable it.
- Select your gateway from the gateway list.
- On the Maps to menu, choose your data source.
- Choose Apply.
Congratulations! You have completed the Amazon Redshift native IdP setup with Power BI web service.
Best Practices with Redshift native IdP:
- Pre-create the Amazon Redshift roles based upon the groups which you have created on the Azure AD portal.
- Assign permissions to Redshift roles instead of assigning to each individual external user. This will provide smoother end user experience as user will have all the required permission when they login using native IdP.
- A federated external user’s role membership is only available in that user’s session. When a federated external user creates any view or stored procedure, this user cannot delegate permission of those objects to other users/roles. Therefore, it is recommended to create objects using non-federated user and then grant the privileges to respective user or group.
Troubleshooting
If your connection didn’t work, consider the following:
- Enable logging in the driver. For instructions, see Configure logging.
- Make sure to use the latest Amazon Redshift JDBC driver version 2.1.0.4 onwards, which supports Amazon Redshift native IdP authentication.
- If you’re getting errors while setting up the application on Azure AD, make sure you have admin access.
- If you can authenticate via the SQL client but get a permission issue or can’t see objects, grant the relevant permission to the role, as detailed earlier in this post.
- If you get the error “claim value does not match expected value,” make sure you provided the correct parameters during Amazon Redshift IdP registration.
- Check
stl_error
orstl_connection_log
views on the Amazon Redshift cluster for authentication failures.
Summary
In this post, we covered the step-by-step process of integrating Amazon Redshift with Azure AD and Power BI Desktop and web service using Amazon Redshift native IdP federation. The process consisted of registering a Azure application, creating Azure AD groups, setting up the Amazon Redshift IdP, creating and assigning permission to Amazon Redshift roles, and finally configuring client connections.
For more information about Amazon Redshift native IdP federation, see:
- AWS Documentation: Native identity provider (IdP) federation for Amazon Redshift.
- AWS Blog to connect with SQL Client: Integrate Amazon Redshift native IdP federation with Microsoft Azure AD using a SQL client.
If you have questions or suggestions, please leave a comment.
About the Authors
Maneesh Sharma is a Senior Database Engineer at AWS with more than a decade of experience designing and implementing large-scale data warehouse and analytics solutions. He collaborates with various Amazon Redshift Partners and customers to drive better integration.
Ilesh Garish is a Software Development Engineer at AWS. His role is to develop connectors for Amazon Redshift. Prior to AWS, he built database drivers for the Oracle RDBMS, TigerLogic XDMS, and OpenAccess SDK. He also worked in the database internal technologies at San Francisco Bay Area startups.
Debu Panda is a Senior Manager, Product Management at AWS. He is an industry leader in analytics, application platform, and database technologies, and has more than 25 years of experience in the IT world.
Sergey Konoplev is a Senior Database Engineer on the amazon Redshift Team at AWS. Sergey has been focusing on Automation and improvement of database and data operations for more than a decade.