AWS for Industries
How to enable individual user’s Microsoft Power BI access to Amazon Redshift using ODBC Driver, Keycloak & SAML 2.0
Xylem (NYSE: XYL) is a leading global water technology company committed to solving critical water and infrastructure challenges with innovation. One of those innovative solutions is the Sensus Utility Data Lake, where customers can capture data from their smart meters and use self-service/low code tooling to draw novel analytic insights from their data. The main benefit is to give customers the analytical abilities to satisfy their regional specific analysis and regulatory needs, which differ from the greater population of utilities served by the existing product portfolio.
Xylem built Sensus Utility Data Lake on AWS and stored the utilities’ data in Amazon Redshift. Customers access and analyze the data in Amazon Redshift using Power BI. To secure and protect users’ data, we integrated Single-Sign-On (SSO) credentialing directly to Amazon Redshift clusters to facilitate individual user connections via an Identity Provider.
This post demonstrates how Xylem and AWS partners worked together to enable secure connectivity using Amazon Redshift, AWS Identity and Access Management (IAM), Keycloak and Power BI. We are also explaining how to overcome the challenges associated with the ODBC driver and its support for OpenID Connect (OIDC) using SAML based assertion and OIDC tokens to establish the connectivity between Power BI and Amazon Redshift using Amazon Redshift ODBC Driver.
Key components
Let’s review the concepts of the key components used in the solution.
Amazon Redshift is a fully-managed petabyte-scale cloud data warehouse service that uses SQL to analyze the structure, semi-structured data across databases, and data lakes, as well as provides a secure and reliable analytics service. Amazon Redshift also offers a serverless option that automatically provisions and scales data warehouse capacity to meet the requirements for demanding and unpredictable workloads.
IAM is a web service that secures and controls access to the resources in AWS. It also handles the authentication and authorization to use resources. It helps administrators manage permissions in a central location and control users’ accessibility to AWS resources. IAM also has the SAML 2.0 (Security Assertion Markup Language 2.0) identity provider (IdP) entity which supports SAML 2.0 standard.
Keycloak is an open-source IAM tool, it’s based on standards such as SAML 2.0, OIDC, and OAuth2.0. Keycloak provides SSO, User Federation, Identity Brokering, Social Login, etc.
Power BI is a data visualization tool developed by Microsoft that offers enterprise business intelligence (BI) capabilities. You can use Power BI to perform ad-hoc query analysis, data visualization, and create dashboards. Moreover, refer to Amazon QuickSight, a fast cloud-based business analytics service that is serverless and requires no up-front investment in expensive hardware and software. QuickSight provides quick insights from disparate data sources, such as on-premises databases and SaaS applications like Salesforce, and seamlessly integrates with Amazon Redshift,
Solution overview
Amazon Redshift provides Java Database Connectivity (JDBC), Python, and Open Database Connectivity (ODBC) drivers to connect from various SQL clients. The drivers provided by AWS support IAM, SSO, and federated authentication.
For secure access, Xylem has implemented SSO using Keycloak, an open authentication protocol that is an extension of OAuth 2.0. In OIDC, the credentials are never shared with any applications and OIDC tokens are exchanged between the applications for authentication and authorization.
In this solution, we access Amazon Redshift from the Microsoft Power BI Desktop. Power BI isn’t compatible with JDBC drivers, so we use Amazon Redshift ODBC Driver for this integration. ODBC drivers don’t support OIDC authentication. Given this restriction of the ODBC driver with OIDC, we use the (i.e.) “Identity Provider: Browser SAML” option. ODBC driver supports SAML-based authentication and Keycloak has the capabilities to process SAML based assertion.
Enterprise-wide, Xylem uses OIDC, which is implemented via Keycloak. To connect Power BI Desktop with Amazon Redshift, we use SAML-based authentication instead of OIDC due to the restrictions in the ODBC driver. Keycloak supports both OIDC and SAML 2.0, so OIDC to SAML conversion and vice versa are possible. We enabled our users to log in via SSO using the OIDC protocol.
Prerequisites
The following are the prerequisites to implement this solution:
- Amazon Redshift ODBC Driver v1.4.49.100
- Knowledge of IAM, Keycloak, OIDC, and SAML 2.0
- Microsoft Power Desktop version 2.100.1381.0 64-bit
- Keycloak version 16.1.1
- A new realm in your Keycloak instance configured with an existing OIDC based client
Solution walkthrough
Let’s review the interaction between the various actors in this solution.
- A business user launches their Power BI desktop session.
- Power BI desktop uses the Amazon Redshift ODBC driver that is installed on the user’s workstation.
- Amazon Redshift ODBC Driver is configured to use SAML-based authentication and it interacts with the SAML client by sharing a SAML token.
- Keycloak’s SAML client relays the SAML token to Keycloak’s OIDC provider for authentication.
- Keycloak SAML client converts OIDC token to SAML-based assertion and completes the authentication using IAM SAML identity provider.
- Using SAML tokens, the ODBC driver establishes a connection with Amazon Redshift.
- Now, let’s walk through the steps involved in setting up Keycloak, IAM, Amazon Redshift ODBC Driver, and Power BI Desktop.
Before we begin, we highly recommend that you create a new realm in Keycloak to implement this solution so that your existing realm stays intact. Refer to Keycloak documentation for steps to create a new realm.
Create OIDC IdP
First, let’s assign a name to your new realm, say “awsDemo,” and provide the “Frontend URL.”
Now let’s create an OIDC-based IdP within the “awsDemo” realm by providing an Alias, a Display Name, thereby enabling the Store Tokens, setting the First Login Flow and Sync Mode.
Set the Sync mode to “force” to make sure that the mappers for users are updated on each successful login, in case of changes in the user’s level of access.
Parameters such as Authorization URL, Token URL, Client ID, and Client Secret are set based on the original OIDC and set Client Authentication to “Client secret sent as post.”
Now, we must set up the mappers to extract values from the OIDC token that will be sent via SAML assertion to AWS. For this purpose, we create two mappers, namely “DbGroups” and “roleArn,” based on the values from the following tables:
Attribute Name | Attribute Value |
ID | <<Autogenerated>> |
Name | roleArn |
Sync Mode Override | Inherit |
Mapper Type | Attribute Importer |
Claim | {name of claim to search for in token} |
User Attribute Name | roleArn |
The roleArn is the AWS role that is assumed by the user and is synced up into a string with AWS SAML IdP.
Attribute Name | Attribute Value |
ID | <<Autogenerated>> |
Name | DbGroups |
Sync Mode Override | Inherit |
Mapper Type | Attribute Importer |
Claim | DbGroups |
User Attribute Name | redshiftGroup |
Create SAML Client
Now let’s create a new SAML client by importing the AWS-provided SAML metadata found at https://signin.thinkwithwp.com/static/saml-metadata.xml and provide values for “Valid Redirect URLs,” “IDP Initiated SSO URL Name” as specified in the following illustration:
After the previous step, the “Target IDP initiated SSO URL” and “Assertion Consumer Service POST Binding URL” are generated in the following format:
Attribute Name | Attribute Value |
Target IDP initiated SSO URL | {Frontend URL login}/realms/{realm name}/protocol/saml/clients/{IDP Initiated SSO URL Name} |
Assertion Consumer Service POST Binding URL | http://localhost:7890/redshift/ |
In SAML Client, the values in the OIDC token are parsed and stored in the mappers. For this purpose, we must define the mappers for Role, Role Session Name, DB User, and DbGroups. Refer to the following to create the mappers:
Attribute Name | Attribute Value |
ID | <<Autogenerated>> |
Name | Role |
Friendly Name | Role |
Mapper Type | User Attribute |
User Attribute | roleArn |
SAML Attribute Name | https://thinkwithwp.com/SAML/Attributes/Role |
SAML Attribute Name Format | Basic |
Attribute Name | Attribute Value |
ID | <<Autogenerated>> |
Name | RoleSessionName |
Friendly Name | RoleSessionName |
Mapper Type | User Attribute |
User Attribute | username |
SAML Attribute Name | https://thinkwithwp.com/SAML/Attributes/RoleSessionName |
SAML Attribute Name Format | Basic |
Attribute Name | Attribute Value |
ID | <<Autogenerated>> |
Name | DbUser |
Friendly Name | DbUser |
Mapper Type | User Attribute |
User Attribute | username |
SAML Attribute Name | https://redshift.amazon.com/SAML/Attributes/DbUser |
SAML Attribute Name Format | Basic |
Attribute Name | Attribute Value |
ID | <<Autogenerated>> |
Name | DbGroups |
Friendly Name | DbGroups |
Mapper Type | User Attribute |
User Attribute | redshiftGroup |
SAML Attribute Name | https://redshift.amazon.com/SAML/Attributes/DbGroups |
SAML Attribute Name Format | Basic |
Finally, create a mapper named “AutoCreate” to generate a temporary user within Amazon Redshift with the naming convention “IAM: {DbUser}/”
Create IAM SAML Identity Provider
The verified user identity from Keycloak is sent to AWS. For this purpose, we must configure a SAML provider in IAM using the SAML document downloaded from Keycloak. To create an IAM SAML identity provider, follow the steps in this documentation.
Now let’s create a role and policy using the following code snippet. The temporary user will assume this role. Refer to this documentation for creating an IAM role.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Federated": "{arn of your AWS SAML IdP}"
},
"Action": "sts:AssumeRoleWithSAML",
"Condition": {
"StringEquals": {
"SAML:aud": "http://localhost:7890/redshift/",
"SAML:iss": "{Front End URL}/realms/{realm name}"
}
}
}
]
}
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "redshiftCredentials",
"Effect": "Allow",
"Action": "redshift:GetClusterCredentials",
"Resource": [
"arn:aws:redshift:*:*:dbname:{redshift cluster name}/{database name}",
"arn:aws:redshift:*:*:dbuser:{redshift cluster name}/${redshift:DbUser}",
"arn:aws:redshift:*:*:cluster:{redshift cluster name}",
],
"Condition": {
"StringLike": {
"aws:userid": "*:${redshift:DbUser}"
}
}
},
{
"Sid": "redshiftUserCreation",
"Effect": "Allow",
"Action": "redshift:CreateClusterUser",
"Resource": [
"arn:aws:redshift:*:*:dbuser:{redshift cluster name}/${redshift:DbUser}"
]
},
{
"Sid": "redshiftJoinGroup",
"Effect": "Allow",
"Action": "redshift:JoinGroup",
"Resource": [
"arn:aws:redshift:*:*:dbgroup:{redshift cluster name}/{name of redshift group}"
]
},
{
"Sid": "redshiftCommonActions",
"Effect": "Allow",
"Action": [
"redshift:ListSchemas",
"redshift:DescribeQuery",
"redshift:ListDatabases",
"redshift:CreateClusterUser",
"redshift:DescribeClusters",
"redshift:ExecuteQuery",
"redshift:FetchResults",
"redshift:DescribeTable",
"redshift:ListTables",
"redshift:CancelQuery"
],
"Resource": "*"
}
]
}
In the IAM policy, you must provide your database name, cluster identifier, and Amazon Redshift group that the user will join. AWS will apply the temporary “DbUser” in the sections {redshift: DbUser}.
Install ODBC Driver and Setup DSN
Now you must install the Amazon Redshift ODBC driver on the user’s computer to access Amazon Redshift. Follow the instructions in this documentation to install the ODBC driver. Let’s set the “Pool Connections to this driver” after successfully installing the ODBC driver.
In the ODBC driver settings, let’s navigate to the Connection Pooling and enable “Pool Connections to this driver” for 60 seconds.
Let’s create an ODBC Data Source Name (DSN). In our case, we create a DSN for Amazon Redshift, so you should have the server, port, and database information handy. For detailed instructions on creating a DSN, refer to this documentation. As this ODBC driver will use SAML based authentication, select the auth type in the DSN as “Identity Provider: Browser SAML”, and provide the Login URL port from your Keycloak instance and set the timeout as 60 seconds. After entering the information, your configuration should look something like the following:
Test the DSN connection to Amazon Redshift before selecting the “OK” button.
Testing from PowerBI
Launch the Power BI that is installed on your desktop, select “Get Data” and choose ODBC to connect using the DSN created in the previous step. Now the login process is initiated and you can access the data from Amazon Redshift using Power BI. In our environment, when the user connects from Power BI, they get directed to a log in screen.
Conclusion
This post showed you how to set up the Keycloak instance to allow the conversion of an OIDC token to a SAML based assertion that can be processed by Amazon Redshift ODBC Driver and allows a user to log in via SSO through an OIDC based protocol and consume their data from Microsoft PowerBI securely.
The above solution is the baseline for getting your connection up and going. As stated throughout the process, several steps allow for additional configuration. This includes adding additional fields to your user’s OIDC token to be extracted and passed via SAML assertion mappers and checked during the assume role process within AWS as an extra layer of user validation.