AWS for Industries

How to enable individual user’s Microsoft Power BI access to Amazon Redshift using ODBC Driver, Keycloak & SAML 2.0

Xylem (NYSE: XYL) is a leading global water technology company committed to solving critical water and infrastructure challenges with innovation. One of those innovative solutions is the Sensus Utility Data Lake, where customers can capture data from their smart meters and use self-service/low code tooling to draw novel analytic insights from their data. The main benefit is to give customers the analytical abilities to satisfy their regional specific analysis and regulatory needs, which differ from the greater population of utilities served by the existing product portfolio.

Xylem built Sensus Utility Data Lake on AWS and stored the utilities’ data in Amazon Redshift. Customers access and analyze the data in Amazon Redshift using Power BI. To secure and protect users’ data, we integrated Single-Sign-On (SSO) credentialing directly to Amazon Redshift clusters to facilitate individual user connections via an Identity Provider.

This post demonstrates how Xylem and AWS partners worked together to enable secure connectivity using Amazon Redshift, AWS Identity and Access Management (IAM), Keycloak and Power BI. We are also explaining how to overcome the challenges associated with the ODBC driver and its support for OpenID Connect (OIDC) using SAML based assertion and OIDC tokens to establish the connectivity between Power BI and Amazon Redshift using Amazon Redshift ODBC Driver.

Key components

Let’s review the concepts of the key components used in the solution.

Amazon Redshift is a fully-managed petabyte-scale cloud data warehouse service that uses SQL to analyze the structure, semi-structured data across databases, and data lakes, as well as provides a secure and reliable analytics service. Amazon Redshift also offers a serverless option that automatically provisions and scales data warehouse capacity to meet the requirements for demanding and unpredictable workloads.

IAM is a web service that secures and controls access to the resources in AWS. It also handles the authentication and authorization to use resources. It helps administrators manage permissions in a central location and control users’ accessibility to AWS resources. IAM also has the SAML 2.0 (Security Assertion Markup Language 2.0) identity provider (IdP) entity which supports SAML 2.0 standard.

Keycloak is an open-source IAM tool, it’s based on standards such as SAML 2.0, OIDC, and OAuth2.0. Keycloak provides SSO, User Federation, Identity Brokering, Social Login, etc.

Power BI is a data visualization tool developed by Microsoft that offers enterprise business intelligence (BI) capabilities. You can use Power BI to perform ad-hoc query analysis, data visualization, and create dashboards. Moreover, refer to Amazon QuickSight, a fast cloud-based business analytics service that is serverless and requires no up-front investment in expensive hardware and software. QuickSight provides quick insights from disparate data sources, such as on-premises databases and SaaS applications like Salesforce, and seamlessly integrates with Amazon Redshift,

Solution overview

Amazon Redshift provides Java Database Connectivity (JDBC), Python, and Open Database Connectivity (ODBC) drivers to connect from various SQL clients. The drivers provided by AWS support IAM, SSO, and federated authentication.

For secure access, Xylem has implemented SSO using Keycloak, an open authentication protocol that is an extension of OAuth 2.0. In OIDC, the credentials are never shared with any applications and OIDC tokens are exchanged between the applications for authentication and authorization.

In this solution, we access Amazon Redshift from the Microsoft Power BI Desktop. Power BI isn’t compatible with JDBC drivers, so we use Amazon Redshift ODBC Driver for this integration. ODBC drivers don’t support OIDC authentication. Given this restriction of the ODBC driver with OIDC, we use the (i.e.) “Identity Provider: Browser SAML” option. ODBC driver supports SAML-based authentication and Keycloak has the capabilities to process SAML based assertion.

Enterprise-wide, Xylem uses OIDC, which is implemented via Keycloak. To connect Power BI Desktop with Amazon Redshift, we use SAML-based authentication instead of OIDC due to the restrictions in the ODBC driver. Keycloak supports both OIDC and SAML 2.0, so OIDC to SAML conversion and vice versa are possible. We enabled our users to log in via SSO using the OIDC protocol.

Prerequisites

The following are the prerequisites to implement this solution:

  • Amazon Redshift ODBC Driver v1.4.49.100
  • Knowledge of IAM, Keycloak, OIDC, and SAML 2.0
  • Microsoft Power Desktop version 2.100.1381.0 64-bit
  • Keycloak version 16.1.1
  • A new realm in your Keycloak instance configured with an existing OIDC based client

Solution walkthrough

Let’s review the interaction between the various actors in this solution.

  • A business user launches their Power BI desktop session.
  • Power BI desktop uses the Amazon Redshift ODBC driver that is installed on the user’s workstation.
  • Amazon Redshift ODBC Driver is configured to use SAML-based authentication and it interacts with the SAML client by sharing a SAML token.
  • Keycloak’s SAML client relays the SAML token to Keycloak’s OIDC provider for authentication.
  • Keycloak SAML client converts OIDC token to SAML-based assertion and completes the authentication using IAM SAML identity provider.
  • Using SAML tokens, the ODBC driver establishes a connection with Amazon Redshift.
  • Now, let’s walk through the steps involved in setting up Keycloak, IAM, Amazon Redshift ODBC Driver, and Power BI Desktop.

Before we begin, we highly recommend that you create a new realm in Keycloak to implement this solution so that your existing realm stays intact. Refer to Keycloak documentation for steps to create a new realm.

Create OIDC IdP

First, let’s assign a name to your new realm, say “awsDemo,” and provide the “Frontend URL.”

Now let’s create an OIDC-based IdP within the “awsDemo” realm by providing an Alias, a Display Name, thereby enabling the Store Tokens, setting the First Login Flow and Sync Mode.

Set the Sync mode to “force” to make sure that the mappers for users are updated on each successful login, in case of changes in the user’s level of access.

Parameters such as Authorization URL, Token URL, Client ID, and Client Secret are set based on the original OIDC and set Client Authentication to “Client secret sent as post.”

Now, we must set up the mappers to extract values from the OIDC token that will be sent via SAML assertion to AWS. For this purpose, we create two mappers, namely “DbGroups” and “roleArn,” based on the values from the following tables:

Attribute Name Attribute Value
ID <<Autogenerated>>
Name roleArn
Sync Mode Override Inherit
Mapper Type Attribute Importer
Claim {name of claim to search for in token}
User Attribute Name roleArn

The roleArn is the AWS role that is assumed by the user and is synced up into a string with AWS SAML IdP.

Attribute Name Attribute Value
ID <<Autogenerated>>
Name DbGroups
Sync Mode Override Inherit
Mapper Type Attribute Importer
Claim DbGroups
User Attribute Name redshiftGroup

Create SAML Client

Now let’s create a new SAML client by importing the AWS-provided SAML metadata found at https://signin.thinkwithwp.com/static/saml-metadata.xml and provide values for “Valid Redirect URLs,” “IDP Initiated SSO URL Name” as specified in the following illustration:

After the previous step, the “Target IDP initiated SSO URL” and “Assertion Consumer Service POST Binding URL” are generated in the following format:

Attribute Name Attribute Value
Target IDP initiated SSO URL {Frontend URL login}/realms/{realm name}/protocol/saml/clients/{IDP Initiated SSO URL Name}
Assertion Consumer Service POST Binding URL http://localhost:7890/redshift/

In SAML Client, the values in the OIDC token are parsed and stored in the mappers. For this purpose, we must define the mappers for Role, Role Session Name, DB User, and DbGroups. Refer to the following to create the mappers:

Attribute Name Attribute Value
ID <<Autogenerated>>
Name Role
Friendly Name Role
Mapper Type User Attribute
User Attribute roleArn
SAML Attribute Name https://thinkwithwp.com/SAML/Attributes/Role
SAML Attribute Name Format Basic
Attribute Name Attribute Value
ID <<Autogenerated>>
Name RoleSessionName
Friendly Name RoleSessionName
Mapper Type User Attribute
User Attribute username
SAML Attribute Name https://thinkwithwp.com/SAML/Attributes/RoleSessionName
SAML Attribute Name Format Basic
Attribute Name Attribute Value
ID <<Autogenerated>>
Name DbUser
Friendly Name DbUser
Mapper Type User Attribute
User Attribute username
SAML Attribute Name https://redshift.amazon.com/SAML/Attributes/DbUser
SAML Attribute Name Format Basic
Attribute Name Attribute Value
ID <<Autogenerated>>
Name DbGroups
Friendly Name DbGroups
Mapper Type User Attribute
User Attribute redshiftGroup
SAML Attribute Name https://redshift.amazon.com/SAML/Attributes/DbGroups
SAML Attribute Name Format Basic

Finally, create a mapper named “AutoCreate” to generate a temporary user within Amazon Redshift with the naming convention “IAM: {DbUser}/”

Create IAM SAML Identity Provider

The verified user identity from Keycloak is sent to AWS. For this purpose, we must configure a SAML provider in IAM using the SAML document downloaded from Keycloak. To create an IAM SAML identity provider, follow the steps in this documentation.

Create IAM SAML Identity Provider

Now let’s create a role and policy using the following code snippet. The temporary user will assume this role. Refer to this documentation for creating an IAM role.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Federated": "{arn of your AWS SAML IdP}"
            },
            "Action": "sts:AssumeRoleWithSAML",
            "Condition": {
                "StringEquals": {
                    "SAML:aud": "http://localhost:7890/redshift/",
                    "SAML:iss": "{Front End URL}/realms/{realm name}"
                }
            }
        }
    ]
}

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "redshiftCredentials",
            "Effect": "Allow",
            "Action": "redshift:GetClusterCredentials",
            "Resource": [
                "arn:aws:redshift:*:*:dbname:{redshift cluster name}/{database name}",
                "arn:aws:redshift:*:*:dbuser:{redshift cluster name}/${redshift:DbUser}",
                "arn:aws:redshift:*:*:cluster:{redshift cluster name}",
            ],
            "Condition": {
                "StringLike": {
                    "aws:userid": "*:${redshift:DbUser}"
                }
            }
        },
        {
            "Sid": "redshiftUserCreation",
            "Effect": "Allow",
            "Action": "redshift:CreateClusterUser",
            "Resource": [
                "arn:aws:redshift:*:*:dbuser:{redshift cluster name}/${redshift:DbUser}"
            ]
        },
        {
            "Sid": "redshiftJoinGroup",
            "Effect": "Allow",
            "Action": "redshift:JoinGroup",
            "Resource": [
                "arn:aws:redshift:*:*:dbgroup:{redshift cluster name}/{name of redshift group}"
            ]
        },
        {
            "Sid": "redshiftCommonActions",
            "Effect": "Allow",
            "Action": [
                "redshift:ListSchemas",
                "redshift:DescribeQuery",
                "redshift:ListDatabases",
                "redshift:CreateClusterUser",
                "redshift:DescribeClusters",
                "redshift:ExecuteQuery",
                "redshift:FetchResults",
                "redshift:DescribeTable",
                "redshift:ListTables",
                "redshift:CancelQuery"
            ],
            "Resource": "*"
        }
    ]
}

In the IAM policy, you must provide your database name, cluster identifier, and Amazon Redshift group that the user will join. AWS will apply the temporary “DbUser” in the sections {redshift: DbUser}.

Install ODBC Driver and Setup DSN

Now you must install the Amazon Redshift ODBC driver on the user’s computer to access Amazon Redshift. Follow the instructions in this documentation to install the ODBC driver. Let’s set the “Pool Connections to this driver” after successfully installing the ODBC driver.

In the ODBC driver settings, let’s navigate to the Connection Pooling and enable “Pool Connections to this driver” for 60 seconds.

Navigate to the Connection Pooling and enable “Pool Connections to this driver” for 60 seconds

Let’s create an ODBC Data Source Name (DSN). In our case, we create a DSN for Amazon Redshift, so you should have the server, port, and database information handy. For detailed instructions on creating a DSN, refer to this documentation. As this ODBC driver will use SAML based authentication, select the auth type in the DSN as “Identity Provider: Browser SAML”, and provide the Login URL port from your Keycloak instance and set the timeout as 60 seconds. After entering the information, your configuration should look something like the following:

Keycloak instance configuration

Test the DSN connection to Amazon Redshift before selecting the “OK” button.

Testing from PowerBI

Launch the Power BI that is installed on your desktop, select “Get Data” and choose ODBC to connect using the DSN created in the previous step. Now the login process is initiated and you can access the data from Amazon Redshift using Power BI. In our environment, when the user connects from Power BI, they get directed to a log in screen.

Xylem login window

Conclusion

This post showed you how to set up the Keycloak instance to allow the conversion of an OIDC token to a SAML based assertion that can be processed by Amazon Redshift ODBC Driver and allows a user to log in via SSO through an OIDC based protocol and consume their data from Microsoft PowerBI securely.

The above solution is the baseline for getting your connection up and going. As stated throughout the process, several steps allow for additional configuration. This includes adding additional fields to your user’s OIDC token to be extracted and passed via SAML assertion mappers and checked during the assume role process within AWS as an extra layer of user validation.

Christopher Hardison

Christopher Hardison

Christopher Hardison is a software engineer on the Data Lake team for Xylem. He enjoys helping find solutions for challenging problems. Christopher holds a B.S. in Computer Science from Oregon State University and a B.S. in Exercise Physiology from East Carolina University. In his spare time, he enjoys all things fitness and traveling.

Brandon O’Daniel

Brandon O’Daniel

Brandon O’Daniel is the Director of the Data Lake and Data Science teams for Sensus, a Xylem brand that specializes in smart infrastructure, metrology and communications in the utilities industry. With more than two decades of Enterprise IT experience, Brandon specializes in large scale data aggregation, serverless cloud computing and divining insights through machine learning and business intelligence.

Karish Chowdhury

Karish Chowdhury

Karish Chowdhury is Senior Solutions Architect at AWS. He is a cloud enthusiast and very passionate about finding solutions for customers. He is Customer Obsessed and his working backward methodology helps customers to achieve their business goals.

Mrunal Daftari

Mrunal Daftari

Mrunal Daftari is an Enterprise Senior Solutions Architect at AWS. Mrunal is based in Boston, MA. He is a cloud enthusiast and very passionate about finding solutions for customers that are simple and addresses their business outcomes. He loves working with cloud technologies providing simple, scalable solutions that drive positive business outcomes, cloud adoption strategy, design innovative solutions and drive operational excellence.

Prathap Thoguru

Prathap Thoguru

Prathap Thoguru is an Enterprise Solutions Architect at AWS. He’s an AWS certified professional in nine areas and specializes in data and analytics. He helps customers get started on and migrate their on-premises workloads to the AWS Cloud. He holds a Master's degree in Information Technology from the University of Newcastle, Australia.

Venky Krishnan

Venky Krishnan

Venky Krishnan is a Principal Engineer in the Enterprise Digital Success team at Xylem. He has more than two decades of software experience ranging between utility and telecom industries. He holds a M.S in Computer Science and M.B.A from NC State University. When not tackling software problems, he enjoys outdoor with distance running.