AWS Partner Network (APN) Blog
How Accenture Accelerates Building Enterprise Data Mesh Architecture on AWS
By Manash Kumar Taunk, Sr. Data Architect – Accenture
By Satish Sarapuri, Sr. Data Architect – AWS
By Ginni Malik, Sr. Data and ML Engineer – AWS
By Sachin Thakkar, Sr. Partner Solutions Architect – AWS
Enterprises have been trying to get to a place where they can easily use analytical data to improve their business decision-making. Standard, integrated data remains a goal for data-driven businesses, but it is hard to achieve because of rapid shifts in data needs and business priorities, and because of the divide between operational data and analytical data.
Data mesh is a decentralized approach to data management that, by design, moves the data platform from a technology-led, project-centric model to a federated, business-led, product-centric one. This opens the door for more scalability and flexibility as businesses develop.
In this post, we’ll walk through how Amazon Web Services (AWS) and Accenture are helping customers rapidly set up data mesh architecture on AWS leveraging the newly-announced Velocity platform. This solution aims to reduce time and effort to build data mesh architecture on AWS by up to 75%.
Specifically, we’ll walk through various data mesh architecture patterns on AWS and explain how Velocity’s Data Mesh Fabric component can minimize time and effort to set up data mesh architecture on AWS.
Accenture is an AWS Premier Tier Services Partner and Managed Services Provider (MSP) that offers comprehensive solutions to migrate and manage operations on AWS.
Data Mesh Governance Architecture Patterns
Data mesh allows producers of data and consumers of data to easily communicate with one another. Information from across the enterprise is available to be discovered and understood. The responsibility for domain-specific data governance, quality, and domain expertise remains with the producers of the data, thus reducing the need for a transfer of ownership and knowledge to a central IT team.
Several data mesh governance architecture patterns are available, and the unique requirements of each company determine which ones need to be evaluated:
- Peer-to-peer
- Centrally federated
- Hybrid
Peer-to-Peer
In this model, data sharing is peer-to-peer. Data is owned, managed, and shared by each domain directly with other domains. The data is not registered with a central federation account. Only the metadata and governance of the data are centrally managed.
This is a decentralized approach, and the catalog itself lies with the data producer accounts. The central team is not responsible for any data management.
Figure 1 – Peer-to-peer governance.
Centrally Federated
In this approach, data is distributed through a central layer. Each domain retains autonomy over its data, but the data is shared through this layer. The central layer holds the catalog of registered data products, so regulations and standards can be enforced through it.
Figure 2 – Central governance.
Hybrid
This approach combines the peer-to-peer and centrally-federated approaches. Some enterprises may opt for this architecture due to complex team requirements. Within such large enterprises, some lines of business may opt for centrally-federated governance while others opt for peer-to-peer.
Figure 3 – Hybrid governance.
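The difference between the three patterns can be sketched as the path a data product takes from producer to consumer. The following is a minimal illustration only, not part of Velocity: the class names, the `central-governance` label, and the domain names are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Pattern(Enum):
    PEER_TO_PEER = "peer-to-peer"
    CENTRALLY_FEDERATED = "centrally federated"


@dataclass(frozen=True)
class Domain:
    name: str
    pattern: Pattern  # the sharing pattern this domain has opted for


def sharing_path(producer: Domain, consumer: str,
                 central: str = "central-governance") -> List[str]:
    """Return the hops a data product takes from producer to consumer."""
    if producer.pattern is Pattern.PEER_TO_PEER:
        # Shared directly between domains
        return [producer.name, consumer]
    # Shared through the central layer that holds the product catalog
    return [producer.name, central, consumer]


def governance_style(domains: List[Domain]) -> str:
    """Classify an enterprise by the patterns its domains use."""
    if not domains:
        raise ValueError("at least one domain is required")
    patterns = {d.pattern for d in domains}
    if len(patterns) > 1:
        return "hybrid"  # a mix of both patterns across lines of business
    return next(iter(patterns)).value


# Hypothetical enterprise where lines of business choose differently
domains = [
    Domain("research", Pattern.PEER_TO_PEER),
    Domain("clinical", Pattern.CENTRALLY_FEDERATED),
]
paths = {d.name: sharing_path(d, "commercial") for d in domains}
style = governance_style(domains)
```

Here `style` evaluates to `"hybrid"` because the two domains use different patterns, and only the `clinical` path passes through the central layer.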
Accenture Velocity Platform
Accenture and AWS jointly invested in and co-created the Velocity platform to eliminate barriers to innovation (now and into the future) and make cloud-enabled business transformation up to 50% faster. As a result, customers can worry less about cloud complexities and spend more time creating real business value.
Velocity is an automated, repeatable, opinionated-yet-flexible platform that optimizes for business outcomes, including speed, resilience, scale, and agility. It removes the heavy lifting associated with building a cloud environment and applications.
Velocity includes a rich set of ready-to-use solutions and software delivery accelerators that teams can deliver to customers with the click of a button. The platform is continuously churning out new and improved AWS-powered industry, cross-industry, and technology solutions so customers can innovate faster, build better, and spend smarter.
Velocity offers a suite of product offerings that provide critical data platform functionality on the AWS cloud. These offerings include Data Lake Fabric and Data Mesh Fabric.
Velocity Data Lake Fabric
The Velocity Data Lake Fabric is a framework that provides a unified and consistent approach to managing and processing data in a data lake environment. It’s essentially an architecture that helps organizations build and manage their data lake infrastructure, including storage, processing, and management of data.
Velocity Data Mesh Fabric
Velocity Data Mesh Fabric is built to reduce the time and effort needed to build data mesh architecture on AWS. It automates the steps required to set up federated governance, onboard producer and consumer domains, register datasets, and grant access at scale. Data Mesh Fabric provides domain-oriented decentralization of ownership and architecture, and supports the concept of serving data as a product.
With the federated data governance model, ownership, data accountability, accuracy, and access controls are governed by producers who understand and know their data better than a centralized IT team. Data Mesh Fabric also provides a self-service capability where producers create data products and publish them.
Consumers, meanwhile, can discover data products in a federated product catalog. They can request access to specific datasets based on their use case, and producers will review and approve subscription requests. All access requests are monitored, governed, and audited from a federated governance account.
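The publish, request, and approve flow described above can be sketched as a small in-memory model. This is an illustrative sketch under stated assumptions, not the Data Mesh Fabric implementation: `GovernanceAccount`, `SubscriptionRequest`, the status strings, and the product and domain names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SubscriptionRequest:
    consumer: str
    product: str
    use_case: str
    status: str = "PENDING"


@dataclass
class GovernanceAccount:
    catalog: Dict[str, str] = field(default_factory=dict)  # product -> owning producer
    audit_log: List[Tuple[str, str, str]] = field(default_factory=list)

    def publish(self, producer: str, product: str) -> None:
        """Producer self-service: register a data product in the catalog."""
        self.catalog[product] = producer
        self.audit_log.append(("PUBLISH", producer, product))

    def request_access(self, consumer: str, product: str,
                       use_case: str) -> SubscriptionRequest:
        """Consumer requests access to a product it discovered in the catalog."""
        if product not in self.catalog:
            raise KeyError(f"unknown data product: {product}")
        self.audit_log.append(("REQUEST", consumer, product))
        return SubscriptionRequest(consumer, product, use_case)

    def review(self, reviewer: str, request: SubscriptionRequest,
               approve: bool) -> SubscriptionRequest:
        """Only the producer that owns the product may approve or reject."""
        if self.catalog[request.product] != reviewer:
            raise PermissionError("only the owning producer can review")
        request.status = "APPROVED" if approve else "REJECTED"
        self.audit_log.append((request.status, reviewer, request.product))
        return request


gov = GovernanceAccount()
gov.publish("clinical", "trial_results")
req = gov.request_access("commercial", "trial_results", "market forecasting")
gov.review("clinical", req, approve=True)
```

Every step appends to `audit_log`, which mirrors the idea that all access requests are monitored and audited from one place, while the approval decision itself stays with the producer.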
Data Mesh Fabric was primarily designed to automate complex processes involved in building and maintaining a data mesh architecture. This accelerator provides a range of pre-built components and templates, automated deployment scripts, and tools to streamline the setup process. It allows for rapid onboarding of producer and consumer accounts onto data mesh architecture at scale.
Data Mesh Fabric also includes a user interface (UI) hosted in a federated governance account, which is used to onboard new domains (AWS accounts) to the platform as producers or consumers with a single click.
The following diagram shows a typical federated governance data mesh architecture on AWS and the steps that are automated by Velocity Data Mesh Fabric.
Figure 4 – Federated governance reference architecture.
Leveraging both Data Mesh Fabric and Data Lake Fabric solutions, customers can set up an enterprise-scale data platform rapidly. Data Mesh Fabric will set up the components required for a data mesh architecture, whereas Data Lake Fabric can set up core data lake capabilities for multiple producer accounts.
Accelerating Data Mesh Architecture on AWS
In the following section, we take reference from a sample life sciences customer with multiple data domains and illustrate how Velocity Data Mesh Fabric can help in building data mesh architecture for various scenarios.
Scenario 1: Greenfield
The greenfield scenario covers customers who want to build a data mesh environment from the ground up, which can scale to hundreds of AWS accounts. This requires the creation of a centrally-federated governance AWS account, producer AWS account, and consumer AWS account.
Figure 5 – Greenfield scenario.
In this scenario, a life sciences customer can quickly scale to onboard both producers (research and clinical domains) and consumers (commercial and sample management domains) to a data mesh platform with a single click of a button, and then initiate dataset creation for a given producer from a federated data governance account.
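As a rough illustration of what onboarding many accounts at scale involves, the sketch below validates a batch of onboarding requests before any automation would run. It is a hypothetical example: the `OnboardingRequest` shape, the role names, and the placeholder account IDs are assumptions, and the post does not describe how Velocity represents these requests.

```python
import re
from dataclasses import dataclass
from typing import List

ACCOUNT_ID = re.compile(r"\d{12}")  # AWS account IDs are 12 digits


@dataclass(frozen=True)
class OnboardingRequest:
    domain: str
    account_id: str
    role: str  # "producer" or "consumer"


def validate(batch: List[OnboardingRequest]) -> List[str]:
    """Return a list of problems; an empty list means the batch is acceptable."""
    errors = []
    seen = set()
    for req in batch:
        if req.role not in ("producer", "consumer"):
            errors.append(f"{req.domain}: unknown role {req.role!r}")
        if not ACCOUNT_ID.fullmatch(req.account_id):
            errors.append(f"{req.domain}: account ID must be 12 digits")
        key = (req.account_id, req.role)
        if key in seen:
            errors.append(f"{req.domain}: {req.account_id} already onboarded as {req.role}")
        seen.add(key)
    return errors


# Placeholder account IDs, for illustration only
batch = [
    OnboardingRequest("research", "111111111111", "producer"),
    OnboardingRequest("clinical", "222222222222", "producer"),
    OnboardingRequest("commercial", "333333333333", "consumer"),
    OnboardingRequest("sample-management", "444444444444", "consumer"),
]
errors = validate(batch)
```

Checking the whole batch up front keeps a single malformed entry from leaving some domains onboarded and others not.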
Scenario 2: Brownfield
The brownfield scenario covers customers who already have a data lake and want to onboard it to data mesh. Customers can migrate existing datasets from data lake accounts and register them with data mesh in a federated governance account.
Figure 6 – Brownfield scenario.
In this case, the manufacturing life sciences domain data lake is onboarding its existing product to the data mesh platform. Using this solution, the domain owner can onboard their product from data lake to data mesh with a single click. As part of the onboarding process, all tables and permissions on the database/product are migrated as well.
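Conceptually, this kind of migration registers a product's table metadata with the governance account and carries the existing permissions along. The sketch below models that with plain dictionaries. It is an assumption-laden illustration: the dictionary layout, `migrate_product`, and the sample tables and principals are hypothetical and do not reflect how Velocity stores or moves metadata.

```python
from typing import Tuple


def migrate_product(lake: dict, governance: dict,
                    owner: str, database: str) -> Tuple[int, int]:
    """Register one database/product and copy its grants to governance.

    Returns (number of tables registered, number of grants migrated).
    """
    tables = list(lake["tables"][database])
    # Copy only the grants that belong to the product being onboarded
    grants = [dict(g) for g in lake["grants"] if g["database"] == database]
    governance.setdefault("products", {})[database] = {"owner": owner, "tables": tables}
    governance.setdefault("grants", []).extend(grants)
    return len(tables), len(grants)


# Hypothetical existing data lake with two databases
lake = {
    "tables": {"manufacturing": ["batches", "equipment"], "scratch": ["tmp"]},
    "grants": [
        {"database": "manufacturing", "table": "batches",
         "principal": "analyst", "permission": "SELECT"},
        {"database": "scratch", "table": "tmp",
         "principal": "dev", "permission": "ALL"},
    ],
}
governance = {}
counts = migrate_product(lake, governance,
                         owner="manufacturing-domain", database="manufacturing")
```

Only the `manufacturing` product and its grant are copied; the unrelated `scratch` database stays in the data lake, and the source `lake` structure is left unmodified.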
Scenario 3: Single Data Lake Migration
This scenario covers customers that have an existing single-account data lake which needs to be ported to data mesh. The process of single-account migration to data mesh is exactly the same as the brownfield scenario.
Figure 7 – Single data lake migration.
In this scenario, customers have their own single AWS account data lake with its related domain product built and ready for use. Using this solution, with a single click domain owners can convert a single AWS account data lake to a data mesh producer and start using the data mesh platform. The solution migrates product metadata along with existing permissions to a federated data governance account.
Velocity Data Mesh Fabric Design Principles
Velocity Data Mesh Fabric was built on the following design principles:
- Rapid deployment: The Velocity Data Mesh Fabric is designed to facilitate rapid deployment, with minimal setup and configuration time. This framework provides a range of pre-built components and templates, automated deployment scripts, and tools to streamline the setup process. It also allows for rapid onboarding of producer and consumer accounts onto data mesh architecture at scale.
- Security: The framework is designed to grant full access only to data product producer personas and select access to consumer personas. Data is encrypted at rest on Amazon Simple Storage Service (Amazon S3) using customer managed keys (CMKs) in AWS Key Management Service (AWS KMS), which the customer defines while creating S3 buckets.
- Automation: The framework is designed to automate the most complex processes involved in building and maintaining a data mesh architecture. It eliminates the need for manual intervention, reducing the risk of errors and streamlining the entire process.
- Speed to market: Data Mesh Fabric is designed to increase speed to market, allowing enterprises to quickly set up or migrate to data mesh architecture. It’s optimized to deploy and set up processes to onboard producers and consumers and register data products in a central governance account. It also manages access control in a data mesh architecture.
- Scalability: Data Mesh Fabric is designed to scale easily with minimal effort, and is capable of onboarding multiple producers and consumers onto a data mesh architecture with few parameters and clicks.
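To make the encryption-at-rest principle concrete, the sketch below builds a default-encryption configuration for an S3 bucket that uses a customer managed KMS key. The dictionary follows the shape accepted by the `ServerSideEncryptionConfiguration` parameter of the S3 `PutBucketEncryption` API. The helper name and the key ARN are placeholders, and whether Velocity configures buckets this way is an assumption.

```python
def sse_kms_configuration(kms_key_arn: str) -> dict:
    """Build a default-encryption configuration that uses a customer managed key."""
    if not kms_key_arn.startswith("arn:aws:kms:"):
        raise ValueError("expected a KMS key ARN")
    return {
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": kms_key_arn,
                },
                # S3 Bucket Keys reduce the number of requests made to AWS KMS
                "BucketKeyEnabled": True,
            }
        ]
    }


# Placeholder ARN, for illustration only
config = sse_kms_configuration(
    "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID"
)

# With boto3 installed and credentials configured, this could be applied as:
#   boto3.client("s3").put_bucket_encryption(
#       Bucket="my-producer-bucket",  # hypothetical bucket name
#       ServerSideEncryptionConfiguration=config,
#   )
```

Building the configuration separately from the API call keeps the example runnable without AWS credentials and makes the setting easy to review.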
Integration with Amazon DataZone
Once Amazon DataZone is generally available (GA), it will provide an organization-wide business catalog and allow customers to discover, access, share, and govern data at scale across organizational boundaries. Thus, it removes the undifferentiated heavy lifting of making data and analytics tools accessible to everyone in the organization.
Velocity Data Mesh Fabric helps customers in reducing cost and effort while setting up technical components for business domain data lakes. These domains can be set up rapidly at scale with the right data security and governance.
Conclusion
The Accenture Data Mesh Fabric accelerator powered by AWS provides a one-stop solution for automated data platform creation, data transformation management, and data and analytics performed in a cloud-native way.
Data Lake Fabric and Data Mesh Fabric blocks can be implemented as standalone solutions or as an integrated workflow to fit the unique requirements of your business.
Accenture and AWS have worked together for more than a decade to help organizations realize value from their applications and data. The collaboration between the two companies, the Accenture AWS Business Group (AABG), enables enterprises to accelerate their pace of digital innovation and realize incremental business value from cloud adoption and transformation.
Connect with the AABG team at accentureaws@amazon.com to drive business outcomes by transforming to an intelligent data enterprise on AWS.
Accenture – AWS Partner Spotlight
Accenture is an AWS Premier Tier Services Partner and MSP that provides end-to-end solutions to migrate to and manage operations on AWS. By working with the Accenture AWS Business Group (AABG), a strategic collaboration by Accenture and AWS, organizations can accelerate the pace of innovation to deliver disruptive products and services.