AWS Partner Network (APN) Blog
Unlock Mainframe Data with Precisely Connect and Amazon Aurora
By Rochelle Grubbs, Sr. Director, Solution Architect – Precisely
By Ayan Ray, Sr. Partner Solutions Architect, Data and Analytics – AWS
By Maggie Li, Principal Software Engineer, Mainframe Modernization – AWS
By Radhika Chakravarty, Partner Solutions Architect, DB Specialist – AWS
Over 70% of Fortune 500 companies depend on mainframes for vital applications. However, mainframe costs climb relentlessly and scale remains constrained. These technical limitations stifle business growth and dynamism.
To curb spend and unlock scale, Precisely collaborates with Amazon Web Services (AWS) to provide the means to synchronize mainframe data in the cloud at speed and power modern applications.
Precisely Connect integrates data seamlessly from legacy systems into next-generation cloud and data platforms. Using Precisely Connect, data from sequential files, VSAM datasets, or databases such as IMS and Db2 can be transferred to Amazon Relational Database Service (Amazon RDS).
Amazon RDS offers many database engines, including MySQL, PostgreSQL, MariaDB, Oracle, SQL Server, and Amazon Aurora. Aurora is a modern relational database service that offers performance and high availability at scale, fully open-source MySQL and PostgreSQL-compatible editions, and a range of developer tools for building serverless and machine learning (ML)-driven applications.
In this post, we will explain how AWS and Precisely can unlock the potential of mainframe data by liberating it from expensive and inelastic systems.
Precisely is an AWS Partner with more than 50 years of experience working with mainframe workloads. It has achieved the Amazon RDS Service Ready specialization by demonstrating significant technical expertise.
Customer Challenges
The mainframe is a powerful platform, but transferring data from it to the cloud can be challenging due to its unique characteristics.
One significant challenge is that mainframe data is often stored in complex, proprietary formats (like packed decimal) that are difficult to move to the cloud. This is largely a legacy of the mainframe’s long history: storage was scarce when these systems were first developed decades ago, so various compression and optimization techniques were used to minimize storage space.
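To make the packed decimal format concrete, here is a minimal Python sketch of how such a field can be decoded. It assumes the common COMP-3 layout (two decimal digits per byte, with the sign in the low nibble of the last byte). The function name `unpack_comp3` and the `scale` parameter are illustrative and are not part of Precisely Connect.

```python
from decimal import Decimal


def unpack_comp3(raw: bytes, scale: int = 0) -> Decimal:
    """Decode a packed-decimal (COMP-3) field into a Decimal.

    `scale` is the number of implied decimal places, which normally
    comes from the COBOL copybook rather than from the data itself.
    """
    if not raw:
        raise ValueError("packed-decimal field is empty")

    digits = []
    for byte in raw[:-1]:
        digits.append(byte >> 4)      # high nibble: one digit
        digits.append(byte & 0x0F)    # low nibble: next digit

    # The last byte holds the final digit and the sign nibble.
    digits.append(raw[-1] >> 4)
    sign = raw[-1] & 0x0F

    if any(d > 9 for d in digits):
        raise ValueError("invalid digit nibble in packed-decimal field")
    if sign < 0x0A:
        raise ValueError("invalid sign nibble in packed-decimal field")

    value = int("".join(str(d) for d in digits))
    if sign in (0x0B, 0x0D):          # B and D mark negative values
        value = -value
    return Decimal(value).scaleb(-scale)


# Three bytes hold five digits plus a sign: 0x12 0x34 0x5C is +123.45
print(unpack_comp3(bytes.fromhex("12345c"), scale=2))
print(unpack_comp3(bytes.fromhex("12345d"), scale=2))
```

The same five digits stored as display characters would take five bytes, which is why the format was attractive when storage was scarce.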
Another challenge is the use of EBCDIC encoding, which is specific to mainframe data. This encoding is not compatible with modern applications that run on open systems and/or the cloud with ASCII or Unicode encoding. Converting hundreds of EBCDIC CCSIDs, each used to represent a specific language or country, to modern encoding systems is also a challenge.
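As a small illustration of why the code page matters, the sketch below decodes EBCDIC bytes with codecs from the Python standard library. The CCSID-to-codec table is a hand-picked assumption that covers only three code pages, and the helper name `ebcdic_to_utf8` is hypothetical. A replication tool has to handle far more CCSIDs than this.

```python
# Assumed mapping from a few EBCDIC CCSIDs to Python codec names.
CCSID_TO_CODEC = {
    37: "cp037",      # US/Canada
    500: "cp500",     # International
    1140: "cp1140",   # US/Canada with euro sign
}


def ebcdic_to_utf8(raw: bytes, ccsid: int) -> bytes:
    """Decode EBCDIC bytes for the given CCSID and re-encode as UTF-8."""
    try:
        codec = CCSID_TO_CODEC[ccsid]
    except KeyError:
        raise ValueError(f"no codec configured for CCSID {ccsid}") from None
    return raw.decode(codec).encode("utf-8")


record = bytes.fromhex("c8c5d3d3d6")
print(ebcdic_to_utf8(record, 37))   # b'HELLO'

# The same byte can decode to different characters under different CCSIDs,
# so the source code page must be known before conversion.
ambiguous = bytes([0x4A])
print(ambiguous.decode("cp037"), ambiguous.decode("cp500"))
```

Letters and digits usually share code points across EBCDIC code pages, so conversion errors tend to show up only in punctuation and national characters.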
In addition, mainframe data is often stored in legacy data stores, including relational database (Db2), keyed (VSAM), or hierarchical (IMS) databases, making it difficult to access and modernize.
Customers are looking to innovate faster than ever and to offer modern applications that support browsers, software-as-a-service (SaaS), and mobile devices. Unfortunately, when data is held in these legacy data stores, building such applications on the mainframe can be difficult or expensive. Running them on the mainframe also pulls computing power away from the workloads it’s designed for, resulting in inefficient use of resources.
Though mainframe modernization options are many, lowering costs is the key. Migrating mainframe workloads to AWS using Precisely Connect cuts spend and drives agility. Once in the cloud, quick iteration replaces the mainframe’s slow progress, and customers can keep up with the pace of innovation required by modern applications.
Precisely Connect Overview
Precisely Connect offers solutions that enable the quick and efficient replication of mainframe data to the target platform, while automated conversion from EBCDIC to ASCII enables easy local access to the data.
This allows organizations to seamlessly utilize data from both mainframe and cloud platforms in a cost-effective manner. Consider a financial institution where banking application data is processed on the mainframe. Waiting for statements to arrive in the mail or calling the bank to check one’s balance are practices of the past.
Customers now demand instant access to their accounts 24/7 and the ability to self-serve. These modern applications, which support various platforms such as browsers and mobile devices, do not run on the mainframe but rely on the vital data processed and stored there.
Application modernization powered by Precisely Connect captures changes from the mainframe applications, converts, decompresses, and decrypts the data, and delivers it to AWS in real time.
Once on the AWS Cloud, agile teams can accelerate the pace of innovation using modern tools and AWS services. This modernization extends the use of the mainframe application by supporting mobile applications and data access that today’s customers have come to expect.
Benefits of Amazon Aurora
Amazon Aurora is a fully managed AWS database service that automates time-consuming administration tasks like hardware provisioning, database setup, patching, and backups while providing the security, availability, and reliability of commercial databases at 1/10th the cost.
Taking advantage of the agility of Amazon Aurora, you can unlock the value of mainframe data and embark on your modernization journey. Aurora seamlessly supports high volume, highly concurrent OLTP workloads, reducing punitive costs of legacy mainframe systems while delivering a streamlined and consistent user experience.
Aurora features a distributed, fault-tolerant, and self-healing storage system that’s decoupled from compute resources and auto-scales up to 128 TB per database instance. It delivers high performance and availability with up to 15 low-latency read replicas, point-in-time recovery, continuous backup to Amazon Simple Storage Service (Amazon S3), and replication across three AWS Availability Zones (AZs).
To meet your connectivity and workload requirements, Aurora horizontal auto scaling dynamically adjusts the number of Aurora replicas provisioned for an Aurora cluster using single-master replication. This enables your Aurora cluster to handle sudden increases in connectivity or workload.
When the connectivity or workload decreases, Aurora auto scaling removes unnecessary Aurora replicas so you don’t pay for unused provisioned database instances. To top it off, you don’t have IOPS limitations in Aurora, as the throughput of the underlying instance class in combination with the amount of workload you push determines the amount of IOPS you can realize on a provisioned Aurora cluster.
Example: Db2 to Aurora Postgres
Figure 1 illustrates Precisely Connect CDC (SQData) capabilities to support both mainframe source databases and AWS target environments.
- On the mainframe side, it employs change data capture (CDC) to extract data in real time from Db2 z/OS, IMS DB, or VSAM files.
- On the AWS side, it supports various offerings under Amazon RDS including Amazon Aurora, Microsoft SQL Server, MySQL, Oracle, and PostgreSQL.
- Additionally, it allows for transfer of data to Amazon Managed Streaming for Apache Kafka (Amazon MSK), which can then be synchronized with platforms like Amazon S3, Amazon Redshift, Amazon DynamoDB, and other AWS analytics solutions.
Figure 1 – Precisely Connect high-level architecture overview.
Below, Figure 2 demonstrates the process of modernizing and replicating Db2 data from a mainframe system to Amazon RDS.
Precisely Connect CDC (SQData) continuously monitors changes made to the source Db2 tables and transfers those updates to RDS. You can use this approach to keep the cloud copy of the data in sync with the mainframe as you modernize your applications.
- Capture/Publisher: Connect CDC Capture/Publisher captures Db2 changes from Db2 logs using IFI 306 Read and communicates captured data changes to a target engine through TCP/IP.
- Controller Daemon: The Controller Daemon authenticates all connection requests, managing secure communication between the source and target environments.
- Apply Engine: The Apply Engine is a multi-faceted and multi-functional component in the target environment. It receives the changes from the Publisher agent and performs all necessary data filtering, transformation, and augmentation required to apply the changed data to the target Amazon RDS.
Figure 2 – Transferring Db2 data from mainframe to Amazon RDS.
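The apply side of this flow can be pictured with a toy model. The Python sketch below is not Precisely’s implementation. It only illustrates the general idea of replaying an ordered stream of insert, update, and delete events against a target keyed by primary key. The `ChangeEvent` class, the operation codes, and the in-memory dictionary standing in for the target table are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One captured change: op is "I" (insert), "U" (update), or "D" (delete)."""
    op: str
    key: int
    row: Optional[dict] = None


def apply_changes(target: dict, events: Iterable[ChangeEvent]) -> dict:
    """Replay change events, in order, against a target keyed by primary key."""
    for event in events:
        if event.op in ("I", "U"):
            if event.row is None:
                raise ValueError(f"{event.op} event for key {event.key} has no row")
            # Upsert, so replaying the same event twice is harmless.
            target[event.key] = dict(event.row)
        elif event.op == "D":
            target.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation {event.op!r}")
    return target


events = [
    ChangeEvent("I", 1, {"name": "Ann"}),
    ChangeEvent("I", 2, {"name": "Bob"}),
    ChangeEvent("U", 1, {"name": "Anne"}),
    ChangeEvent("D", 2),
]
print(apply_changes({}, events))   # {1: {'name': 'Anne'}}
```

Treating inserts and updates as upserts is a design choice in this sketch: it keeps the apply step idempotent, which helps if a stream is restarted and some events are delivered again.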
Precisely Connect CDC (SQData) employs the ISPF user interface (UI) on the mainframe and scripts on AWS to facilitate the setup of the replication stream between mainframe Db2 and RDS.
In the steps below, we’ll demonstrate how to establish the Db2 Capture/Publisher on the mainframe, implement scripts for the Apply Engine on AWS for data replication, and run the Apply Engine scripts to replicate data from Db2 to Amazon RDS.
Step 1: Specify Tables to be Captured on Mainframe Through ISPF Panels
Source tables can be added either by selecting them from the Db2 catalog or by entering them manually. In this example, we’ll select the tables from the catalog whose qualifier matches SQDT%ST. To do this, enter a wildcard value (%) for the table qualifier and the table name.
Figure 3 – Add source tables to an engine.
On pressing “Enter,” a list of the source tables that meet the selection criteria is displayed on this panel. We can select the tables we want to add to our Db2 Capture by typing an “S” next to each table in the list.
Figure 4 – Select tables for replication.
Step 2: Create Scripts for Apply Engine to Run and Connect with Capture/Publisher
The Precisely Connect CDC Apply Engine is a versatile and multi-functional component that can read and write to any type of data store, such as files, databases, message queues, or Kafka. It can handle any type of data structure, including relational DDL, COBOL copybooks, JSON, and comma-delimited files.
The Apply Engine’s most common use is to process data published by the Db2 CDC component, apply business rules if necessary, transform the data, and efficiently write it to the target Amazon RDS.
The Apply Engine is managed by a SQL-like scripting language that offers a wide range of operations, from replicating identical source and target structures using a single command to complex business rule-based transformations.
Its commands and functions provide comprehensive procedural control of data filtering, mapping, and transformation, including manipulation of data at its most basic level if necessary.
A simple Apply Engine script looks like this:
Figure 5 – Apply Engine script.
Step 3: Run the Apply Engine Scripts to Replicate Data from Db2 to RDS
The Apply Engine can be executed on AWS using the command line interface (CLI), shell scripting, and the SQDMON utility.
The following example demonstrates how to run an Apply Engine script named demodb2pg.sqd, which performs near real-time replication of “CUSTOMER” application data from a Db2 database captured on a z/OS system named “ZOS1” to RDS tables running on AWS.
The source “Host” system and Db2 System ID are passed as parameters so that the same engine script can be reused to process data from other systems. The command is as follows:
sqdeng --script=./ENGINE/demodb2pg.sqd ENGINE=DEMODB2 HOST=ZOS1 SSID=DB2T --list=./ENGINE/demodb2pg.prt > ./ENGINE/demodb2pg.rpt
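If the engine is launched from automation rather than typed by hand, the same command line can be assembled programmatically. The Python sketch below uses only the flags and parameters shown in the command above. The helper names `build_sqdeng_command` and `run_apply_engine` are hypothetical, and the sketch assumes `sqdeng` is on the `PATH` of the host where it runs.

```python
import shlex
import subprocess


def build_sqdeng_command(script: str, engine: str, host: str,
                         ssid: str, list_file: str) -> list:
    """Build the argument vector for the Apply Engine command shown above."""
    return [
        "sqdeng",
        f"--script={script}",
        f"ENGINE={engine}",
        f"HOST={host}",
        f"SSID={ssid}",
        f"--list={list_file}",
    ]


def run_apply_engine(argv: list, report_path: str) -> int:
    """Run the engine, writing stdout to the report file as `>` does in a shell."""
    with open(report_path, "w") as report:
        return subprocess.run(argv, stdout=report, check=False).returncode


argv = build_sqdeng_command(
    script="./ENGINE/demodb2pg.sqd",
    engine="DEMODB2",
    host="ZOS1",
    ssid="DB2T",
    list_file="./ENGINE/demodb2pg.prt",
)
print(shlex.join(argv))
# run_apply_engine(argv, "./ENGINE/demodb2pg.rpt")  # requires sqdeng to be installed
```

Passing an argument list instead of a shell string avoids quoting problems when a host name or path comes from configuration.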
When the Apply Engine is executed, it generates a runtime report that provides statistics for each data store processed during execution. A sample runtime report appears as follows:
Figure 6 – Runtime report.
Step 4: Check the Data in the Target Database
Finally, query the target database to validate the record counts and data records, and confirm that the changed data has been replicated to the target tables.
Figure 7 – Validate the row counts of the target table.
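A row-count check like this can also be scripted. The Python sketch below compares counts per table between two database connections. It uses in-memory SQLite databases as stand-ins for the Db2 source and the Amazon RDS target so that it runs anywhere. In practice you would substitute connections from your own database drivers, and the `customer` table is a hypothetical example.

```python
import sqlite3


def row_counts(conn, tables):
    """Return {table: row count} for each table name."""
    counts = {}
    for table in tables:
        # Table names cannot be bound as parameters, so validate them first.
        if not table.isidentifier():
            raise ValueError(f"unsafe table name: {table!r}")
        counts[table] = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    return counts


def count_mismatches(source, target, tables):
    """Return {table: (source count, target count)} where the counts differ."""
    src = row_counts(source, tables)
    tgt = row_counts(target, tables)
    return {t: (src[t], tgt[t]) for t in tables if src[t] != tgt[t]}


# Stand-in databases: the target is missing one replicated row.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
for conn, names in ((source, ["Ann", "Bob", "Cy"]), (target, ["Ann", "Bob"])):
    conn.execute("CREATE TABLE customer (name TEXT)")
    conn.executemany("INSERT INTO customer VALUES (?)", [(n,) for n in names])

print(count_mismatches(source, target, ["customer"]))   # {'customer': (3, 2)}
```

Matching counts do not prove the rows are identical, so spot checks of individual records are still worthwhile.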
Conclusion
The features highlighted in this post are sought-after for mainframe application modernization. Getting data from the mainframe to a modern cloud platform is not simple, and Precisely provides a no-code interface for replicating enterprise data from legacy mainframe systems to scalable and elastic cloud computing platforms such as AWS.
Precisely and AWS together provide a user-friendly way to unlock data from the mainframe and power new and improved applications.