Containers
How to containerize legacy code into Red Hat OpenShift on AWS (ROSA)
Introduction
Enterprise customers have trained their IT staff on legacy programming languages, like COBOL, for decades. These legacy programs have stood the test of time and still run many of the mission-critical business applications that are typical of these platforms. While migration solutions such as AWS Blu Age and AWS Micro Focus Enterprise technology exist for legacy applications, they often require the customer to learn a new programming language. In this post, we show you how to containerize legacy applications on AWS with minimal effort.
The Existing Legacy Application
COBOL is still widely used in applications deployed on mainframe computers, which run large-scale batch and transactional processing jobs. The example COBOL application that we are going to use in this post reads a comma-separated values (CSV) input file and generates a report. The input file is uploaded to the mainframe daily via File Transfer Protocol (FTP), and a COBOL application processes it as a batch job.
The following listings show an example input file, the COBOL application that reformats it into a tabular-style report, and the resulting output file.
Example input file:
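For illustration, assume a small CSV of employee records; the field layout (name, department, salary) and the values are assumptions made for this post, not the original data.

```
Jane Doe,Finance,70000
John Roe,Marketing,65000
Maria Garcia,Engineering,80000
```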
The COBOL application:
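What follows is a minimal GnuCOBOL sketch rather than the original mainframe code: it reads input.csv line by line, splits each record on commas, and writes the fields into fixed-width columns in report.txt. The file names, field names, and field widths are assumptions.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DEMO.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT INPUT-FILE  ASSIGN TO "input.csv"
               ORGANIZATION IS LINE SEQUENTIAL.
           SELECT REPORT-FILE ASSIGN TO "report.txt"
               ORGANIZATION IS LINE SEQUENTIAL.

       DATA DIVISION.
       FILE SECTION.
       FD  INPUT-FILE.
       01  INPUT-RECORD        PIC X(200).
       FD  REPORT-FILE.
       01  REPORT-RECORD       PIC X(200).

       WORKING-STORAGE SECTION.
       01  WS-EOF              PIC X     VALUE "N".
       01  WS-NAME             PIC X(20).
       01  WS-DEPT             PIC X(20).
       01  WS-SALARY           PIC X(10).

       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN INPUT  INPUT-FILE
                OUTPUT REPORT-FILE
           PERFORM UNTIL WS-EOF = "Y"
               READ INPUT-FILE
                   AT END
                       MOVE "Y" TO WS-EOF
                   NOT AT END
                       PERFORM WRITE-REPORT-LINE
               END-READ
           END-PERFORM
           CLOSE INPUT-FILE REPORT-FILE
           STOP RUN.

       WRITE-REPORT-LINE.
      *    Split the CSV record into fixed-width columns
           UNSTRING INPUT-RECORD DELIMITED BY ","
               INTO WS-NAME WS-DEPT WS-SALARY
           END-UNSTRING
           MOVE SPACES TO REPORT-RECORD
           STRING WS-NAME WS-DEPT WS-SALARY
               DELIMITED BY SIZE INTO REPORT-RECORD
           END-STRING
           WRITE REPORT-RECORD.
```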
The output file:
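With the hypothetical input above, the sketch produces a fixed-width, tabular report along these lines:

```
Jane Doe            Finance             70000
John Roe            Marketing           65000
Maria Garcia        Engineering         80000
```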
Solution overview
We are going to keep the code unchanged but wrap it up into a container running on a Red Hat OpenShift Service on AWS (ROSA) cluster. The application runs as a cron job. The job regularly (every minute) picks up the input files from a specific directory on a shared filesystem, generates the output, and places it into another directory on the same filesystem. We use an Amazon Elastic File System (Amazon EFS) file system to store the input CSV and output files so that they are accessible to the programs that provide the input files.
The following diagram shows the solution architecture:
These are the steps that we take to implement this solution:
- Verify the feasibility of running the COBOL code on Linux
- Containerize the code
- Prepare the AWS environment (code and container repositories, shared filesystem, and the OpenShift cluster)
- Deploy the code on ROSA
- Test the application
In the following sections, we show how each of the above-mentioned steps is performed. For further information, see the Red Hat OpenShift Service on AWS (ROSA) documentation.
1. The feasibility of running COBOL on Linux
Running COBOL applications is not limited to IBM operating systems. There are a few open-source COBOL compilers available that can easily be built for Linux environments. IBM’s official COBOL compiler is also an option, but for this demo we are going to use GnuCOBOL. While GnuCOBOL can be built from source, in many environments (e.g., Ubuntu) the compiler can easily be installed via the package manager:
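For example, on Ubuntu the compiler can be installed with apt. Note that the package name varies by release (gnucobol, gnucobol3, or gnucobol4), so check your distribution’s repositories:

```bash
# Install the GnuCOBOL compiler (the package name varies by release)
sudo apt-get update
sudo apt-get install -y gnucobol3

# Verify that the compiler is available
cobc --version
```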
Compile and run a test code:
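A quick way to verify the toolchain is a small hello-world program (hello.cbl is just an example name), compiled with cobc -x to produce a native executable:

```bash
# Create a trivial COBOL program and compile it into a native executable
cat > hello.cbl <<'EOF'
       IDENTIFICATION DIVISION.
       PROGRAM-ID. HELLO.
       PROCEDURE DIVISION.
           DISPLAY "Hello from GnuCOBOL on Linux".
           STOP RUN.
EOF

cobc -x -o hello hello.cbl   # -x builds a standalone executable
./hello
```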
With the COBOL compiler working on Linux, we can provide the same environment inside Linux containers as well.
2. Containerization
COBOL is a compiled language, which means the program needs to be recompiled whenever the source code changes. Depending on the use case, we might choose to include the COBOL compiler in the container image or not. If it is included, then the container dynamically builds the executable program from the source code. However, this adds some latency at runtime. Let’s discuss the pros and cons of each approach.
Create a container image based on the compiled and executable version of the program:
Pro: Faster container creation. Once the container is created from the image, it will be ready to execute the code immediately because the code is already pre-compiled.
Con: A new container image needs to be created if the code is modified. In other words, a pipeline will be needed to automatically compile the program and build an updated version of the container image.
Create a container image which can compile the program:
Pro: Less complexity in terms of building the pipeline for the container image.
Con: While this might be acceptable for long-running containers, this approach adds unnecessary compilation overhead. In other words, every time the container runs, the code is compiled, even if it has not been modified.
In this post, we chose the second approach because of its simplicity.
The following steps explain how to create a container image capable of compiling and running COBOL programs:
Package the code into a Docker image:
To better understand how the Dockerfile is constructed, let’s first review its different pieces and elements:
- /nfs_dir: The shared filesystem that stores the input CSV files in /nfs_dir/input/ and the output reports in /nfs_dir/output/.
- demo.cbl: The COBOL program that transforms an input file and saves the result to an output file. We use the same code that is already running on the mainframe.
- batch-process.sh: A shell script that finds the input files in the /nfs_dir/input directory and generates an output file for each one of them (see the sketch after this list).
- cobol-cron-job: A Linux crontab configuration file that causes batch-process.sh to be executed every minute (also sketched below).
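The following are minimal sketches of these two files under the assumptions of this post (directory layout, file names, and the compile-at-runtime approach chosen earlier). The script compiles and runs demo.cbl with cobc -xj for every CSV file it finds:

```bash
#!/bin/bash
# batch-process.sh (sketch): process every CSV in /nfs_dir/input and write one
# report per file into /nfs_dir/output. Paths and naming are assumptions.
set -euo pipefail

IN_DIR=/nfs_dir/input
OUT_DIR=/nfs_dir/output
WORK_DIR=/app

mkdir -p "${IN_DIR}" "${OUT_DIR}"
echo "$(date) - scanning ${IN_DIR}"

shopt -s nullglob
for csv in "${IN_DIR}"/*.csv; do
    name=$(basename "${csv}" .csv)
    # The sketched COBOL program reads input.csv and writes report.txt in its working directory
    cp "${csv}" "${WORK_DIR}/input.csv"
    (cd "${WORK_DIR}" && cobc -xj demo.cbl)        # compile and run in one step
    mv "${WORK_DIR}/report.txt" "${OUT_DIR}/${name}-report.txt"
    rm -f "${csv}" "${WORK_DIR}/input.csv"         # avoid reprocessing the same file
    echo "$(date) - processed ${csv}"
done
```

And a crontab entry that runs the script every minute, appending its output to the log file that the container later streams:

```
# cobol-cron-job (sketch): run the batch script every minute
* * * * * /usr/local/bin/batch-process.sh >> /var/log/cron.log 2>&1
```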
The following sample Dockerfile copies cobol-cron-job to the container. In the COPY sections, the necessary files are copied from the current directory on our workstation into the container image. In the RUN section, the required packages are installed and the cron mechanism is configured. Finally, the command (CMD) section defines what executes when the container runs: it starts the cron daemon and prints cron.log to the console.
Dockerfile:
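A Dockerfile along those lines could look like the following sketch; the base image, package name, and paths are assumptions for this post:

```dockerfile
# Sketch of a Dockerfile that compiles and runs the COBOL code at runtime
FROM ubuntu:22.04

# Install the GnuCOBOL compiler and cron (the package name may differ by release)
RUN apt-get update && \
    apt-get install -y gnucobol3 cron && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Copy the COBOL source, the batch script, and the crontab file into the image
COPY demo.cbl /app/demo.cbl
COPY batch-process.sh /usr/local/bin/batch-process.sh
COPY cobol-cron-job /app/cobol-cron-job

# Configure the cron mechanism and prepare the log file
RUN chmod +x /usr/local/bin/batch-process.sh && \
    crontab /app/cobol-cron-job && \
    touch /var/log/cron.log

# Start the cron daemon and print the cron log to the console
CMD ["/bin/sh", "-c", "cron && tail -f /var/log/cron.log"]
```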
Create the Docker image:
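Build the image from the directory that contains the Dockerfile (the image name cobol-batch is an example):

```bash
docker build -t cobol-batch:latest .
```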
Test the Docker image locally:
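For a local test, a directory on the workstation can stand in for the shared file system (the paths and the test file name are examples):

```bash
# Simulate the shared filesystem with a local directory and run the container
mkdir -p /tmp/nfs_dir/input /tmp/nfs_dir/output
cp example.csv /tmp/nfs_dir/input/        # any test CSV file
docker run --rm -v /tmp/nfs_dir:/nfs_dir cobol-batch:latest
```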
The container stays in the foreground and streams the cron log to the console, so a new log entry should appear roughly every minute.
Push the image to the repository. We will be using Amazon Elastic Container Registry (ECR):
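With placeholder account, Region, and repository values (replace them with your own), the push looks like this:

```bash
AWS_ACCOUNT_ID=123456789012   # placeholder - use your own account ID
AWS_REGION=us-east-1          # placeholder - use your own Region
ECR_URI=${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com

# Create the repository (once) and authenticate Docker against Amazon ECR
aws ecr create-repository --repository-name cobol-batch --region ${AWS_REGION}
aws ecr get-login-password --region ${AWS_REGION} | \
    docker login --username AWS --password-stdin ${ECR_URI}

# Tag and push the image
docker tag cobol-batch:latest ${ECR_URI}/cobol-batch:latest
docker push ${ECR_URI}/cobol-batch:latest
```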
3. Preparing an Amazon EFS file system for the cluster
Amazon EFS is a simple, serverless, elastic filesystem that makes it easy to set up, scale, and cost-optimize file storage. For more information about Amazon EFS and how to set it up from the AWS Management Console, please follow the steps in this AWS blog post. In our scenario, Amazon EFS is the shared Network File System (NFS) that stores the inputs and outputs of the COBOL application running on ROSA. We also need to create an access point for the file system. According to the EFS documentation, Amazon EFS access points are application-specific entry points into an Amazon EFS file system that make it easier to manage application access to shared datasets.
Install the AWS EFS Operator on ROSA:
Make sure the AWS EFS Operator has been installed from the cluster’s OperatorHub. This operator enables the ROSA cluster to interact with Amazon EFS. For more information about operators, please refer to the OpenShift documentation.
Create a shared volume in the ROSA cluster based on Amazon EFS:
To create an Amazon EFS-based shared volume through OpenShift’s EFS Operator, we need the file system and access point IDs from the Amazon EFS console or AWS Command Line Interface (CLI):
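For example, with the AWS CLI:

```bash
# List the EFS file systems and their access points to find the IDs
aws efs describe-file-systems \
    --query 'FileSystems[*].{ID:FileSystemId,Name:Name}' --output table
aws efs describe-access-points \
    --query 'AccessPoints[*].{ID:AccessPointId,FileSystem:FileSystemId}' --output table
```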
With the file system and access point IDs ready, we create a SharedVolume object in the cluster. For more information, please visit the ROSA documentation.
Create a SharedVolume:
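A manifest along the following lines can then be applied with oc apply -f. The IDs and names are placeholders, and the apiVersion and spec fields are those used by the AWS EFS Operator at the time of writing, so double-check them against the operator documentation for your version:

```yaml
apiVersion: aws-efs.managed.openshift.io/v1alpha1
kind: SharedVolume
metadata:
  name: cobol-shared-volume            # created in the current project (namespace)
spec:
  fileSystemID: fs-0123456789abcdef0      # placeholder file system ID
  accessPointID: fsap-0123456789abcdef0   # placeholder access point ID
```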
As a result, a PersistentVolumeClaim is created and mounted in the pods:
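You can confirm this with oc; the operator typically names the PersistentVolumeClaim after the SharedVolume (for example, pvc-cobol-shared-volume):

```bash
oc get sharedvolume,pvc
```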
4. Deploying the application on ROSA
The container image we created was pushed to a private repository in Amazon ECR, which is why we need to create a Secret in OpenShift to pull the image. Since the shared file system has also been created, the last task is to define a pod and launch the application as part of it.
Don’t forget to modify the security group of the Amazon EFS mount target to allow inbound NFS access (TCP port 2049) from the ROSA cluster. The worker nodes have a specific security group which can be allow-listed on the Amazon EFS side.
Create a Secret:
Save the Amazon ECR password to ~/.docker/config.json:
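Logging in to Amazon ECR with the Docker CLI stores the credentials in ~/.docker/config.json (account ID and Region are placeholders):

```bash
aws ecr get-login-password --region us-east-1 | \
    docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com
```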
Create a generic secret in OpenShift from the config file created above:
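For example (the secret name ecr-pull-secret is an example):

```bash
oc create secret generic ecr-pull-secret \
    --from-file=.dockerconfigjson=${HOME}/.docker/config.json \
    --type=kubernetes.io/dockerconfigjson
```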
Now pods can pull images from ECR using the newly created secret:
Create a pod that mounts the shared file system and uses the Amazon ECR-based secret to pull the image:
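A minimal pod manifest under the assumptions of this post (the image URI, secret name, and PVC name are placeholders) might look like the following; apply it with oc apply -f and wait for the pod to reach the Running state. Depending on your project’s security context constraints, a cron-based container like this one may need additional permissions to run.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cobol-batch
spec:
  imagePullSecrets:
  - name: ecr-pull-secret                 # the pull secret created above
  containers:
  - name: cobol-batch
    image: 123456789012.dkr.ecr.us-east-1.amazonaws.com/cobol-batch:latest
    volumeMounts:
    - name: nfs-dir
      mountPath: /nfs_dir                 # shared EFS-backed filesystem
  volumes:
  - name: nfs-dir
    persistentVolumeClaim:
      claimName: pvc-cobol-shared-volume  # PVC created by the SharedVolume
  restartPolicy: Always
```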
5. Test the application
Confirm that the Amazon EFS file system has been mounted on the pod:
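For example, using the example pod name from the manifest above:

```bash
oc exec cobol-batch -- df -h /nfs_dir
```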
Confirm that the cron job has been configured properly on the pod:
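For example:

```bash
oc exec cobol-batch -- crontab -l
```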
It’s time to test the functionality of the application. We just need to copy a CSV file into /nfs_dir/input, which resides on the Amazon EFS file system. The application we created above checks /nfs_dir/input every minute and processes any CSV files that it finds.
What is the application doing? There is currently no CSV file in the shared file system to be processed, so our application just prints the current timestamp every minute. You can see this in the pod’s log:
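For example, by following the log of the example pod:

```bash
oc logs -f cobol-batch
```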
Give the application a CSV file to process:
A CSV file can be copied to Amazon EFS from any workstation or application that has mounted the file system. For simplicity we use oc cp:
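For example, with a hypothetical employees.csv file (run from the project where the pod is deployed):

```bash
oc cp employees.csv cobol-batch:/nfs_dir/input/employees.csv
```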
Poll the log file:
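For example, by tailing the cron log inside the pod and waiting for the next cron run to pick up the file:

```bash
oc exec cobol-batch -- tail -f /var/log/cron.log
```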
Prerequisites
To follow the procedures in this post, we recommend the following prerequisites:
- A Linux/Mac workstation, or an AWS Cloud9 IDE.
- A running Red Hat OpenShift Service on AWS (ROSA) cluster. This post explains how to set up such an environment from scratch.
Cleanup
To avoid unexpected costs, please be mindful of cleaning up the resources that are no longer needed:
- ROSA cluster: This can be done via the Red Hat Hybrid Cloud Console or using the ROSA CLI.
- Amazon EFS file system: Use the AWS Management Console or AWS CLI.
- AWS CodeCommit and Amazon ECR: Use the AWS Management Console or AWS CLI.
Conclusion
We have shown you how to re-platform and even augment a COBOL application using container technologies and Red Hat OpenShift Service on AWS. While this is not a solution for modernizing every legacy application, especially those with tight integration with legacy middleware and data layers, this post offers new avenues to bring other application types to AWS with minimal to no change to the code.
If you are interested in trying this in your environment, we encourage you to take this ROSA workshop and get some hands-on experience first. If you are new to Amazon EFS, follow the Amazon EFS user guide. For any further assistance, please use AWS Premium Support or AWS re:Post.