IBM & Red Hat on AWS
Accelerating Code Conversion with Amazon SageMaker and IBM Granite Code models
As enterprises modernize their mission-critical applications to adopt cloud-native architectures and containerized microservices, a major challenge is converting legacy monolithic codebases to modern languages and frameworks. Manual code conversion is extremely time-consuming, expensive, and error prone. Fortunately, recent advances in large language models (LLMs) for code have opened up new possibilities for AI-assisted code conversion at scale.
Amazon SageMaker is a leading machine learning platform that makes it easy to build, train, and deploy machine learning models in the cloud and at the edge. IBM has recently open-sourced its Granite family of code LLMs for code generation, translation, bug fixing, and more across over 100 programming languages. By combining the strengths of SageMaker and Granite Code models, enterprises can accelerate legacy code conversion projects, or simply take advantage of a general-purpose code model.
In this post, you will use IBM Granite Code models on Amazon SageMaker to accelerate legacy code conversion and modernization use cases.
What are IBM Granite Code models?
The IBM Granite models are a family of IBM-built and trained foundation models pre-trained on over 3 trillion tokens of code and natural language data across 116 programming languages. A key differentiator for the IBM Granite models is trust. The models are trained on data curated in accordance with IBM’s AI ethics principles and subject to stringent data governance requirements. According to Forrester, the Granite family of models provides enterprise users with some of the most robust and clear insights into the underlying training data. As a result, IBM backs its Granite models with intellectual property indemnification, meaning IBM will handle any copyright infringement claims related to the use of these models. Additionally, Granite models are tested and reviewed against criteria covering more than 40 social harms and risks, helping ensure their safety and reliability.
The Granite Code models come in a variety of sizes, ranging from 3B to 34B parameters, in both base and instruction-following variants. These models show strong performance, comparing favorably to other open-source code models. For example, on the HumanEval benchmarks for generating, explaining, and fixing code, the Granite 8B base model performed strongly in evaluations covering code synthesis, debugging, explanation, editing, mathematical reasoning, and more (figure 1). Further details can be found in IBM’s Granite Code Models paper.
IBM has released the Granite Code models to open source under the permissive Apache 2.0 license, enabling their use for both research and commercial purposes with no restrictions. The models are available on Hugging Face.
Hugging Face is a popular open-source hub for machine learning (ML) models. AWS and Hugging Face have a partnership that enables seamless integration through SageMaker, with a set of AWS Deep Learning Containers (DLCs) for training and inference in PyTorch or TensorFlow, and Hugging Face estimators and predictors for the SageMaker Python SDK. SageMaker features and capabilities help developers and data scientists get started with natural language processing (NLP) on AWS with ease.
What is Amazon SageMaker?
Amazon SageMaker is a fully-managed machine learning service that provides every component needed to build, train, and deploy machine learning models quickly. It eliminates the heavy lifting of infrastructure management so data scientists and developers can focus on the machine learning problem at hand.
The SageMaker ecosystem includes a variety of integrated tools for data preparation, model building, training, tuning, hosting, monitoring, and more. SageMaker also adheres to standard security frameworks such as ISO 27001 and SOC 1/2/3, in addition to complying with various regulatory requirements. Compliance frameworks like the General Data Protection Regulation (GDPR), California Consumer Privacy Act (CCPA), Health Insurance Portability and Accountability Act (HIPAA), and Payment Card Industry Data Security Standard (PCI DSS) are supported to help ensure that data handling, storage, and processing meet stringent security standards.
Code generation and conversion use cases
To illustrate the power of combining SageMaker and IBM Granite models, let’s walk through an example use case: accelerating C-to-Java code conversion. You can follow along with a step-by-step notebook on GitHub.
Before proceeding with the implementation, please be aware that deploying this model on Amazon SageMaker will incur costs. For detailed pricing information, please refer to the Amazon SageMaker pricing page, and you can estimate your potential expenses using the AWS Pricing Calculator.
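As a rough illustration, a real-time endpoint’s cost is driven mainly by the instance’s hourly rate multiplied by the hours it runs. The snippet below is a minimal sketch you could run in any Python cell; the hourly rate is a hypothetical placeholder, so look up the current ml.g5.12xlarge price for your Region:
# Back-of-the-envelope endpoint cost estimate.
# NOTE: hourly_rate_usd is a hypothetical placeholder, not current pricing;
# check the Amazon SageMaker pricing page for your Region.
hourly_rate_usd = 7.09  # placeholder on-demand rate for ml.g5.12xlarge
hours_running = 2  # for example, a short experiment
print(f"Estimated endpoint cost: ${hourly_rate_usd * hours_running:.2f}")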
Prerequisites
For this use case, you’ll need an AWS account and an existing SageMaker domain. You will be using SageMaker Studio, an integrated development environment (IDE) that provides a single web-based visual interface where you can access purpose-built tools to perform all ML development steps, from preparing data to building, training, and deploying your ML models.
In this example, you will use an ml.g5.12xlarge instance type to host the Granite-20B-Code-Instruct-8K model. You may need to increase your ml.g5.12xlarge for endpoint usage quota to 1 if it is currently set to 0. For more information, refer to Requesting a quota increase.
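If you are not sure what your current quota is, you can check it programmatically with the AWS Service Quotas API. The following is a minimal sketch (not part of the original walkthrough) using boto3; it assumes your AWS credentials and default Region are already configured:
import boto3

# Sketch: find the SageMaker endpoint-usage quota for ml.g5.12xlarge.
sq = boto3.client("service-quotas")
paginator = sq.get_paginator("list_service_quotas")
for page in paginator.paginate(ServiceCode="sagemaker"):
    for quota in page["Quotas"]:
        if "ml.g5.12xlarge for endpoint usage" in quota["QuotaName"]:
            print(quota["QuotaName"], "=", quota["Value"])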
Implementation steps
Initial setup
The following instructions show you how to run a notebook in JupyterLab in SageMaker Studio if you have not done this before. For more detail, refer to the documentation.
- Open the Amazon SageMaker console.
- In the navigation pane, under Applications and IDEs choose Studio.
- Choose Create a SageMaker domain on the SageMaker Studio UI (figure 2).
- Select Setup for a single user to create a SageMaker domain and a user profile automatically. This is the quickest way to get started using the default settings.
- Choose Set up (figure 3).
- Once the SageMaker Domain status is set to Ready, choose the User Profiles tab. A user profile will have been automatically created for you. Choose Launch to display the Personal apps drop-down list, then choose Studio.
- In the SageMaker Studio navigation pane, under Applications, choose JupyterLab as the web-based IDE for running the notebook.
- Next, choose Create JupyterLab space.
- In the Create JupyterLab space pop-up dialog, specify a Name such as granite-sagemaker.
- Choose the Sharing option that best fits your requirements, or keep the default of Private.
- Choose Create space.
The JupyterLab space uses a single Amazon Elastic Compute Cloud (EC2) instance for your compute and a single Amazon Elastic Block Store (EBS) volume for your storage.
- Select ml.m5.xlarge from the Instance dropdown list.
- Keep all other default values, including the storage at 5 GB.
- Choose Run space.
Once launched, you should see something similar to the following screenshot.
- From the Create JupyterLab space page, choose Open JupyterLab.
You can then download the following notebook and upload it into your JupyterLab environment by choosing the Upload Files button, or create a new Python 3 notebook.
In the following sections, you will be creating a new notebook and adding code snippets. To add a new cell, choose the + button on the menu bar. To run a cell, either press Ctrl+Enter or choose the Run button on the notebook toolbar.
Deploying Granite Code Models to SageMaker
Start by creating a new Jupyter notebook:
- From your JupyterLab web-based IDE, on the Launcher tab, under Notebook choose Python 3 (ipykernel) as seen in the following image.
- Copy the code below into the first cell of your notebook. It will prepare the environment by installing or upgrading the SageMaker Python SDK using pip:
!pip install --upgrade pip
!pip install -U sagemaker -q
Set up your environment for deploying a Hugging Face model on Amazon SageMaker. This involves importing the required libraries, initializing a SageMaker session, retrieving account information, and specifying the Hugging Face Text Generation Inference (TGI) container image.
- Add a new cell in your notebook by choosing the Plus button in the notebook’s toolbar (figure 6).
- Paste the code below to the new cell:
import json
import sagemaker
from sagemaker.huggingface import HuggingFaceModel
from sagemaker import get_execution_role
from sagemaker.huggingface import get_huggingface_llm_image_uri
sagemaker_session = sagemaker.Session()
account_id = sagemaker_session.account_id()
role = get_execution_role()
region = sagemaker_session.boto_region_name

# Use the latest TGI container image (2.2.0 when this blog was written) for the Granite models
image_uri = get_huggingface_llm_image_uri("huggingface", version="2.2.0")

# print the ECR image URI
print(f"llm image uri: {image_uri}")
The HuggingFaceModel handles downloading the Granite model and dependencies, packaging them into a Docker container, and deploying to a SageMaker inference endpoint.
The deployment process requires specifying a number of environment variables, including:
- HF_MODEL_ID: This corresponds to the model from the HuggingFace Hub that will be deployed.
- SM_NUM_GPUS: This specifies the tensor parallelism degree of the model. Tensor parallelism is used to split the model across multiple GPUs, which is necessary when working with LLMs that are too big for a single GPU.
When setting SM_NUM_GPUS, it should match the number of available GPUs on the selected instance type. In this example, the ml.g5.12xlarge instance type is used, which has 4 available GPUs. Therefore, SM_NUM_GPUS is set to 4.
- Create a new cell and add the following code to create the HuggingFaceModel:
# sagemaker config
instance_type = "ml.g5.12xlarge"
number_of_gpu = 4
health_check_timeout = 600
# Hub model configuration
hub = {
    "HF_MODEL_ID": "ibm-granite/granite-20b-code-instruct-8k",
    "SM_NUM_GPUS": json.dumps(number_of_gpu),
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    image_uri=image_uri,
    env=hub,
    role=role,
)
- Copy the code below to a new cell. This will deploy the HuggingFaceModel to Amazon SageMaker using the deploy method:
# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type=instance_type,
    container_startup_health_check_timeout=health_check_timeout,
)
SageMaker will now create your endpoint and deploy the model to it. This can take 10-15 minutes.
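If you want to check the endpoint’s status yourself (for example, from a separate session while it is being created), the following is a minimal sketch using the standard DescribeEndpoint API through boto3; it assumes the predictor object from the deploy step, though you can also pass the endpoint name as a plain string:
import time
import boto3

# Poll the endpoint until it leaves the "Creating" state.
sm_client = boto3.client("sagemaker")
while True:
    status = sm_client.describe_endpoint(
        EndpointName=predictor.endpoint_name
    )["EndpointStatus"]
    print("Endpoint status:", status)
    if status != "Creating":  # "InService" on success, "Failed" otherwise
        break
    time.sleep(60)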
Code conversion
Now that you have the Granite Code model loaded and deployed to a SageMaker endpoint, you can start generating or converting code.
In this example, you want to convert code from one programming language to another. The prompt below instructs the model to translate a code snippet from C to Java, using a completed linked-list translation as a one-shot example.
Specifically, this example covers common programming constructs like linked lists and file I/O operations. The C code is converted to Java while preserving its functionality and logic. The Java code uses classes, objects, and Java-specific APIs like FileWriter and BufferedReader to achieve the same results as the C code.
- Copy the prompt below and paste it into a new cell.
prompt = """
Question:
Translate the following code from C to Java.
C Code:
```c
#include <stdio.h>
#include <stdlib.h>
typedef struct Node {
    int data;
    struct Node* next;
} Node;

Node* createNode(int data) {
    Node* newNode = (Node*)malloc(sizeof(Node));
    newNode->data = data;
    newNode->next = NULL;
    return newNode;
}

void addNode(Node** head, int data) {
    Node* newNode = createNode(data);
    if (*head == NULL) {
        *head = newNode;
        return;
    }
    Node* temp = *head;
    while (temp->next != NULL) {
        temp = temp->next;
    }
    temp->next = newNode;
}

void printList(Node* head) {
    Node* temp = head;
    while (temp != NULL) {
        printf("%d ", temp->data);
        temp = temp->next;
    }
    printf("\n");
}

int main() {
    Node* head = NULL;
    addNode(&head, 1);
    addNode(&head, 2);
    addNode(&head, 3);
    printList(head);
    return 0;
}
```
Java Code:
```java
class Node {
    int data;
    Node next;

    Node(int data) {
        this.data = data;
        next = null;
    }
}

class LinkedList {
    Node head;

    void addNode(int data) {
        Node newNode = new Node(data);
        if (head == null) {
            head = newNode;
            return;
        }
        Node temp = head;
        while (temp.next != null) {
            temp = temp.next;
        }
        temp.next = newNode;
    }

    void printList() {
        Node temp = head;
        while (temp != null) {
            System.out.print(temp.data + " ");
            temp = temp.next;
        }
        System.out.println();
    }

    public static void main(String[] args) {
        LinkedList list = new LinkedList();
        list.addNode(1);
        list.addNode(2);
        list.addNode(3);
        list.printList();
    }
}
```
<end of code>
Question:
Translate the following code from C to Java.
C Code:
```c
#include <stdio.h>
#include <stdlib.h>
int main() {
    FILE* file = fopen("example.txt", "w");
    if (file == NULL) {
        printf("Error opening file!\n");
        return 1;
    }
    fprintf(file, "This is an example of writing to a file.\n");
    fclose(file);

    file = fopen("example.txt", "r");
    if (file == NULL) {
        printf("Error opening file!\n");
        return 1;
    }

    char buffer[100];
    while (fgets(buffer, sizeof(buffer), file) != NULL) {
        printf("%s", buffer);
    }
    fclose(file);
    return 0;
}
```
Answer:
"""
- Use the predict method of the predictor to run inference against your endpoint. You can experiment with different parameters to influence the generation; these are defined in the parameters attribute of the payload.
# hyperparameters for llm
payload = {
    "inputs": prompt,
    "parameters": {
        "do_sample": True,
        "top_p": 0.6,
        "temperature": 0.1,
        "top_k": 50,
        "max_new_tokens": 1000,
        "repetition_penalty": 1.03,
        "stop": ["<end of code>"],
    },
}
# send request to endpoint
response = predictor.predict(payload)
print(response[0]["generated_text"][len(prompt) :])
The output contains Java code similar to the one seen in the following image.
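The predictor object works within this notebook session. If you later want to call the endpoint from outside it (for example, from an application), you can use the SageMaker runtime API directly. The snippet below is a sketch reusing the payload and prompt from the previous cells; the endpoint name is a placeholder, so substitute the value printed by predictor.endpoint_name:
import json
import boto3

# Sketch: invoke the deployed endpoint without the SageMaker Python SDK.
# "granite-endpoint-name" is a placeholder; use predictor.endpoint_name.
runtime = boto3.client("sagemaker-runtime")
response = runtime.invoke_endpoint(
    EndpointName="granite-endpoint-name",
    ContentType="application/json",
    Body=json.dumps(payload),  # the same payload structure shown above
)
result = json.loads(response["Body"].read())
print(result[0]["generated_text"][len(prompt):])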
Cleanup
Once you’ve completed the code conversion process, you can delete the SageMaker endpoint and intermediate artifacts, such as the SageMaker model, to stop incurring charges. Run the code snippet below in a new cell:
predictor.delete_model()
predictor.delete_endpoint()
You should also delete or stop your running Studio instances, applications, and spaces. Follow the instructions on this page.
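If you prefer to script this cleanup, the sketch below uses the SageMaker API through boto3 to list and delete the JupyterLab app in your space. Run it from outside the space itself (deleting the app stops your notebook); the domain ID, space name, and app name are placeholders to replace with your own values:
import boto3

# Sketch: list and delete Studio apps. IDs and names below are placeholders.
sm = boto3.client("sagemaker")

# List the apps running in your space.
apps = sm.list_apps(DomainIdEquals="d-xxxxxxxxxxxx", SpaceNameEquals="granite-sagemaker")
for app in apps["Apps"]:
    print(app["AppType"], app["AppName"], app["Status"])

# Delete the JupyterLab app ("default" is a placeholder app name).
sm.delete_app(
    DomainId="d-xxxxxxxxxxxx",
    SpaceName="granite-sagemaker",
    AppType="JupyterLab",
    AppName="default",
)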
Summary
The confluence of powerful large language models like IBM’s open-source Granite Code family and cloud platforms like Amazon SageMaker is a game-changer for accelerating legacy application modernization and cloud migration initiatives. Enterprises no longer need to rely solely on costly manual processes; AI can now shoulder much of the burden through automated code analysis, translation, refactoring, and testing. Of course, human oversight and validation remain critical, especially for mission-critical workloads. But by leveraging the latest advances in generative AI for code, enterprises can streamline and de-risk modernization efforts while freeing up developer resources for higher-value activities.
Visit the AWS Marketplace for IBM watsonx and other Data and AI solutions on AWS:
- IBM watsonx.governance as a Service on AWS
- IBM watsonx.data as a Service on AWS
- IBM watsonx.ai
- IBM watsonx Assistant
- IBM watsonx Orchestrate
- IBM Db2 Warehouse as a Service
- IBM Netezza Performance Server as a Service
Additional Content:
- IBM on AWS Partner Page
- IBM Granite Code Models
- Deploy Hugging Face Models on SageMaker
- Watsonx.governance: Monitor AI models with Amazon SageMaker
- Accelerate Data Modernization and AI with IBM Databases on AWS
- Making Data-Driven Decisions with IBM watsonx.data, an Open Data Lakehouse on AWS
- IBM watsonx.data on AWS
- IBM watsonx.ai on AWS
- IBM watsonx.data Lakehouse Integrations with Vector DB
- AI Governance with IBM watsonx.governance and Amazon SageMaker
- SageMaker Python SDK
- Visualize HuggingFace Models