AWS Compute Blog
Visualizing AWS Step Functions workflows from the AWS Batch console
This post is written by Dhiraj Mahapatro, Senior Specialist SA, Serverless.
AWS Step Functions is a low-code visual workflow service used to orchestrate AWS services, automate business processes, and build serverless applications. Step Functions workflows manage failures, retries, parallelization, service integrations, and observability so builders can focus on business logic.
AWS Batch is one of the service integrations available for Step Functions. AWS Batch enables users to more easily and efficiently run hundreds of thousands of batch computing jobs on AWS. AWS Batch dynamically provisions the optimal quantity and type of compute resources based on the volume and specific resource requirements of the batch jobs submitted. AWS Batch plans, schedules, and runs batch computing workloads across the full range of AWS compute services and features, such as AWS Fargate, Amazon EC2, and Spot Instances.
Now, Step Functions is available to AWS Batch users through the AWS Batch console. This feature enables AWS Batch users to augment compute options and have additional orchestration capabilities to manage their batch jobs.
This post walks through the Step Functions integration in the AWS Batch console and shows how AWS Batch users can efficiently use Step Functions workflow orchestration in batch workloads. A sample application also highlights the use of AWS Lambda as a compute option alongside AWS Batch.
Introducing workflow orchestration in AWS Batch console
Today, AWS users use AWS Batch for high performance computing, post-trade analytics, fraud surveillance, screening, DNA sequencing, and more. AWS Batch minimizes human error, increases speed and accuracy, and reduces costs with automation so that users can refocus on evolving the business.
In addition to running compute-intensive tasks, users sometimes need Lambda for simpler, less intense processing. Users also want to combine the two in a single business process that is scalable and repeatable.
Workflow orchestration (powered by Step Functions) in the AWS Batch console allows orchestration of batch jobs with a Step Functions state machine:
Using batch-related patterns from Step Functions
Error handling
Step Functions natively handles errors and retries of its workflows. Users rely on this native error handling mechanism to focus on building business logic.
Workflow orchestration in AWS Batch console provides common batch-related patterns that are present in Step Functions. Handling errors while submitting batch jobs in Step Functions is one of them.
- Choose Get Started from Handle complex errors.
- From the pop-up, choose Start from a template and choose Continue.
A new browser tab opens with Step Functions Workflow Studio. The Workflow Studio designer has a workflow pattern template pre-created. A closer look at the template shows that the workflow submits a batch job and then handles the success and error scenarios by sending the corresponding Amazon SNS notification.
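The pattern described above can be sketched as an Amazon States Language definition built in Python. This is a minimal illustration, not the template that Workflow Studio generates: the ARNs, the job name, and the retry settings are hypothetical placeholders, and the state names are chosen only for readability.

```python
import json

# Hypothetical placeholder ARNs; replace them with values from your own account.
JOB_DEFINITION_ARN = "arn:aws:batch:us-east-1:123456789012:job-definition/SampleJobDefinition:1"
JOB_QUEUE_ARN = "arn:aws:batch:us-east-1:123456789012:job-queue/SampleJobQueue"
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:SampleTopic"


def notify_state(message: str) -> dict:
    """Build a terminal Task state that publishes a message to the SNS topic."""
    return {
        "Type": "Task",
        "Resource": "arn:aws:states:::sns:publish",
        "Parameters": {"TopicArn": TOPIC_ARN, "Message": message},
        "End": True,
    }


definition = {
    "Comment": "Submit a batch job, then notify on success or failure",
    "StartAt": "Submit Batch Job",
    "States": {
        "Submit Batch Job": {
            "Type": "Task",
            # The .sync suffix makes the workflow wait for the job to finish.
            "Resource": "arn:aws:states:::batch:submitJob.sync",
            "Parameters": {
                "JobName": "SampleJob",
                "JobDefinition": JOB_DEFINITION_ARN,
                "JobQueue": JOB_QUEUE_ARN,
            },
            # Assumed retry policy: retry a failed task twice before giving up.
            "Retry": [
                {
                    "ErrorEquals": ["States.TaskFailed"],
                    "IntervalSeconds": 30,
                    "MaxAttempts": 2,
                    "BackoffRate": 2.0,
                }
            ],
            # Any error left after the retries routes to the failure notification.
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "Notify Failure"}],
            "Next": "Notify Success",
        },
        "Notify Success": notify_state("Batch job succeeded"),
        "Notify Failure": notify_state("Batch job failed"),
    },
}

print(json.dumps(definition, indent=2))
```

The printed JSON can be pasted into the code view of Workflow Studio once the placeholder ARNs point at real resources.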
Alternatively, choosing Deploy a sample project from the Get Started pop-up deploys a sample Step Functions workflow.
This option allows creating a state machine from scratch, reviewing the workflow definition, deploying an AWS CloudFormation stack, and running the workflow in Step Functions console.
Once deployed, the state machine is visible in the Step Functions console:
Select the BatchJobNotificationStateMachine to land on the details page:
The CloudFormation template has already provisioned the required batch job in AWS Batch and the SNS topic for success and failure notification.
To see the Step Functions workflow in action, use Start execution. Keep the optional name and input as is and choose Start execution:
The state machine completes the tasks successfully by Submitting Batch Job using AWS Batch and Notifying Success using the SNS topic:
The state machine used the AWS Batch Submit Job task. The Workflow orchestration in AWS Batch console now highlights this newly created Step Functions state machine:
Therefore, any state machine that uses this task in Step Functions for this account is listed here as a state machine that orchestrates batch jobs.
Combine Batch and Lambda
Another pattern to use in Step Functions is the combination of Lambda and batch jobs.
Choose Get Started from Combine Batch and Lambda, then choose Start from a template and Continue. This opens Step Functions Workflow Studio with the following pattern. The Lambda task generates input for the subsequent batch job task. The Submit Batch Job task takes that input and submits the batch job:
Step Functions enables AWS Batch users to combine Batch and Lambda functions to optimize compute spend while using the power of the different compute choices.
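As a rough sketch of this pattern, the following Python builds a two-state definition in which a Lambda task produces the input and a Submit Batch Job task consumes it. The function name, ARNs, and the shape of the Lambda result (an `input` key holding a flat map of strings) are assumptions made for illustration, and the `handler` shown is a hypothetical stand-in for whatever the real function does.

```python
import json

# Hypothetical names and ARNs, used only for illustration.
LAMBDA_FUNCTION_NAME = "GenerateBatchJobInput"
JOB_DEFINITION_ARN = "arn:aws:batch:us-east-1:123456789012:job-definition/SampleJobDefinition:1"
JOB_QUEUE_ARN = "arn:aws:batch:us-east-1:123456789012:job-queue/SampleJobQueue"


def handler(event, context):
    """Hypothetical Lambda handler: turn the execution input into job parameters.

    AWS Batch job parameters are a flat map of strings, so values are cast to str.
    """
    return {"input": {"fileName": str(event.get("fileName", "sample.csv"))}}


definition = {
    "Comment": "Generate input in Lambda, then submit a batch job with it",
    "StartAt": "Generate Batch Job Input",
    "States": {
        "Generate Batch Job Input": {
            "Type": "Task",
            "Resource": "arn:aws:states:::lambda:invoke",
            "Parameters": {"FunctionName": LAMBDA_FUNCTION_NAME, "Payload.$": "$"},
            # Keep only the function's return value as the state output.
            "OutputPath": "$.Payload",
            "Next": "Submit Batch Job",
        },
        "Submit Batch Job": {
            "Type": "Task",
            "Resource": "arn:aws:states:::batch:submitJob.sync",
            "Parameters": {
                "JobName": "SampleJob",
                "JobDefinition": JOB_DEFINITION_ARN,
                "JobQueue": JOB_QUEUE_ARN,
                # Pass the Lambda-generated map through as the job parameters.
                "Parameters.$": "$.input",
            },
            "End": True,
        },
    },
}

print(json.dumps(definition, indent=2))
```

Keeping the lightweight input preparation in Lambda means the batch compute environment only spins up for the heavy processing step.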
Fan out to multiple Batch jobs
In addition to error handling and combining Lambda with AWS Batch jobs, a user can fan out multiple batch jobs using the Step Functions Map state. The Map state provides dynamic parallelism.
With dynamic parallelism, a user can submit multiple batch jobs based on a collection of batch job input data. With visibility to each iteration’s input and output, users can easily navigate and troubleshoot in case of failure.
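A minimal sketch of the fan-out pattern follows, assuming the execution input carries a `jobs` array with one entry per batch job. The ARNs, the `jobs` field, the concurrency limit, and the `build_map_input` helper are hypothetical. The sketch uses the `ItemProcessor` field of the Map state; older definitions use `Iterator` for the same purpose.

```python
import json

# Hypothetical placeholder ARNs.
JOB_DEFINITION_ARN = "arn:aws:batch:us-east-1:123456789012:job-definition/SampleJobDefinition:1"
JOB_QUEUE_ARN = "arn:aws:batch:us-east-1:123456789012:job-queue/SampleJobQueue"


def build_map_input(file_names):
    """Build an execution input with one batch job entry per file (assumed shape)."""
    return {
        "jobs": [
            {"jobName": f"process-file-{index}", "parameters": {"fileName": name}}
            for index, name in enumerate(file_names)
        ]
    }


definition = {
    "Comment": "Fan out one batch job per item in $.jobs",
    "StartAt": "Fan Out Batch Jobs",
    "States": {
        "Fan Out Batch Jobs": {
            "Type": "Map",
            "ItemsPath": "$.jobs",
            # Assumed limit: run at most five iterations at the same time.
            "MaxConcurrency": 5,
            "ItemProcessor": {
                "ProcessorConfig": {"Mode": "INLINE"},
                "StartAt": "Submit Batch Job",
                "States": {
                    "Submit Batch Job": {
                        "Type": "Task",
                        "Resource": "arn:aws:states:::batch:submitJob.sync",
                        "Parameters": {
                            # Each iteration receives one element of $.jobs as its input.
                            "JobName.$": "$.jobName",
                            "JobDefinition": JOB_DEFINITION_ARN,
                            "JobQueue": JOB_QUEUE_ARN,
                            "Parameters.$": "$.parameters",
                        },
                        "End": True,
                    }
                },
            },
            "End": True,
        }
    },
}

execution_input = build_map_input(["a.csv", "b.csv", "c.csv"])
print(json.dumps(execution_input))
print(json.dumps(definition, indent=2))
```

Because each array element becomes its own iteration, a failed job can be traced back to the exact entry in `jobs` that produced it.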
AWS Batch users are not limited to the previous three patterns shown in Workflow orchestration in the AWS Batch console. AWS Batch users can start from scratch and build a Step Functions state machine by navigating to the bottom right and using Create state machine:
Choosing Create state machine in the AWS Batch console opens a new tab with the Step Functions console’s Create state machine page.
Refer to building a state machine with AWS Step Functions Workflow Studio for additional details.
Deploying the application
The sample application shows the fan out to multiple batch jobs pattern. Before deploying the application, you need:
- An AWS account (sign up for an account if you don’t have one).
- The latest version of AWS SAM CLI installed.
- Node.js installed (version 14 minimum).
To deploy:
- From a terminal window, clone the GitHub repo:
git clone git@github.com:aws-samples/serverless-batch-job-workflow.git
- Change directory:
cd ./serverless-batch-job-workflow
- Download and install dependencies:
sam build
- Deploy the application to your AWS account:
sam deploy --guided
To run the application using the AWS CLI, replace the placeholders with the state machine ARN from the deployment output and the Region where the application is deployed:
aws stepfunctions start-execution \
--state-machine-arn <StepFunctionArnHere> \
--region <RegionWhereApplicationDeployed> \
--input "{}"
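For readers who prefer the SDK over the CLI, the same request can be prepared in Python. The helper below is a hypothetical convenience function that only builds the arguments for the StartExecution API; the ARN is a placeholder, and the actual call is left as a comment because it needs boto3 and configured credentials.

```python
import json
from typing import Any, Dict, Optional


def build_start_execution_request(
    state_machine_arn: str,
    workflow_input: Optional[Dict[str, Any]] = None,
    name: Optional[str] = None,
) -> Dict[str, str]:
    """Build keyword arguments for the Step Functions StartExecution API."""
    request = {
        "stateMachineArn": state_machine_arn,
        # StartExecution expects the input as a JSON string, not a dict.
        "input": json.dumps(workflow_input or {}),
    }
    if name is not None:
        # The execution name is optional; Step Functions generates one if omitted.
        request["name"] = name
    return request


# Hypothetical ARN; use the value from the `sam deploy` output instead.
request = build_start_execution_request(
    "arn:aws:states:us-east-1:123456789012:stateMachine:SampleStateMachine"
)
print(request)

# With boto3 installed and credentials configured, the request could be sent as:
#   import boto3
#   boto3.client("stepfunctions").start_execution(**request)
```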
Step Functions is not limited to AWS Batch’s Submit Job API action
In September 2021, Step Functions announced integration support for 200 AWS services to enable easier workflow automation. With this announcement, Step Functions is no longer limited to integrating with AWS Batch’s SubmitJob API and can integrate with any AWS Batch SDK API.
Step Functions can automate the lifecycle of an AWS Batch job, starting from creating a compute environment, creating job queues, registering job definitions, submitting a job, and finally cleaning up.
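The lifecycle above can be outlined as a chain of AWS SDK integration tasks. The sketch below is only a skeleton under stated assumptions: every `Parameters` block is deliberately left empty and must be filled with the request fields each Batch API requires, and a working workflow would likely also need steps that wait for the compute environment to become valid and that disable the job queue and compute environment before deleting them. The `sdk_task` helper and the state names are hypothetical.

```python
import json
from typing import Optional


def sdk_task(action: str, next_state: Optional[str] = None) -> dict:
    """Build a Task state that calls an AWS Batch API through the AWS SDK integration."""
    state = {
        "Type": "Task",
        "Resource": f"arn:aws:states:::aws-sdk:batch:{action}",
        "Parameters": {},  # Placeholder: fill in the API's request fields.
    }
    if next_state is not None:
        state["Next"] = next_state
    else:
        state["End"] = True
    return state


definition = {
    "Comment": "Skeleton of an AWS Batch job lifecycle",
    "StartAt": "Create Compute Environment",
    "States": {
        "Create Compute Environment": sdk_task("createComputeEnvironment", "Create Job Queue"),
        "Create Job Queue": sdk_task("createJobQueue", "Register Job Definition"),
        "Register Job Definition": sdk_task("registerJobDefinition", "Submit Batch Job"),
        "Submit Batch Job": {
            "Type": "Task",
            # The optimized integration with .sync waits for the job to finish,
            # so that cleanup only starts after the job has completed.
            "Resource": "arn:aws:states:::batch:submitJob.sync",
            "Parameters": {},  # Placeholder: JobName, JobDefinition, JobQueue.
            "Next": "Deregister Job Definition",
        },
        "Deregister Job Definition": sdk_task("deregisterJobDefinition", "Delete Job Queue"),
        "Delete Job Queue": sdk_task("deleteJobQueue", "Delete Compute Environment"),
        "Delete Compute Environment": sdk_task("deleteComputeEnvironment"),
    },
}

print(json.dumps(definition, indent=2))
```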
Other AWS service integrations
Step Functions support for 200 AWS services equates to integration with more than 9,000 API actions across these services. AWS Batch tasks in Step Functions can evolve by integrating with available services in the workflow for their pre- and post-processing needs.
For example, batch job input data sanitization can be done inside Lambda and that gets pushed to an Amazon SQS queue or Amazon S3 as an object for auditability purposes.
Similarly, Amazon SNS, Amazon Pinpoint, or Amazon SES can send a notification once the AWS Batch job task is complete.
There are multiple ways to build around an AWS Batch job task. Refer to AWS SDK service integrations and optimized integrations for Step Functions for additional details.
Important considerations
Workflow orchestration in the AWS Batch console only shows Step Functions state machines that use AWS Batch’s Submit Job task. Step Functions state machines do not show in the AWS Batch console when:
- A state machine uses any other AWS SDK Batch API integration task
- AWS Batch’s SubmitJob API is invoked inside a Lambda function task using an AWS SDK client (such as Boto3 or the AWS SDK for JavaScript or Java)
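To make the rule concrete, the following sketch checks whether a state machine definition contains a task that uses the optimized Submit Job integration. It mirrors the rule as described in this post; how the console actually detects such state machines is not documented here, so treat the resource-matching logic and the function name as assumptions.

```python
# Marker for the optimized integration, with or without a suffix such as .sync.
# The AWS SDK integration uses ":states:::aws-sdk:batch:submitJob" and does not match.
SUBMIT_JOB_MARKER = ":states:::batch:submitJob"


def uses_batch_submit_job_task(definition: dict) -> bool:
    """Return True if any Task state, at any nesting level, uses the Submit Job task."""
    for state in definition.get("States", {}).values():
        state_type = state.get("Type")
        if state_type == "Task" and SUBMIT_JOB_MARKER in str(state.get("Resource", "")):
            return True
        if state_type == "Map":
            # Map states nest a workflow under ItemProcessor (or the older Iterator).
            nested = state.get("ItemProcessor") or state.get("Iterator") or {}
            if uses_batch_submit_job_task(nested):
                return True
        if state_type == "Parallel":
            if any(uses_batch_submit_job_task(branch) for branch in state.get("Branches", [])):
                return True
    return False


def single_task(resource: str) -> dict:
    """Build a one-state definition around the given task resource."""
    return {"StartAt": "Task", "States": {"Task": {"Type": "Task", "Resource": resource, "End": True}}}


optimized = single_task("arn:aws:states:::batch:submitJob.sync")
sdk_based = single_task("arn:aws:states:::aws-sdk:batch:submitJob")
lambda_based = single_task("arn:aws:states:::lambda:invoke")

print(uses_batch_submit_job_task(optimized))
print(uses_batch_submit_job_task(sdk_based))
print(uses_batch_submit_job_task(lambda_based))
```

Under these assumptions, only the first definition would qualify; the SDK integration and the Lambda-wrapped call correspond to the two exclusions listed above.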
Cleanup
The sample application provisions AWS Batch resources (the job definition, job queue, and ECS compute environment inside a VPC). It also creates subnets, route tables, and an internet gateway. Clean up the stack after testing the application to avoid the ongoing cost of running these services.
To delete the sample application stack, use the latest version of AWS SAM CLI and run:
sam delete
Conclusion
To learn more about AWS Batch, read the Orchestrating Batch jobs section in the Batch developer guide.
To get started, open the workflow orchestration page in the Batch console. If you are new to Step Functions, select Orchestrate Batch jobs with Step Functions Workflows to deploy a sample project.
This feature is available in all Regions where both Step Functions and AWS Batch are available. View the AWS Regions table for details.
To learn more about Step Functions patterns, visit Serverless Land.