AWS HPC Blog
Using the ParallelCluster 3 Configuration Converter
In September of 2021, we announced the release of AWS ParallelCluster 3, a major release with several changes and a lot of new features. To help get you started migrating your clusters, we provided the Moving from AWS ParallelCluster 2.x to 3.x guide. One of the key changes is that the configuration is now expressed using YAML instead of the INI syntax. Migrating from ParallelCluster version 2 to version 3 will require changing your configuration file to adapt to the new syntax.
To help with this, we’ve created a config converter tool, which is part of the ParallelCluster (>= v3.0.1) command line interface (CLI).
This post provides you with an overview of the tool to get started.
Availability & Usage
This config converter tool is available in the standard executable path once ParallelCluster is installed. It can be invoked using the pcluster3-config-converter command. The tool takes a ParallelCluster 2 configuration file as an input and outputs a ParallelCluster 3 configuration file.
The following command line provides an example of how to use the tool to convert a ParallelCluster version 2 configuration file to a ParallelCluster 3 configuration file:
pcluster3-config-converter \
--config-file <ParallelCluster 2 config file> \
--output-file <ParallelCluster 3 config file>
The tool translates each parameter specification while taking into account the functional differences between ParallelCluster 2 and ParallelCluster 3. It reports these differences in verbose output, classifying each message as informational, a warning, or an error.
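If you want to script the conversion, for example across several configuration files, the command above can be wrapped in a few lines of Python. The following is a minimal sketch, not part of the tool itself: the function names and the file names pcluster2.ini and pcluster3.yaml are assumptions made for illustration, and only the --config-file, --output-file and --cluster-template options described in this post are used.

```python
import shlex
import subprocess
from typing import List, Optional


def build_converter_command(
    v2_config: str,
    v3_output: str,
    cluster_template: Optional[str] = None,
) -> List[str]:
    """Assemble the argument list for pcluster3-config-converter."""
    cmd = [
        "pcluster3-config-converter",
        "--config-file", v2_config,
        "--output-file", v3_output,
    ]
    if cluster_template is not None:
        # Select a specific [cluster <name>] section from the version 2 file.
        cmd += ["--cluster-template", cluster_template]
    return cmd


def run_converter(cmd: List[str]) -> int:
    """Run the converter and return its exit code.

    Requires ParallelCluster >= 3.0.1 to be installed and on the PATH.
    """
    completed = subprocess.run(cmd, check=False)
    return completed.returncode


if __name__ == "__main__":
    # Hypothetical file names; this only prints the command, it does not run it.
    command = build_converter_command("pcluster2.ini", "pcluster3.yaml")
    print(shlex.join(command))
```

Keeping command construction separate from execution makes it easy to inspect or log the exact command before running it, and the tool's own informational, warning and error messages still appear on the terminal when run_converter is called.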
An example
The table below shows two configuration file samples where a ParallelCluster 2 configuration file has been converted to a ParallelCluster 3 configuration file by the configuration converter tool.
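[Table: side-by-side samples of a ParallelCluster version 2 configuration file (left) and the ParallelCluster version 3 configuration file produced from it by the converter (right).]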
ParallelCluster 2 configuration files used pointers to link the various sections together. The main section that defined the cluster components was the [cluster] section. This section contained configuration settings for the head node as well as pointers to the scheduler queue sections, which in turn contained pointers to the compute resources.
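To make the pointer structure concrete, here is a small sketch that follows the chain from a [cluster] section to its queues and on to their compute resources. The sample file contents, section labels and helper names are made up for illustration; the sketch uses Python's standard configparser module and is not how the converter itself is implemented.

```python
import configparser

# Hypothetical version 2 configuration; labels and instance types are examples.
V2_CONFIG = """
[global]
cluster_template = default

[cluster default]
scheduler = slurm
master_instance_type = c5.xlarge
queue_settings = compute, highmem

[queue compute]
compute_resource_settings = c5large

[queue highmem]
compute_resource_settings = r5large

[compute_resource c5large]
instance_type = c5.large

[compute_resource r5large]
instance_type = r5.large
"""


def split_pointer(value):
    """Split a comma-separated pointer value into section labels."""
    return [item.strip() for item in value.split(",") if item.strip()]


def resolve_queues(parser, cluster_name):
    """Follow the [cluster] -> [queue] -> [compute_resource] pointers."""
    cluster = parser[f"cluster {cluster_name}"]
    queues = {}
    for queue_name in split_pointer(cluster.get("queue_settings", fallback="")):
        queue = parser[f"queue {queue_name}"]
        resources = split_pointer(queue.get("compute_resource_settings", fallback=""))
        queues[queue_name] = [
            parser[f"compute_resource {res}"]["instance_type"] for res in resources
        ]
    return queues


parser = configparser.ConfigParser()
parser.read_string(V2_CONFIG)
queues = resolve_queues(parser, "default")
print(queues)
```

Each hop in the chain is a lookup by section label, which is why a version 2 file can only be understood by reading several sections together.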
One cluster at a time
Among that sea of pointers, ParallelCluster 2 allowed you to define multiple clusters inside one config file. In contrast, ParallelCluster 3 has distinct HeadNode and Scheduling sections, which contain only the settings that apply to the head node and to the queues respectively, and each config file describes exactly one cluster.
The queues defined in the Scheduling section also include the compute resource definitions for those queues. When converting from version 2 to version 3, the tool needs to know which [cluster] section you want translated to the new format. You can point the tool at the desired [cluster] section by passing the --cluster-template option with the name of a cluster section. If you don’t use this option, the default behavior of the tool is to look for the cluster_template parameter in the [global] section, or to search for [cluster default].
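The selection rule described above can be sketched in a few lines of Python. This is an illustration only, under the assumption that an explicit --cluster-template value takes precedence over cluster_template in [global], which in turn takes precedence over [cluster default]. The function name and the sample file are hypothetical, and the converter's real implementation may differ.

```python
import configparser
from typing import Optional

# Hypothetical version 2 file that defines two clusters.
V2_CONFIG = """
[global]
cluster_template = hpc

[cluster default]
scheduler = slurm

[cluster hpc]
scheduler = slurm
"""


def pick_cluster_section(
    parser: configparser.ConfigParser,
    cluster_template: Optional[str] = None,
) -> str:
    """Return the name of the [cluster ...] section to convert."""
    if cluster_template is None:
        # No explicit choice: consult cluster_template in [global], if any.
        cluster_template = parser.get("global", "cluster_template", fallback=None)
    if cluster_template is None:
        # Last resort: the conventional [cluster default] section.
        cluster_template = "default"
    section = f"cluster {cluster_template}"
    if not parser.has_section(section):
        raise KeyError(f"section [{section}] not found")
    return section


parser = configparser.ConfigParser()
parser.read_string(V2_CONFIG)
print(pick_cluster_section(parser))             # [global] points at "hpc"
print(pick_cluster_section(parser, "default"))  # explicit choice
```

Because a version 3 file holds only one cluster, a version 2 file with several [cluster] sections has to be converted once per section, each time with a different output file.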
Shared Storage
You’ll notice in our configuration file example that the configuration tool created an additional SharedStorage section in the version 3 configuration file. This section isn’t present in the version 2 configuration file. It accommodates a functional difference between ParallelCluster 2 and 3 regarding default shared EBS volumes.
In ParallelCluster 2, a default EBS volume mounted at /shared was created if no other shared Amazon EBS volumes were specified. ParallelCluster 3 doesn’t define a default shared space, so you’ll need to explicitly define your shared storage configurations when you migrate. If no EBS volume is defined in the version 2 configuration file, the conversion tool assumes that your currently deployed cluster has a default EBS volume. So, to maintain parity with the deployed configuration, it adds a SharedStorage section that defines an EBS volume shared at /shared.
We recommend you review this addition before using the new version 3 configuration file for deploying a cluster.
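The rule behind this addition can be sketched as follows. This is an illustrative approximation rather than the converter's actual code. The function name and the volume name default-ebs are assumptions, and the returned dictionary only mirrors the general shape of a version 3 SharedStorage entry (Name, StorageType, MountDir). Translating explicitly defined [ebs ...] sections is deliberately left out.

```python
import configparser

# Hypothetical version 2 files: one without and one with an EBS pointer.
V2_NO_EBS = """
[cluster default]
scheduler = slurm
master_instance_type = c5.xlarge
"""

V2_WITH_EBS = """
[cluster default]
scheduler = slurm
ebs_settings = data

[ebs data]
shared_dir = /data
"""


def default_shared_storage(config_text, cluster_section="cluster default"):
    """Return the SharedStorage entries to add for the implicit v2 volume.

    Returns an empty list when the cluster section already points at its
    own EBS volumes, because no implicit /shared volume exists in that case.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    ebs_pointer = parser[cluster_section].get("ebs_settings", fallback="")
    if ebs_pointer.strip():
        return []
    # Assumed entry shape; "default-ebs" is a placeholder name.
    return [{"Name": "default-ebs", "StorageType": "Ebs", "MountDir": "/shared"}]


print(default_shared_storage(V2_NO_EBS))
print(default_shared_storage(V2_WITH_EBS))
```

If your jobs never relied on /shared, the generated section is a good candidate for removal during your review.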
Conclusion
Migrating an AWS ParallelCluster definition from version 2 to version 3 is a relatively straightforward process. The conceptual components remain the same, with some changes in the organization of the configuration file. To reduce the burden of manually translating all parts of a version 2 config to version 3, we’ve introduced the config converter tool described here. Using this tool, you’ll be able to accelerate your migration by quickly creating the ParallelCluster version 3 configuration for your existing setup.
While the tool accurately converts to a ParallelCluster version 3 specification, make sure you review the new configuration file before using it to deploy a cluster.
For more information about AWS ParallelCluster 3 configuration specifications, check out the official documentation. You can also find more on the converter tool itself in the official documentation. We also have several videos explaining ParallelCluster in the HPC Tech Shorts channel.