AWS HPC Blog
Deep dive into the AWS ParallelCluster 3 configuration file
In September, we announced the release of AWS ParallelCluster 3, a major release with lots of changes and new features. To help get you started migrating your clusters, we provided the Moving from AWS ParallelCluster 2.x to 3.x guide. We know moving versions can be quite an undertaking, so we’re augmenting that official documentation with additional color and context on a few key areas. In this blog post, we’ll focus on the configuration file format changes in ParallelCluster 3, and how they map back to the equivalent configuration sections in ParallelCluster 2.
The AWS ParallelCluster 3 configuration file
The first major change to discuss is that AWS ParallelCluster 3 restricts a configuration file to defining a single cluster resource. Previously, you could define multiple cluster configurations within the same configuration file, and then pass an option to the command line interface (CLI) to specify which cluster you were operating on. The ParallelCluster 3 CLI instead asks you to provide the configuration file for the cluster resource you want to operate on.
We believe that associating a configuration file with a single cluster (along with some other changes we’ll discuss later) will make each file more readable and maintainable in the long run.
With this in mind, when you migrate a ParallelCluster 2 configuration file that defines multiple clusters to version 3, you’ll need to create an individual configuration file for each cluster. Any resource setting that is referenced from more than one cluster definition will need to be repeated in each destination configuration file.
Introducing the ParallelCluster Configuration Converter
To help you transform your ParallelCluster configuration file from the version 2 to the version 3 specification, we have introduced a configuration converter tool, which is available starting in ParallelCluster 3.0.1. The tool takes a ParallelCluster 2 configuration file as input and outputs a ParallelCluster 3 configuration file. It manages the transformation of the various parameter specifications while accounting for functional differences between ParallelCluster 2 and ParallelCluster 3, and it prints verbose messages to highlight those differences with additional information, warnings, or error messages. There’s more on the tool in the online documentation. We’ll discuss the specifics of the configuration file changes later in this post, but this tool will help you when you are ready to migrate. In line with ParallelCluster 3’s approach of one cluster per config file, the config converter migrates one cluster section at a time, as specified (by you) on the command line using the ‘--cluster-template’ option.
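For example, converting a single cluster section named default from a version 2 configuration file could look like the following (the file paths are placeholders; check the converter documentation for the full set of options):

```bash
# Convert one [cluster] section from a ParallelCluster 2 INI file
# into a ParallelCluster 3 YAML file
pcluster3-config-converter \
    --config-file ~/.parallelcluster/config \
    --cluster-template default \
    --output-file cluster-config.yaml
```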
Syntax changes
The next major thing you’ll notice is that the configuration file now uses YAML instead of INI syntax. We think this improves readability and maintainability by collecting resource types under a tree structure.
To better understand the differences between ParallelCluster version 2 and 3, we will break down the analysis into the following high-level components of a cluster: Head Node, Scheduler and Compute Nodes, Storage, and Networking. Note that while these examples are not exhaustive, they cover the most important options and changes to give you a good sense of what to look for when you migrate your own configuration files.
A note on inclusive language
You’ll also have noticed that we have started using the term “head node” in lieu of “master node”. The language we use and what we choose to name things reflect our core values. For the past couple of years, it’s been a goal of ours to change some problematic language for cluster resources. The scope of what we wanted to accomplish for version 3 presented us with a golden opportunity to finally make changes that break from such traditional non-inclusive naming.
Across the entire product, we no longer refer to a ‘master node’, but instead to a ‘head node’ (and that extends to names for environment variables like MASTER_IP, which is now PCLUSTER_HEAD_NODE_IP).
Configuration file sections
The HeadNode section
The following example lists the configuration options for a cluster head node, contrasting the ParallelCluster 2 and ParallelCluster 3 configuration file formats side by side.
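The snippets below are representative, minimal configurations; the subnet ID, VPC ID, key pair name, CIDR range, and instance types are placeholders.

AWS ParallelCluster version 2:

```ini
# Head node, compute, and scheduler settings are mixed into [cluster];
# networking (including the SSH ingress rule) lives in [vpc]
[cluster default]
key_name = my-keypair
scheduler = slurm
master_instance_type = c5.xlarge
compute_instance_type = c5.xlarge
# "pointer" to the [vpc public] section below
vpc_settings = public

[vpc public]
vpc_id = vpc-0123456789abcdef0
master_subnet_id = subnet-0123456789abcdef0
# SSH ingress rule for the head node
ssh_from = 203.0.113.0/24
```

AWS ParallelCluster version 3:

```yaml
# Everything about the head node sits under a single HeadNode section
HeadNode:
  InstanceType: c5.xlarge
  Networking:
    # the VPC is inferred from the subnet
    SubnetId: subnet-0123456789abcdef0
  Ssh:
    KeyName: my-keypair
    # SSH ingress rule for the head node
    AllowedIps: 203.0.113.0/24
```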
Notice that ParallelCluster 2’s [cluster] section contains configuration settings for the head node, compute nodes, and scheduler within the same section, yet it splits the SSH ingress rule and key pair name across the [vpc] and [cluster] sections, respectively. In contrast, ParallelCluster 3 has a distinct HeadNode section which only contains settings that relate to the head node and does not contain any information about the compute nodes or scheduler. Also note that the ParallelCluster 3 version only asks for the subnet to deploy to, since the VPC can be inferred from that.
Another practice we’re leaving behind is ParallelCluster 2’s use of ad hoc pointers in configuration files. A section that needed to refer to a resource defined in another section of the file had an attribute whose name was prefixed with the type of resource (“vpc” or “queue”) and ended with a “_settings” suffix. The value was a “pointer” to another section in the configuration. In our example, the vpc_settings = public attribute pointed to the [vpc public] section. When the concept is simple, this is a reasonable approach and a common pattern for INI files. But maintenance and understanding became more difficult once there were a lot of sections being referenced, each of which had pointers of their own to other sections. While ParallelCluster itself didn’t lose track of all these pointer references, humans did. This was especially the case for defining scheduler queues, which we’ll talk about in the next section.
There are many more configuration options in the HeadNode section, some of which are similar to ParallelCluster 2 properties. You’ll find more detail in the HeadNode section of the documentation. One new capability not shown in the previous example is the ability to set IAM permissions specific to the head node, separate from the compute nodes.
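For instance, a head-node-only IAM setting might look like the following sketch (the managed policy is just an illustration):

```yaml
HeadNode:
  # ...instance type, networking, and SSH settings as shown earlier...
  # These permissions apply to the head node only, not to the compute fleet
  Iam:
    AdditionalIamPolicies:
      - Policy: arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess
```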
Scheduling and ComputeResources sections
A common pattern for cluster configuration files (and a great way to use the cloud) is to define multiple queues with different underlying compute resources. In ParallelCluster 2, you defined a [cluster] section with pointers to one or more [queue] sections. Each [queue] section had more pointers to [compute_resource] sections, which could overlap with those of other queues! If you made a change to a [compute_resource], you could introduce unwanted changes to another [queue] section.
ParallelCluster 3 configuration files avoid this problem by providing a hierarchy of resources. A Scheduling section contains a set of queues, where each queue contains its own ComputeResource definitions. The following example shows a Slurm cluster that defines multiple queues, again with the version 2 and 3 definitions side by side:
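This sketch gives a feel for the shape of each format; queue names, instance types, counts, and the subnet ID are placeholders.

AWS ParallelCluster version 2:

```ini
# The [cluster] section points to queues, and each queue points to
# [compute_resource] sections that can be shared between queues
[cluster default]
scheduler = slurm
queue_settings = ondemand, spot

[queue ondemand]
compute_resource_settings = cr_large

[queue spot]
compute_type = spot
compute_resource_settings = cr_large, cr_xlarge

[compute_resource cr_large]
instance_type = c5.large
max_count = 10

[compute_resource cr_xlarge]
instance_type = c5.xlarge
max_count = 10
```

AWS ParallelCluster version 3:

```yaml
# Queues and their compute resources are nested under Scheduling,
# so nothing is shared between queues by accident
Scheduling:
  Scheduler: slurm
  SlurmQueues:
    - Name: ondemand
      Networking:
        SubnetIds:
          - subnet-0123456789abcdef0
      ComputeResources:
        - Name: cr-large
          InstanceType: c5.large
          MaxCount: 10
    - Name: spot
      CapacityType: SPOT
      Networking:
        SubnetIds:
          - subnet-0123456789abcdef0
      ComputeResources:
        - Name: cr-large
          InstanceType: c5.large
          MaxCount: 10
        - Name: cr-xlarge
          InstanceType: c5.xlarge
          MaxCount: 10
```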
The new format defines a Scheduling section which allows one or more queues (in this case, SlurmQueues) to be defined within it. Each queue defines its ComputeResources in a self-contained child structure. This may introduce a little redundancy (like repeating the subnet information) if compute resources have the same structure across queues, but it keeps each resource definition consistent within itself, and thus easier to maintain over the long term.
Some notes on network configuration
It’s worth explaining a couple of things about networking in a little more detail.
First, in the previous example you saw that each queue required its Networking/SubnetIds to be specified. This raises the question of whether you can define different subnets for different compute resources. The answer is “no” – in ParallelCluster you still need to maintain the same subnet specification across all compute resources. You can’t provide different subnets for different queues (yet).
Next, there’s some detail to understand about including Elastic Fabric Adapter (EFA). You’ll recall that EFA is a network interface for Amazon EC2 instances designed for applications that need high levels of inter-node communication at scale. In ParallelCluster 3, EFA is enabled within a ComputeResources/Efa subsection for each queue that needs it, by setting ‘Enabled: true’. However, there’s one more step: you also need to specify whether or not you want to use a placement group with EFA. It would be unusual not to use a cluster placement group with EFA, but we didn’t want the configuration syntax to exclude this choice. If you have a specific placement group you want to use, you can specify it, or ParallelCluster will create one for you.
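Here’s a minimal sketch of both formats; the instance type, counts, and subnet ID are placeholders, and the example lets ParallelCluster create the placement group rather than naming an existing one.

AWS ParallelCluster version 2:

```ini
# EFA and the placement group are switched on at the queue level
[queue hpc]
enable_efa = true
placement_group = DYNAMIC
compute_resource_settings = cr_efa

[compute_resource cr_efa]
instance_type = c5n.18xlarge
max_count = 16
```

AWS ParallelCluster version 3:

```yaml
# EFA is enabled per compute resource; the placement group is part of
# the queue's Networking section
Scheduling:
  Scheduler: slurm
  SlurmQueues:
    - Name: hpc
      Networking:
        SubnetIds:
          - subnet-0123456789abcdef0
        PlacementGroup:
          Enabled: true   # or point at an existing group with Id
      ComputeResources:
        - Name: cr-efa
          InstanceType: c5n.18xlarge
          MaxCount: 16
          Efa:
            Enabled: true
```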
Separating the head node from the compute nodes
There are some other sections that relate to compute resources, like specifying local instance storage for nodes, or defining a custom AMI for the compute nodes that is separate from the head node’s AMI. Two notable differences from ParallelCluster 2 are that in ParallelCluster 3 you can define separate IAM permissions and separate custom bootstrap actions for the head node (see Iam, CustomActions) versus the compute nodes (see Iam, CustomActions).
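For instance, the head node and a compute queue can each run their own bootstrap script, along the lines of this sketch (the S3 paths are made up for illustration):

```yaml
HeadNode:
  # ...head node settings as before...
  CustomActions:
    OnNodeConfigured:
      Script: s3://my-bucket/bootstrap/head-node.sh
Scheduling:
  Scheduler: slurm
  SlurmQueues:
    - Name: compute
      # ...networking and compute resources as before...
      CustomActions:
        OnNodeConfigured:
          Script: s3://my-bucket/bootstrap/compute-node.sh
```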
SharedStorage section
In ParallelCluster 3, we moved the storage options for instances (the ephemeral and root volumes) into the sections where the resource is defined (like the HeadNode, or the ComputeSettings for a specific queue).
Shared storage configuration settings are separated from these, under the SharedStorage section of the configuration file. There, you can define up to five Amazon Elastic Block Store (Amazon EBS) volumes, one Amazon Elastic File System (Amazon EFS) file system, and one Amazon FSx for Lustre file system, all of which are shared across the cluster’s nodes.
In contrast to ParallelCluster 2, where a default EBS volume (mounted at /shared) was always created if no other shared volumes were specified, ParallelCluster 3 doesn’t define a default shared volume – you need to explicitly define one. The Configuration Converter tool does explicitly define a shared volume when you use it to migrate your configuration file. If you don’t need it, you’re free to remove this, or alter it, before you launch your new cluster.
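A shared EBS volume mounted at /shared, sketched in both formats (the volume type and size are placeholders):

AWS ParallelCluster version 2:

```ini
# The [cluster] section points to a named [ebs] section
[cluster default]
ebs_settings = myshared

[ebs myshared]
shared_dir = /shared
volume_type = gp2
volume_size = 100
```

AWS ParallelCluster version 3:

```yaml
# Shared volumes are declared under SharedStorage
SharedStorage:
  - MountDir: /shared
    Name: myshared
    StorageType: Ebs
    EbsSettings:
      VolumeType: gp2
      Size: 100
```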
Finally, in ParallelCluster 2 you defined IAM permissions for access to Amazon Simple Storage Service (Amazon S3) buckets that applied to your head nodes and compute nodes. In ParallelCluster 3, you’re free to define separate access rules for each resource type. Refer to the S3Access documentation for the HeadNode and Scheduling sections (for the compute fleets) for more on this.
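As an example, the head node could get read and write access to a bucket while the compute fleet stays read-only, roughly like this (the bucket name is a placeholder):

```yaml
HeadNode:
  # ...
  Iam:
    S3Access:
      - BucketName: my-hpc-data
        EnableWriteAccess: true      # head node can read and write
Scheduling:
  Scheduler: slurm
  SlurmQueues:
    - Name: compute
      # ...
      Iam:
        S3Access:
          - BucketName: my-hpc-data  # read-only by default
```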
Conclusion
In this post, we took a deep dive into parts of the ParallelCluster 3 configuration file, and how it differs from the previous version. We explained the hierarchical arrangement of resources, as well as key parts of the configured resources (head node, queues, compute resources, storage, etc.) and how they all fit together in ParallelCluster 3 configurations.
Migrating a ParallelCluster 2 cluster definition to ParallelCluster 3 can be a relatively straightforward process, since the conceptual components remain the same, with some changes to how those components are organized in the configuration file. The YAML format is simple, and the hierarchical structure makes for a more intuitive organization that’s easier to read and maintain.
To help reduce the burden of manually translating parts of a ParallelCluster 2 configuration to ParallelCluster 3, we’ve developed a tool to transform your configuration files from version 2 to version 3, which is available starting in ParallelCluster 3.0.1. You can find more details about the tool in the online documentation.
To get started with AWS ParallelCluster 3, you can follow one of our step-by-step workshops, or watch an HPC Tech Short.