AWS Spatial Computing Blog
Deploying NVIDIA Omniverse Nucleus on Amazon EC2
Introduction
This post aims to get users up and running with NVIDIA Omniverse Enterprise Nucleus Server on Amazon Elastic Compute Cloud (Amazon EC2). Here I’ll outline the requirements for Enterprise Nucleus Server deployment and dive deep into the technical steps for getting Nucleus running in your Amazon Web Services (AWS) account.
What is Omniverse?
NVIDIA Omniverse is a scalable, multi-GPU, real-time platform for building and operating metaverse applications, based on Pixar’s Universal Scene Description (USD) and NVIDIA RTX technology.
NVIDIA Omniverse Nucleus is the database and collaboration engine of Omniverse. With Omniverse Nucleus, teams can have multiple live users connected using different applications at once. This allows people to use the application they are most comfortable with and opens a lot of doors for rapid iteration. Learn more on the NVIDIA Omniverse Introduction Documentation.
Nucleus operates under a publish-and-subscribe model that enables efficient live synchronization between NVIDIA Omniverse applications. Connected Omniverse clients subscribe to USD scenes, and changes that any client publishes are transmitted to the other subscribers in near real-time.
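To make the publish-and-subscribe idea concrete, here is a minimal in-memory sketch in Python. It is only an illustration of the pattern, not how Nucleus is implemented; the `SceneBroker` class, the scene path, and the shape of the change dictionary are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# A subscriber callback receives the scene path and the change that was published.
Callback = Callable[[str, dict], None]


class SceneBroker:
    """Toy broker: clients subscribe to a scene path and receive published changes."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, scene_path: str, callback: Callback) -> None:
        self._subscribers[scene_path].append(callback)

    def publish(self, scene_path: str, change: dict) -> int:
        """Deliver a change to every subscriber of the scene and return the delivery count."""
        callbacks = list(self._subscribers.get(scene_path, []))
        for callback in callbacks:
            callback(scene_path, change)
        return len(callbacks)


broker = SceneBroker()
received = []

# Two hypothetical client applications watching the same USD scene.
broker.subscribe("/Projects/demo.usd", lambda path, change: received.append(("viewer", change)))
broker.subscribe("/Projects/demo.usd", lambda path, change: received.append(("editor", change)))

delivered = broker.publish("/Projects/demo.usd", {"prim": "/World/Cube", "translate": (0, 1, 0)})
```

Each subscriber sees the same change without the publisher needing to know which applications are connected, which is the property that lets different tools work on one scene at once.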
Other Nucleus features include user and group management, asset access control lists (ACLs) for fine-grained access control, versioning with checkpoints, single sign-on (SSO) with SAML authentication, and TLS encryption support.
Why AWS?
There are multiple reasons to deploy Nucleus on the AWS Global Cloud Infrastructure. With AWS you can connect distributed users all over the globe. Our security, identity, and access management controls allow you to retain complete control over your data. Also, with the variety of compute instance types and storage solutions AWS offers, you can right-size your infrastructure and fine-tune performance as needed.
Solution Overview
The following steps outline a solution that implements the basic components of a Nucleus deployment. To handle communication from end users, an Amazon EC2 instance configured as an NGINX reverse proxy is deployed in a public subnet. The reverse proxy accepts TLS traffic and has a TLS certificate from AWS Certificate Manager (ACM). Typically, this component would be an Elastic Load Balancer (ELB), but the Nucleus Server requires path rewrites in the request, which an ELB does not currently support.
The Enterprise Nucleus Server is an Amazon EC2 instance deployed to a private subnet that only accepts traffic from the reverse proxy subnet. The Enterprise Nucleus Server runs the Nucleus Enterprise Stack, which is deployed as a Docker Compose stack. The Nucleus instance needs a NAT Gateway and an Internet Gateway to communicate with NVIDIA NGC. This procedure uses the basic Nucleus stack with TLS support and without SSO.
Prerequisites
- AWS Command Line Interface (CLI) – Installing or updating the latest version of the AWS CLI
- AWS Cloud Development Kit (CDK) – Install the AWS CDK
- Python 3.9 or greater – Python Downloads
- NVIDIA Enterprise Omniverse Nucleus packages – Enterprise Nucleus Server Quick Start Tips
- Nitro Enclaves Marketplace Subscription – AWS Certificate Manager for Nitro Enclaves
Deploying Omniverse Nucleus on Amazon EC2
Register a domain and create a hosted zone with Amazon Route 53
First, you will need a hosted zone and a domain for the Nucleus Server. Amazon Route 53 (Route 53) allows registration of a domain, such as my-omniverse.com, and creation of a subdomain, such as nucleus.my-omniverse.com, for the Nucleus Server. When registering a domain, communication occurs with the domain registrar. It is best to do this step manually and then reference the Hosted Zone ID, created by Route 53, in the subsequent configuration steps.
See this page for more information on registering a domain and creating a hosted zone: Registering a new domain.
Configure the CDK Stack
Next, you will configure a CDK stack with basic resources for the Nucleus deployment.
Step 1. Open a terminal and create a project folder for your CDK app
The name of the folder will become the application name. For this procedure, nucleus-app is the name used.
Step 2. Change directory into the folder created in Step 1 and initialize your CDK app with the following command:
Now your project structure should be the following:
nucleus-app.ts is the main entry point for the app and the file that subsequent CDK commands will reference. When viewing this file, you can see it imports lib/nucleus-app-stack.ts, which is where you’ll put custom code for your deployment.
Step 3. Run a basic “Hello World” test
Deploy the starter CDK stack with cdk deploy. This will produce a basic stack and confirm your CLI and CDK are properly configured.
Step 4. Set default account and AWS Region environment values
Open bin/nucleus-app.ts and set the default account and Region environment (env) values. The contents of the file should look like the following:
Step 5. Remove sample resources
Open lib/nucleus-app-stack.ts and remove the sample Amazon Simple Queue Service (SQS) and Amazon Simple Notification Service (SNS) resources. Your file should now look like the following:
Step 6. Add the below CDK libraries, as these are required in subsequent steps
Define Stack Resources
Next, you will define the custom infrastructure resources required for the deployment. Code samples in this section need to be added inside the constructor of the NucleusAppStack class.
Step 1. Create an Amazon Simple Storage Service (Amazon S3) bucket for artifacts
First, create a simple Amazon S3 bucket that will be used to transfer artifacts from your local client to the Amazon EC2 instances. As per security best practices, enable encryption, enforce SSL, and block public access. Then create an AWS Identity and Access Management (IAM) policy that allows listing the bucket and getting objects from it. This policy will be attached to the Amazon EC2 instance profile role.
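For illustration, the read-only access described above corresponds to an IAM policy document along these lines. This Python sketch only builds and prints the JSON; in the stack itself the policy is defined with the CDK, and the bucket name `nucleus-artifacts-example` is a hypothetical placeholder.

```python
import json


def artifacts_read_policy(bucket_name: str) -> dict:
    """Build a policy document that allows listing a bucket and reading its objects."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            # ListBucket applies to the bucket itself.
            {"Effect": "Allow", "Action": "s3:ListBucket", "Resource": bucket_arn},
            # GetObject applies to the objects inside the bucket.
            {"Effect": "Allow", "Action": "s3:GetObject", "Resource": f"{bucket_arn}/*"},
        ],
    }


# Hypothetical bucket name, used only to show the resulting document.
print(json.dumps(artifacts_read_policy("nucleus-artifacts-example"), indent=2))
```

Note that the two actions target different resources: the bucket ARN for listing and the `/*` object ARN for reads.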
Step 2. Add an Amazon Virtual Private Cloud (VPC) configuration
Specify a private subnet whose outbound traffic reaches the internet through the NAT gateway. Then provision two security groups, one for the proxy server and one for the Nucleus Server.
Step 3. Add security group ingress rules
Configure the proxy and Nucleus security groups to allow traffic on the required ports. The Nucleus security group only allows traffic from the proxy security group. The proxy security group allows traffic from a specific CIDR range, which you should set to the range you will connect from. For example, take the IP address of the client machine you plan to connect from and append a network mask to form the CIDR range. For this solution, the recommended network mask is /32, which limits access to that single address.
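As a small worked example of forming that CIDR range, the following Python sketch uses the standard `ipaddress` module to turn a single client IP into its single-host range. The address `203.0.113.25` is a documentation placeholder, not a value from this deployment.

```python
import ipaddress


def client_cidr(ip: str) -> str:
    """Return the single-host CIDR range (/32 for an IPv4 address) for a client IP."""
    address = ipaddress.ip_address(ip)  # raises ValueError for malformed input
    network = ipaddress.ip_network(f"{address}/{address.max_prefixlen}")
    return str(network)


# Placeholder client address from the documentation range.
print(client_cidr("203.0.113.25"))  # prints 203.0.113.25/32
```

Validating the address this way catches typos before the value is pasted into the security group rule.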
Step 4. Add a TLS certificate and set the domain registered earlier for validation
Note: The root-domain variable must be set to the domain registered with the Route 53 hosted zone in the Route 53 section above.
Note: Currently there is no additional management of the CNAME record created for validation. When you no longer require it, you’ll have to remove it manually from your Route 53 hosted zone.
Step 5. Add reverse proxy resources
Configure the reverse proxy with Nitro Enclaves enabled. Nitro Enclaves lets you create isolated compute environments to protect and securely process highly sensitive data, which in this case is the TLS certificate. Nitro Enclaves also integrates with AWS Certificate Manager, so Certificate Manager can automatically handle rotation of the certificate. For more information, see AWS Nitro Enclaves User Guide.
Starting from the Certificate Manager for Nitro Enclaves AMI, create a c5.xlarge instance with 32 GB of storage. The c5.xlarge was chosen because it is one of the smallest instance types available for the Nitro Enclaves AMI. Configure a basic instance role with the AmazonSSMManagedInstanceCore policy. This allows you to connect to the instance with AWS Systems Manager (SSM) and avoid opening the instance to SSH traffic over the internet.
Finally, attach a “dummy” IAM policy to the reverse proxy. This is an empty policy which will get updated with the configuration scripts.
Note: If your Region is not in the list of Regions below, find the correct AMI ID by reviewing the AWS Certificate Manager for Nitro Enclaves listing on AWS Marketplace or the AWS documentation, Finding AMI IDs.
Step 6. Add Nucleus Server resources
Next, configure the Nucleus Server. Start with the Ubuntu 20.04 LTS AMI with c5.4xlarge as the instance type. C5 instances are optimized for compute-intensive workloads and deliver high performance at a low price-per-compute ratio. The instance has 16 vCPUs and 32 GB of memory. An Amazon Elastic Block Store (EBS) volume with 512 GB of storage is attached to the instance. These specs were chosen to be sufficiently large for a proof of concept.
The instance user data script is configured to install Docker, Docker Compose, and the AWS CLI.
Step 7. Configure stack outputs
Next, add output values so you can easily reference them later.
Step 8. Deploy the stack
Once this is complete, you will have the basic resources required and next you will configure them.
If you encounter the following CDK deploy error:
Check that you have the correct domain specified and that your hosted zone exists in the Route 53 console under Route 53 Hosted zones.
Step 9. Note the stack output values for use in later steps
Configure the Reverse Proxy Server
Step 1. Associate Enclave certificate with proxy instance IAM role
The first thing you have to do with the reverse proxy is associate your certificate with the IAM role that the Nitro Enclave uses. In the script below, replace tls-certificate-arn, proxy-instance-role-arn, proxy-cert-association-policy-arn, and region with the stack output values from above.
Note: The following script was written in Python 3.9. If you have issues with conflicting Python versions, it’s recommended that you set up a local virtualenv. For more information, see Python Tutorial Virtual Environments and Packages.
This script associates an identity and IAM role with an AWS Certificate Manager (ACM) certificate. This enables the certificate to be used by the ACM for Nitro Enclaves application inside an enclave. For more information, see Certificate Manager for Nitro Enclaves in the AWS Nitro Enclaves User Guide. The script then updates the IAM role policy with permissions to get its own role, download the certificate, and decrypt it.
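As a rough sketch of the flow described above (not necessarily the exact script this post uses), the Python below assumes boto3 is installed and AWS credentials are configured. The function names are hypothetical, the statement layout is one plausible way to grant the three permissions, and the sample response values are fake. The boto3 calls used are `associate_enclave_certificate_iam_role` on the EC2 client and `create_policy_version` on the IAM client.

```python
import json


def build_enclave_policy(association: dict, role_arn: str, region: str) -> dict:
    """Build the role policy from the response of the certificate association call."""
    bucket = association["CertificateS3BucketName"]
    key_id = association["EncryptionKmsKeyId"]
    return {
        "Version": "2012-10-17",
        "Statement": [
            # Download the encrypted certificate from the bucket ACM placed it in.
            {"Effect": "Allow", "Action": ["s3:GetObject"], "Resource": [f"arn:aws:s3:::{bucket}/*"]},
            # Decrypt the certificate with the KMS key returned by the association.
            {"Effect": "Allow", "Action": ["kms:Decrypt"], "Resource": f"arn:aws:kms:{region}:*:key/{key_id}"},
            # Let the role read its own definition.
            {"Effect": "Allow", "Action": "iam:GetRole", "Resource": role_arn},
        ],
    }


def associate_and_update_policy(cert_arn: str, role_arn: str, policy_arn: str, region: str) -> dict:
    """Associate the certificate with the role, then publish the policy as a new default version."""
    import boto3  # third-party dependency, imported lazily so the helper above stays stdlib-only

    ec2 = boto3.client("ec2", region_name=region)
    iam = boto3.client("iam")
    association = ec2.associate_enclave_certificate_iam_role(CertificateArn=cert_arn, RoleArn=role_arn)
    policy = build_enclave_policy(association, role_arn, region)
    iam.create_policy_version(PolicyArn=policy_arn, PolicyDocument=json.dumps(policy), SetAsDefault=True)
    return policy


# Fake association response, used only to show the resulting policy document.
sample_response = {
    "CertificateS3BucketName": "example-bucket",
    "CertificateS3ObjectKey": "example-key",
    "EncryptionKmsKeyId": "1234abcd",
}
print(json.dumps(build_enclave_policy(sample_response, "arn:aws:iam::111122223333:role/example", "us-west-2"), indent=2))
```

Separating the policy construction from the AWS calls makes it possible to inspect the document before anything is changed in the account.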
Save the script to a file and run it from the terminal:
Step 2. Configure Nginx conf
NVIDIA provides a sample Nginx config for the Nucleus deployment. It is packaged within a provided archive file. At the time of writing this, the latest is nucleus-stack-2022.4.0+tag-2022.4.0-rc.1.gitlab.6522377.48333833.tar.gz
Open the archive and look for: ssl/nginx.ingress.router.conf
This file needs to be updated and then placed at /etc/nginx/conf.d/nginx.conf on the reverse proxy instance.
First, you need to update the config with the settings outlined in the AWS Certificate Manager for Nitro Enclaves guide: Nitro Enclaves application: AWS Certificate Manager for Nitro Enclaves.
At the top of the file, in the main context add the following:
After the line, # Configure your SSL options as required by your security practices, add the below snippet:
Next, update the config file with the Nucleus Server private DNS address and the fully qualified domain for your server. Replace instances of my-ssl-nucleus.my-company.com with your domain. Then, replace instances of BASE_STACK_IP_OR_HOST with the nucleusServerPrivateDnsName from the stack outputs above.
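If you would rather script the two replacements than edit by hand, a minimal Python sketch could look like the following. The two placeholder strings come from the sample config; the function name, the sample template lines, the port, and the example host values are hypothetical.

```python
DOMAIN_PLACEHOLDER = "my-ssl-nucleus.my-company.com"
HOST_PLACEHOLDER = "BASE_STACK_IP_OR_HOST"


def render_nginx_conf(template: str, domain: str, nucleus_host: str) -> str:
    """Replace both placeholders, failing loudly if either is missing from the template."""
    for placeholder in (DOMAIN_PLACEHOLDER, HOST_PLACEHOLDER):
        if placeholder not in template:
            raise ValueError(f"placeholder not found in template: {placeholder}")
    return template.replace(DOMAIN_PLACEHOLDER, domain).replace(HOST_PLACEHOLDER, nucleus_host)


# Hypothetical two-line excerpt standing in for the real sample config.
sample = (
    "server_name my-ssl-nucleus.my-company.com;\n"
    "proxy_pass http://BASE_STACK_IP_OR_HOST:8080;\n"
)
rendered = render_nginx_conf(sample, "nucleus.my-omniverse.com", "ip-10-0-1-23.us-west-2.compute.internal")
print(rendered)
```

To apply it to the real file, read ssl/nginx.ingress.router.conf into a string, pass it through `render_nginx_conf`, and write the result to nginx.conf.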
Step 3. Copy the .conf file to Amazon S3
Step 4. Connect to the proxy instance
From your web browser, navigate to the EC2 Dashboard in the AWS Console, select the Nucleus-ReverseProxy instance, and press the Connect button.
Select the Session Manager tab, then press the Connect button.
Step 5. In the terminal, copy the nginx.conf file from Amazon S3 to /etc/nginx/
Step 6. While still in the proxy server terminal, rename the sample ACM for Nitro Enclaves configuration file from /etc/nitro_enclaves/acm.example.yaml to /etc/nitro_enclaves/acm.yaml using the following command:
Step 7. Update the acm.yaml certificate_arn value
Using your preferred text editor, open /etc/nitro_enclaves/acm.yaml. In the ACM section, update certificate_arn with the ARN of the certificate from the stack. This is the tls-certificate-arn from the stack outputs above. Save and close the file.
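The same edit can also be scripted. The Python sketch below rewrites the `certificate_arn` line in a YAML string; the function name is hypothetical, and the sample file contents and ARN are assumptions for illustration, so check them against your actual acm.yaml before relying on it.

```python
import re

# Match the certificate_arn key and keep its indentation; [ \t]* avoids spanning lines.
_CERT_LINE = re.compile(r"^([ \t]*certificate_arn:[ \t]*).*$", re.MULTILINE)


def set_certificate_arn(acm_yaml: str, certificate_arn: str) -> str:
    """Return the YAML text with the single certificate_arn value replaced."""
    updated, count = _CERT_LINE.subn(lambda m: f'{m.group(1)}"{certificate_arn}"', acm_yaml)
    if count != 1:
        raise ValueError(f"expected exactly one certificate_arn entry, found {count}")
    return updated


# Assumed minimal shape of the relevant section, for illustration only.
sample = 'Acm:\n  certificate_arn: ""\n'
result = set_certificate_arn(sample, "arn:aws:acm:us-west-2:111122223333:certificate/example")
print(result)
```

Requiring exactly one match guards against silently editing the wrong line if the file layout differs from what is assumed here.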
Step 8. Start the Nginx server
Step 9. Confirm the server is accepting TLS requests to your domain
You’ll see a generic HTML template as output.
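For a scripted version of this check, the following Python sketch requests the site over HTTPS with certificate verification enabled and reports whether the TLS connection succeeded. The function name and the `opener` parameter are hypothetical; `opener` exists only so the check can be exercised without network access.

```python
import ssl
import urllib.error
import urllib.request


def accepts_tls(domain: str, timeout: float = 10.0, opener=urllib.request.urlopen) -> bool:
    """Return True if https://<domain>/ completes a verified TLS request."""
    context = ssl.create_default_context()  # verifies the certificate chain and hostname
    try:
        with opener(f"https://{domain}/", timeout=timeout, context=context):
            return True
    except urllib.error.HTTPError:
        # An HTTP error status still means the TLS connection itself was established.
        return True
    except OSError:
        # Covers DNS failures, refused connections, timeouts, and certificate errors.
        return False
```

Calling `accepts_tls("nucleus.my-omniverse.com")` with your own domain should return True once Nginx is serving the ACM certificate.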
Configure Nucleus Server
Much of the following comes from NVIDIA’s documentation on deploying a Nucleus Server. Review these docs for more information: Enterprise Nucleus Server Quick Start Tips.
Step 1. From your local computer using the AWS CLI, copy the Nucleus Stack archive to Amazon S3
Step 2. Connect to the Nucleus Server with EC2 Session Manager
With your web browser, navigate to the EC2 Dashboard in the AWS Console, select the Nucleus-Server instance, press the Connect button, and then press the Connect button again on the Session Manager tab.
Step 3. In the Nucleus-Server terminal, change directory to the home directory, and then copy the Nucleus Stack from S3
Step 4. Unpack the archive to an appropriate directory, then cd into that directory
Step 5. Update nucleus-stack.env
With your preferred text editor, review the nucleus-stack.env file. It is recommended that you review this file in its entirety. You will use this file to confirm that you accept the NVIDIA Omniverse end user license agreement.
Then update the following nucleus-stack.env variables as needed
Step 6. Generate secrets required for authentication
Note: The following is required because you are not using SSO integration at this time. See the security notes in nucleus-stack.env for more information.
Step 7. Pull the Nucleus docker images
sudo docker-compose --env-file ${omniverse_root}/base_stack/nucleus-stack.env -f ${omniverse_root}/base_stack/nucleus-stack-ssl.yml pull
Step 8. Start the Nucleus stack
Usage
Back on your local machine, test a connection to your Nucleus Server by pointing your web browser to the domain you specified in the .env file. You should be greeted with the following login dialog:
Here you can use the Master or Service Username and Password configured in nucleus-stack.env, or press Create Account. You’ll then be presented with a navigator view of your Nucleus Server content.
Cleanup
Step 1. Disassociate the Nitro Enclave certificate by running the dissacociate_enclave_cert.py script
Step 2. Delete the stack by running cdk destroy from the nucleus-app application folder.
Conclusion
This post provided the basics to get up and running with NVIDIA Omniverse Nucleus on Amazon EC2 using the Docker Compose stack. It walked through the setup procedures for the Nucleus and reverse proxy servers on Amazon EC2, used Amazon S3 for storage and retrieval of configuration files, and used a Route 53 hosted zone to give your Nucleus Server a domain for secure access to your Omniverse data.
This deployment of Nucleus on Amazon EC2 allows your teams, no matter where they are located, to collaborate and interact in real-time while building 3D products, applications, and experiences.
To learn more about spatial computing at AWS, continue following along here on the Spatial Computing Blog channel.
Additional Reading
This information may also be found on the AWS GitHub repository, NVIDIA Omniverse Nucleus on Amazon EC2.
AWS Services
Amazon EC2, secure and resizable compute capacity for virtually any workload
Amazon Route 53, a reliable and cost-effective way to route end users to Internet applications
Amazon S3, object storage built to retrieve any amount of data from anywhere
Amazon EBS, easy to use, high performance block storage at any scale
AWS Certificate Manager for Nitro Enclaves, public and private TLS certificates with web servers running on Amazon EC2 instances