
Enabling mTLS in AWS App Mesh using SPIFFE/SPIRE in a multi-account Amazon EKS environment

NOTICE: October 04, 2024 – This post no longer reflects the best guidance for configuring a service mesh with Amazon ECS and Amazon EKS, and its examples no longer work as shown. For workloads running on Amazon ECS, please refer to newer content on Amazon ECS Service Connect, and for workloads running on Amazon EKS, please refer to Amazon VPC Lattice.

---

Over the past few years, companies and organizations have been adopting microservice-based architectures to drive their businesses forward with a rapid pace of innovation. Moving to microservices brings several benefits in terms of modularity and deployment speed, but it also adds complexity that requires a stronger security posture. For distributed applications spanning multiple, potentially untrusted networks, it is necessary to implement a zero-trust security policy that treats any source as potentially malicious.

Service mesh solutions like AWS App Mesh can help you manage microservice-based environments by facilitating application-level traffic management, providing consistent observability tooling, and enabling enhanced security configurations. Within the context of the shared responsibility model, you can use AWS App Mesh to fine-tune your security posture based on your specific needs. For example, you may have security and compliance baselines that require you to encrypt all inter-service communications. In this case, AWS App Mesh can help by encrypting all requests between services using Transport Layer Security (TLS) and Mutual TLS authentication (mTLS). Mutual TLS adds an additional layer of security over standard TLS, using asymmetric encryption to verify the identity of both the server and the client. It also ensures that data hasn’t been viewed or modified in transit.

AWS App Mesh uses a popular open-source service proxy called Envoy to provide fully managed, highly available service-to-service communication. Envoy’s Secret Discovery Service (SDS) allows you to bring your own sidecars that can send certificates to Envoy proxies for mTLS authentication. SPIFFE, the Secure Production Identity Framework for Everyone, is a set of open-source standards that software systems can adopt to mutually authenticate in complex environments. SPIRE, the SPIFFE runtime environment, is an open-source toolchain that implements the SPIFFE specification. SPIRE agents use Envoy’s SDS to provide Envoy proxies with the necessary key material for mTLS authentication. The following diagram provides a high-level overview of how mTLS authentication takes place using SPIRE:

diagram of mTLS authentication using SPIRE

  • SPIRE agent nodes and workloads running on these agent nodes are registered to a SPIRE server using the registration API.
  • The SPIRE agent has native support for the Envoy Secret Discovery Service (SDS). SDS is served over the same Unix domain socket as the workload API and Envoy processes connecting to SDS are attested as workloads.
  • Envoy uses SDS to retrieve and maintain updated “secrets” from SDS providers. In the context of TLS authentication, these secrets are the TLS certificates, private keys, and trusted CA certificates.
  • The SPIRE agent can be configured as an SDS provider for Envoy, allowing it to supply Envoy directly with the key material it needs for TLS authentication.
  • The SPIRE agent will also take care of regenerating the short-lived keys and certificates as required.
  • When Envoy connects to the SDS server exposed by the SPIRE agent, the agent attests Envoy and determines which service identities and CA certificates it should make available to Envoy over SDS.
  • As service identities and CA certificates rotate, updates are streamed back to Envoy. Envoy can immediately apply them to new connections without interruption, downtime, or ever having private keys touch the disk.

Overview

Previous blog posts have demonstrated how to use mTLS in a single Amazon Elastic Kubernetes Service (Amazon EKS) cluster and how to leverage AWS App Mesh in a multi-account environment. The purpose of this blog post, however, is to combine these two ideas and demonstrate how to secure communications between microservices running across different Amazon EKS clusters in different AWS accounts. We’ll be using AWS App Mesh, AWS Transit Gateway, and SPIRE integration for mTLS authentication. The following diagram illustrates the multi-account environment that we will build in the following tutorial:
sample diagram of multi-account environment

  • We will use three Amazon EKS clusters, each in its own account and VPC. Network connectivity between the three clusters will be provided by AWS Transit Gateway.
  • A single SPIRE server will be installed into the EKS cluster named eks-cluster-shared and SPIRE agents will be installed into the other two EKS clusters named eks-cluster-frontend and eks-cluster-backend.
  • We will use the AWS Resource Access Manager (AWS RAM) to share the mesh across all three accounts so it will be visible to all EKS clusters.
  • We will create AWS App Mesh components and deploy them using a sample application called Yelb that allows users to vote for their favorite restaurant.
  • Yelb components include:
    • The yelb-ui component, which is responsible for serving web artifacts to the browser, will be deployed in eks-cluster-frontend residing in our frontend account.
    • The yelb-appserver, redis, and postgres database will be deployed in eks-cluster-backend residing in our backend account.

Walkthrough

Prerequisites:

This tutorial assumes that you are using a bash shell. Accordingly, you will need to ensure that the following tools are installed:

  • AWS CLI
  • eksctl utility used for creating and managing Kubernetes clusters on Amazon EKS
  • kubectl utility used for communicating with the Kubernetes cluster API server
  • jq JSON processor
  • Helm CLI used for installing Helm Charts

Configure the AWS CLI:

Three named profiles are used with the AWS CLI throughout this tutorial to target command executions at different accounts. After you have identified the three accounts you will use, ensure that you configure the following named profiles with AdministratorAccess:

  • shared profile – the main account that will host the SPIRE server
  • frontend profile – the account that will host the frontend resources
  • backend profile – the account that will host the backend resources

Since these profiles are referenced in various commands and helper scripts throughout this tutorial, ensure they are named exactly as specified, otherwise certain commands will fail.

Your AWS CLI credentials and configurations should look like the following example snippet:

cat ~/.aws/credentials

[shared]
aws_access_key_id = ...
aws_secret_access_key = ...

[frontend]
aws_access_key_id = ...
aws_secret_access_key = ...

[backend]
aws_access_key_id = ...
aws_secret_access_key = ...

cat ~/.aws/config

[profile shared]
region = us-east-2
output = json

[profile frontend]
region = us-east-2
output = json

[profile backend]
region = us-east-2
output = json

Alternatively, you can configure the AWS CLI to use IAM roles.
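Before moving on, it is worth confirming that each named profile resolves to the intended account. A quick, optional sanity check (not part of the original walkthrough) could look like this, with the three profiles expected to return three distinct account IDs:

# Each profile should resolve to a different AWS account ID
for p in shared frontend backend; do
  printf '%s -> ' "$p"
  aws --profile "$p" sts get-caller-identity --query Account --output text
done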

Clone the GitHub repository:

# Using SSH:
git clone git@github.com:aws/aws-app-mesh-examples.git

# Using HTTPS 
git clone https://github.com/aws/aws-app-mesh-examples.git

cd aws-app-mesh-examples/blogs/eks-multi-account-spire

Deploying the AWS CloudFormation stacks:

Start by deploying the shared services AWS CloudFormation stack in the main account using the shared profile:

aws --profile shared cloudformation deploy \
--template-file cf-templates/shared-services-template.json \
--stack-name eks-cluster-shared-services-stack \
--capabilities CAPABILITY_NAMED_IAM \
--parameter-overrides \
FrontendAccountId=$(aws --profile frontend sts get-caller-identity | jq -r '.Account') \
BackendAccountId=$(aws --profile backend sts get-caller-identity | jq -r '.Account')

aws --profile shared cloudformation wait stack-create-complete \
--stack-name eks-cluster-shared-services-stack

This CloudFormation stack will create the following resources:

  • a new EKS cluster named eks-cluster-shared
  • a managed node group in a new VPC
  • a Transit Gateway named tgw-shared
  • a Transit Gateway attachment associated with managed node group VPC
  • an AWS RAM resource share for the transit gateway named multi-account-tgw-share
  • a node instance role that has permission to assume cross-account roles eks-cluster-frontend-access-role and eks-cluster-backend-access-role

Next, accept the RAM resource share for the frontend and backend accounts:

aws --profile frontend ram accept-resource-share-invitation \
--resource-share-invitation-arn $(aws --profile frontend ram get-resource-share-invitations \
| jq -r '.resourceShareInvitations[] | select(.resourceShareName=="multi-account-tgw-share") | .resourceShareInvitationArn')

aws --profile backend ram accept-resource-share-invitation \
--resource-share-invitation-arn $(aws --profile backend ram get-resource-share-invitations \
| jq -r '.resourceShareInvitations[] | select(.resourceShareName=="multi-account-tgw-share") | .resourceShareInvitationArn')

Note: This step is not necessary if you are using accounts that belong to the same AWS Organization and you have resource sharing enabled. Principals in your organization get access to shared resources without exchanging invitations.
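Whether the share was accepted explicitly or granted automatically through AWS Organizations, you can optionally confirm that it is active from each consuming account with a check along these lines:

aws --profile frontend ram get-resource-shares --resource-owner OTHER-ACCOUNTS \
| jq -r '.resourceShares[] | select(.name=="multi-account-tgw-share") | .status'

aws --profile backend ram get-resource-shares --resource-owner OTHER-ACCOUNTS \
| jq -r '.resourceShares[] | select(.name=="multi-account-tgw-share") | .status'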

Next, deploy the frontend CloudFormation stack in the account you have designated to host your frontend resources using the frontend profile:

aws --profile frontend cloudformation deploy \
--template-file cf-templates/frontend-template.json \
--stack-name eks-cluster-frontend-stack \
--capabilities CAPABILITY_NAMED_IAM \
--parameter-overrides \
TransitGatewayId=$(aws --profile shared ec2 describe-transit-gateways \
| jq -r '.TransitGateways[] | select(.Tags[].Value=="tgw-shared").TransitGatewayId') \
NodeInstanceRoleArn=$(aws --profile shared iam list-roles \
| jq -r '.Roles[] | select(.RoleName | contains("eks-cluster-shared-services-stack-NodeInstanceRole")).Arn')

This CloudFormation stack will create the following resources:

  • a new EKS cluster named eks-cluster-frontend
  • a managed node group in a new VPC
  • a Transit Gateway attachment associated with the managed node group VPC
  • a role named eks-cluster-frontend-access-role with a permissions policy that allows it to be assumed by the node instance role from the main account

Finally, deploy the backend CloudFormation stack in the account you have designated to host your backend resources using the backend profile:

aws --profile backend cloudformation deploy \
--template-file cf-templates/backend-template.json \
--stack-name eks-cluster-backend-stack \
--capabilities CAPABILITY_NAMED_IAM \
--parameter-overrides \
TransitGatewayId=$(aws --profile shared ec2 describe-transit-gateways \
| jq -r '.TransitGateways[] | select(.Tags[].Value=="tgw-shared").TransitGatewayId') \
NodeInstanceRoleArn=$(aws --profile shared iam list-roles \
| jq -r '.Roles[] | select(.RoleName | contains("eks-cluster-shared-services-stack-NodeInstanceRole")).Arn')

This CloudFormation stack will create the following resources:

  • a new EKS cluster named eks-cluster-backend
  • a managed node group in a new VPC
  • a Transit Gateway attachment associated with the managed node group VPC
  • a role named eks-cluster-backend-access-role with a permissions policy that allows it to be assumed by the node instance role from the main account

Update the kubectl contexts:

Now that the three EKS clusters are created, you will need to update your local ~/.kube/config file to allow kubectl to communicate with the different API servers. For this, eksctl provides a utility command that allows you to obtain cluster credentials:

eksctl --profile shared utils write-kubeconfig --cluster=eks-cluster-shared

eksctl --profile frontend utils write-kubeconfig --cluster=eks-cluster-frontend

eksctl --profile backend utils write-kubeconfig --cluster=eks-cluster-backend

By default, this command writes cluster credentials to your local ~/.kube/config file.

For convenience, create a set of environment variables to reference the different cluster contexts:

echo export SHARED_CXT=$(kubectl config view -o json \
| jq -r '.contexts[] | select(.name | contains("eks-cluster-shared")).name') >> ~/.bash_profile
echo export FRONT_CXT=$(kubectl config view -o json \
| jq -r '.contexts[] | select(.name | contains("eks-cluster-frontend")).name') >> ~/.bash_profile

echo export BACK_CXT=$(kubectl config view -o json \
| jq -r '.contexts[] | select(.name | contains("eks-cluster-backend")).name') >> ~/.bash_profile

. ~/.bash_profile

As with the AWS CLI named profiles, these environment variables are also referenced in various commands and helper scripts throughout this tutorial. Ensure that they are named exactly as specified, otherwise certain commands will fail.
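To confirm that the variables are set and that each context can reach its cluster’s API server, you can run a quick optional check like the following:

for ctx in "$SHARED_CXT" "$FRONT_CXT" "$BACK_CXT"; do
  echo "== ${ctx} =="
  kubectl --context "$ctx" get nodes
done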

Modify the aws-auth ConfigMaps:

The aws-auth ConfigMap allows your nodes to join your cluster and is also used to add RBAC access to IAM users and roles. For this tutorial, the SPIRE server hosted in the main EKS cluster eks-cluster-shared requires authorization to get an authentication token for the frontend eks-cluster-frontend and backend eks-cluster-backend EKS clusters in order to verify the identities of the hosted SPIRE agents during node attestation. To accomplish this, the SPIRE server will assume cross-account IAM roles, and these roles should be added to the aws-auth ConfigMap of the frontend and backend EKS clusters.

Execute the following commands to edit the frontend aws-auth ConfigMap:

kubectl config use-context $FRONT_CXT

ACCOUNT_ID=$(aws --profile frontend sts get-caller-identity | jq -r .Account)

ROLE_ARN=arn:aws:iam::${ACCOUNT_ID}:role/eks-cluster-frontend-access-role

eksctl --profile frontend create iamidentitymapping \
--cluster=eks-cluster-frontend \
--arn ${ROLE_ARN} \
--group system:masters \
--username eks-cluster-frontend-access-role

Execute the following commands to edit the backend aws-auth ConfigMap:

kubectl config use-context $BACK_CXT

ACCOUNT_ID=$(aws --profile backend sts get-caller-identity | jq -r .Account)

ROLE_ARN=arn:aws:iam::${ACCOUNT_ID}:role/eks-cluster-backend-access-role

eksctl --profile backend create iamidentitymapping \
--cluster=eks-cluster-backend \
--arn ${ROLE_ARN} \
--group system:masters \
--username eks-cluster-backend-access-role

You can verify the updates by executing the following command:

kubectl describe cm -n kube-system aws-auth
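Alternatively, you can list the identity mappings that eksctl created in each cluster:

eksctl --profile frontend get iamidentitymapping --cluster eks-cluster-frontend

eksctl --profile backend get iamidentitymapping --cluster eks-cluster-backend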

Create the App Mesh service mesh and Cloud Map namespace:

Run the following helper script to:

  • install the appmesh-controller in each EKS cluster
  • create an AWS App Mesh service mesh (am-multi-account-mesh) in the main account
  • share the service mesh with the frontend and backend accounts
  • create an AWS Cloud Map namespace (am-multi-account.local) in the backend account

./helper-scripts/app_mesh_setup.sh

This helper script also creates a yelb namespace in each EKS cluster and applies the following labels to it:

  • mesh=am-multi-account-mesh
  • appmesh.k8s.aws/sidecarInjectorWebhook=enabled

These labels enable automatic injection of the App Mesh sidecar proxy (Envoy) into pods created in the yelb namespace.

The AWS Cloud Map namespace in the backend account is used for service discovery between the yelb-ui virtual node that will be created in the frontend account, and the yelb-appserver virtual service that will be created in the backend account.
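To spot-check the setup (an optional step), verify that the yelb namespace carries the expected labels and that the appmesh-controller pods are running, for example in the frontend cluster:

kubectl --context $FRONT_CXT get namespace yelb --show-labels

kubectl --context $FRONT_CXT get pods -n appmesh-system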

Deploy the SPIRE server:

Earlier, we modified the aws-auth ConfigMap to allow the SPIRE server in the main EKS cluster (eks-cluster-shared) to verify the identities of the SPIRE agents during node attestation. We need to create copies of the kubeconfig files for the frontend eks-cluster-frontend and backend eks-cluster-backend EKS clusters and make them available to the SPIRE server through ConfigMaps mounted as volumes. Executing the following helper script will expedite this process:

./helper-scripts/kubeconfig.sh

This script creates a spire namespace within the main EKS cluster and launches two new ConfigMaps in that namespace (front-kubeconfig and back-kubeconfig) that store copies of the kubeconfig data for the frontend and backend EKS clusters respectively. Since the SPIRE server will be conducting cross-account cluster authentication, the kubeconfig data also specifies the ARN of the corresponding cross-account IAM role (eks-cluster-frontend-access-role and eks-cluster-backend-access-role).
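You can optionally confirm that both ConfigMaps were created in the spire namespace of the main cluster before proceeding:

kubectl --context $SHARED_CXT get configmap front-kubeconfig back-kubeconfig -n spire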

Next, install the SPIRE server using a helm chart:

helm install appmesh-spire-server ./appmesh-spire-server \
--namespace spire \
--set config.trustDomain=am-multi-account-mesh

This creates a StatefulSet in the spire namespace of the main EKS cluster and mounts the previously created ConfigMaps (front-kubeconfig and back-kubeconfig) as volumes. Note that the trust domain is set to the AWS App Mesh service mesh that was created in the main account and shared earlier (am-multi-account-mesh). The SPIRE server container image has been rebuilt to include the AWS CLI so that it can execute the eks get-token command for authentication with the frontend and backend EKS clusters. For more information, view the Dockerfile and visit the Amazon ECR Public Gallery listing.

Inspect the resources in the spire namespace to verify that the SPIRE server is up and running:

kubectl get all -n spire

Verify that the trust domain has been set properly:

kubectl describe configmap/spire-server -n spire | grep trust_domain

Verify that the kubeconfig ConfigMap volumes have been mounted properly:

kubectl exec -it spire-server-0 -n spire -- /bin/sh

cat /etc/kubeconfig/frontend/frontend.conf

cat /etc/kubeconfig/backend/backend.conf

exit

By inspecting the spire-server ConfigMap, you’ll see that the SPIRE server is configured to use the k8s_psat plugin for node attestation. The agent reads and provides the signed projected service account token (PSAT) to the server:

kubectl describe cm spire-server -n spire 

....

NodeAttestor "k8s_psat" {
    plugin_data {
      clusters = {
        "frontend-k8s-cluster" = {
          service_account_allow_list = ["spire:spire-agent-front"]
          kube_config_file = "/etc/kubeconfig/frontend/frontend.conf"
        },
        "backend-k8s-cluster" = {
          service_account_allow_list = ["spire:spire-agent-back"]
          kube_config_file = "/etc/kubeconfig/backend/backend.conf"
        }
      }
    }
  }
....

Before moving on to installing the SPIRE agents, make a copy of the spire-bundle ConfigMap, which contains the certificates necessary for the agents to verify the identity of the server when establishing a connection.

kubectl describe cm spire-bundle -n spire

kubectl get cm spire-bundle -n spire -o yaml > spire-bundle.yaml

Deploy the SPIRE agents:

Create the spire namespace in the frontend EKS cluster and then create a copy of the spire-bundle ConfigMap:

kubectl config use-context $FRONT_CXT

kubectl create ns spire

kubectl apply -f spire-bundle.yaml

kubectl describe cm spire-bundle -n spire

Next, install the spire agent using the provided Helm chart:

helm install appmesh-spire-agent ./appmesh-spire-agent \
--namespace spire \
--set serviceAccount.name=spire-agent-front \
--set config.clusterName=frontend-k8s-cluster \
--set config.trustDomain=am-multi-account-mesh \
--set config.serverAddress=$(kubectl get pod/spire-server-0 -n spire -o json \
--context $SHARED_CXT | jq -r '.status.podIP')

This creates a DaemonSet in the spire namespace of the frontend EKS cluster. It mounts the spire-bundle ConfigMap as a volume to be used for establishing a connection with the SPIRE server. The SPIRE agent is also using the k8s_psat plugin for node attestation. Note that the cluster name (frontend-k8s-cluster) is arbitrary. However, it must match the cluster name specified in the SPIRE server configuration for the k8s_psat plugin, as this same cluster name will be referenced during workload registration. The SPIRE server address is pulled from the pod (spire-server-0) running in the main EKS cluster.

To verify that the SPIRE agent is up and running, inspect the resources in the spire namespace:

kubectl get all -n spire

Repeat the same process for the backend EKS cluster:

kubectl config use-context $BACK_CXT

kubectl create ns spire

kubectl apply -f spire-bundle.yaml

kubectl describe cm spire-bundle -n spire

helm install appmesh-spire-agent ./appmesh-spire-agent \
--namespace spire \
--set serviceAccount.name=spire-agent-back \
--set config.clusterName=backend-k8s-cluster \
--set config.trustDomain=am-multi-account-mesh \
--set config.serverAddress=$(kubectl get pod/spire-server-0 -n spire -o json \
--context $SHARED_CXT | jq -r '.status.podIP')

kubectl get all -n spire
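At this point, both agents should have attested to the SPIRE server. Depending on the SPIRE version bundled in the server image, you can optionally list the attested agents from the main cluster:

kubectl --context $SHARED_CXT exec -n spire spire-server-0 -- /opt/spire/bin/spire-server agent list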

Register nodes and workloads with the SPIRE server:

At this point, you are ready to register node and workload entries with the SPIRE server:

kubectl config use-context $SHARED_CXT

./helper-scripts/register_server_entries.sh

Inspect the registered entries by executing the following command:

kubectl exec -n spire spire-server-0 -- /opt/spire/bin/spire-server entry show

You’ll notice that there are two entries associated with the SPIRE agent DaemonSets running in the frontend and backend EKS clusters:

...

Entry ID : a8335da4-a99b-4f28-acbf-cdc4ac916c66
SPIFFE ID : spiffe://am-multi-account-mesh/ns/spire/sa/spire-agent-back
Parent ID : spiffe://am-multi-account-mesh/spire/server
TTL : 3600
Selector : k8s_psat:agent_ns:spire
Selector : k8s_psat:agent_sa:spire-agent-back
Selector : k8s_psat:cluster:backend-k8s-cluster

Entry ID : 53b4a20c-9174-4f7b-a3c7-153643fb91b3
SPIFFE ID : spiffe://am-multi-account-mesh/ns/spire/sa/spire-agent-front
Parent ID : spiffe://am-multi-account-mesh/spire/server
TTL : 3600
Selector : k8s_psat:agent_ns:spire
Selector : k8s_psat:agent_sa:spire-agent-front
Selector : k8s_psat:cluster:frontend-k8s-cluster

...

The other entries for the frontend and backend workloads (the yelb-ui, yelb-appserver, yelb-db, and redis-server) reference the SPIFFE ID of the corresponding SPIRE agent as their parent ID. The SPIRE server shares the list of registered entries with the SPIRE agents, which then use it to determine which SPIFFE Verifiable Identity Document (SVID) to issue to a particular workload, provided the workload matches the specified combination of namespace, service account, pod label, and container name.

Note: An SVID is not a new type of public key certificate; it is a standard that defines how X.509 certificates are used. For more information, review the SVID specification.
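For reference, workload entries of this kind are created with the spire-server entry create command. The helper script defines the actual SPIFFE IDs and selectors; the following is only an illustrative sketch for the yelb-ui workload (the service account and container names shown are assumptions):

kubectl exec -n spire spire-server-0 -- /opt/spire/bin/spire-server entry create \
-parentID spiffe://am-multi-account-mesh/ns/spire/sa/spire-agent-front \
-spiffeID spiffe://am-multi-account-mesh/frontend \
-selector k8s:ns:yelb \
-selector k8s:sa:yelb-ui \
-selector k8s:container-name:envoy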

Deploy the mesh resources and the Yelb application:

You can now deploy the AWS App Mesh virtual nodes and virtual services into the backend EKS cluster:


kubectl config use-context $BACK_CXT

kubectl apply -f mesh/yelb-redis.yaml 

kubectl apply -f mesh/yelb-db.yaml 

kubectl apply -f mesh/yelb-appserver.yaml

Using the yelb-appserver virtual node as an example, notice that it has a tls section defined for both its inbound listeners and its outbound backends:

...

listeners:
    - portMapping:
        port: 4567
        protocol: http
      tls:
        mode: STRICT
        certificate:
          sds:
            secretName: spiffe://am-multi-account-mesh/yelbapp
        validation:
          trust:
            sds:
              secretName: spiffe://am-multi-account-mesh
          subjectAlternativeNames:
            match:
              exact:
              - spiffe://am-multi-account-mesh/frontend          
...

backends:
    - virtualService:
       virtualServiceRef:
          name: yelb-db
    - virtualService:
       virtualServiceRef:
          name: redis-server
  backendDefaults:
    clientPolicy:
      tls:
        enforce: true
        mode: STRICT
        certificate:
          sds:
            secretName: spiffe://am-multi-account-mesh/yelbapp
        validation:
          trust:
            sds:
              secretName: spiffe://am-multi-account-mesh
          subjectAlternativeNames:
            match:
              exact:
              - spiffe://am-multi-account-mesh/yelbdb
              - spiffe://am-multi-account-mesh/redis
...

The certificate section under tls specifies the Envoy Secret Discovery Service (SDS) secret name. In this case, it is the SPIFFE ID that was assigned to the workload. The validation section includes the SPIFFE ID of the trust domain, which is the AWS App Mesh service mesh created earlier (am-multi-account-mesh), and a list of SPIFFE IDs associated with trusted services that are used as subject alternative name (SAN) matchers for verifying presented certificates.

Now, deploy the backend Kubernetes resources that the virtual nodes point to:

kubectl apply -f yelb/resources_backend.yaml

Before deploying the frontend virtual node and virtual service for the yelb-ui service, run the following helper script. It will retrieve the ARN of the yelb-appserver virtual service from the backend EKS cluster and create an updated version of the yelb-ui virtual node spec file (yelb-ui-final.yaml) containing that ARN as a reference.

kubectl config use-context $FRONT_CXT

./helper-scripts/replace_vs_arn.sh

Deploy the AWS App Mesh components and the Kubernetes resources for the yelb-ui frontend:

kubectl apply -f mesh/yelb-ui-final.yaml

kubectl apply -f yelb/resources_frontend.yaml

To test out the yelb-ui service, retrieve the load balancer DNS name and navigate to it in your browser:

kubectl get service yelb-ui -n yelb -o json \
| jq -r '.status.loadBalancer.ingress[].hostname'
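Optionally, you can confirm from the command line that the UI responds before opening it in a browser (a simple check; the exact HTTP status code may vary depending on the UI’s routing):

YELB_URL=$(kubectl get service yelb-ui -n yelb -o json \
| jq -r '.status.loadBalancer.ingress[].hostname')

curl -s -o /dev/null -w "HTTP %{http_code}\n" http://${YELB_URL}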

You should see the following page load and be able to vote on the different restaurant options, verifying that the yelb-ui service is communicating with the yelb-appserver:

Successful Yelb page load

Verify the mTLS authentication:

After executing a few voting transactions via the yelb-ui, you can move on to validating the mTLS authentication that takes place between each of the Envoy proxies for the underlying services. For this, we will query the administrative interface that Envoy exposes.

  1. Start by switching to the frontend context and setting an environment variable to hold the name of the yelb_ui pod:
kubectl config use-context $FRONT_CXT

FRONT_POD=$(kubectl get pod -l "app=yelb-ui" -n yelb \
--output=jsonpath={.items..metadata.name})

echo $FRONT_POD
  2. Check that the Secret Discovery Service (SDS) is active and healthy:
kubectl exec -it $FRONT_POD -n yelb -c envoy \
-- curl http://localhost:9901/clusters | grep -E \
'(static_cluster_sds.*cx_active|static_cluster_sds.*healthy)'

You should see one active connection, indicating that the SPIRE agent is correctly configured as an SDS provider for the Envoy proxy, along with a healthy status.

static_cluster_sds_unix_socket::/run/spire/sockets/agent.sock::cx_active::1
static_cluster_sds_unix_socket::/run/spire/sockets/agent.sock::health_flags::healthy
  3. Next, check the loaded TLS certificate:
kubectl exec -it $FRONT_POD -n yelb -c envoy \
-- curl http://localhost:9901/certs

This certificate is the X509-SVID issued to the yelb-ui service. You should see two SPIFFE IDs: that of the trust domain in the ca_cert section, and that of the yelb-ui service in the cert_chain section.

Note: App Mesh doesn’t store the certificates or private keys that are used for mutual TLS authentication. Instead, Envoy stores them in memory.

  4. Check the SSL handshakes:
kubectl exec -it $FRONT_POD -n yelb -c envoy \
-- curl http://localhost:9901/stats | grep ssl.handshake

An SSL handshake is executed between the yelb-ui and the yelb-appserver (via the Envoy proxies) for every API request triggered. For example, when the Yelb webpage is loaded, two GET requests (/getvotes and /getstats) trigger the execution of two corresponding SSL handshakes.

You can repeat the same process using the backend context to examine mTLS authentication for the other services. For example, you can check the SSL handshakes for the yelb-appserver:

kubectl config use-context $BACK_CXT

BE_POD_APP=$(kubectl get pod -l "app=yelb-appserver" -n yelb \
--output=jsonpath={.items..metadata.name})

kubectl exec -it $BE_POD_APP -n yelb -c envoy \
-- curl http://localhost:9901/stats | grep ssl.handshake

In this case, you’ll notice that additional SSL handshakes are executed with the yelb-db and redis-server services.

cluster.cds_egress_am-multi-account-mesh_redis-server_yelb_tcp_6379.ssl.handshake: 3
cluster.cds_egress_am-multi-account-mesh_yelb-db_yelb_tcp_5432.ssl.handshake: 58
listener.0.0.0.0_15000.ssl.handshake: 13

Cleaning up:

Run the following helper script to delete:

  • all resources in the yelb and spire namespaces
  • the Cloud Map am-multi-account.local namespace
  • the App Mesh am-multi-account-mesh service mesh
  • the appmesh-controller
  • the appmesh-system namespace

./helper-scripts/cleanup.sh

Delete the CloudFormation stacks, starting with the frontend account:

aws --profile frontend cloudformation delete-stack \
--stack-name eks-cluster-frontend-stack

aws --profile frontend cloudformation wait stack-delete-complete \
--stack-name eks-cluster-frontend-stack

Delete the backend CloudFormation stack:

aws --profile backend cloudformation delete-stack \
--stack-name eks-cluster-backend-stack

aws --profile backend cloudformation wait stack-delete-complete \
--stack-name eks-cluster-backend-stack

Finally, delete the shared services CloudFormation stack:

aws --profile shared cloudformation delete-stack \
--stack-name eks-cluster-shared-services-stack

Conclusion

In this post, we created three Amazon EKS clusters, each in its own AWS account and VPC. We then established a network connection between the VPCs using AWS Transit Gateway. We installed a SPIRE server in one EKS cluster, a SPIRE agent and a frontend service in a second cluster, and another SPIRE agent and backend resources in a third cluster. We used AWS App Mesh to create a service mesh spanning across these three EKS clusters to facilitate service-to-service communication. We then established mutual TLS authentication between the services using Envoy’s Secret Discovery Service (SDS) as implemented by the SPIFFE Runtime Environment (SPIRE).

With this approach, customers that rely on multi-account, multi-cluster strategies can take advantage of the integration between AWS App Mesh and SPIRE to enable mTLS authentication across their segmented environments, moving them a step forward on the path to a zero-trust architecture.

To learn more, we recommend you review these additional resources: