AWS Compute Blog
Scanning Docker Images for Vulnerabilities using Clair, Amazon ECS, ECR, and AWS CodePipeline
Post by Vikrama Adethyaa, Solution Architect and Tiffany Jernigan, Developer Advocate
Update – July 26, 2021 – While this post remains accurate, we want to make it clear that we did announce built in image scanning in Amazon ECR in October 2019. The built in integration provides a simpler solution than what is in this post.
Containers are an increasingly important way for you to package and deploy your applications. They are lightweight and provide a consistent, portable software environment for applications to easily run and scale anywhere.
A container is launched from a container image, an executable package that includes everything needed to run an application: the application code, configuration files, runtime (for example, Java, Python, etc.), libraries, and environment variables.
A container image is built up from a series of layers. For a Docker image, each layer in the image represents an instruction in the image’s Dockerfile. A parent image is the image on which your image is built. It refers to the contents of the FROM directive in the Dockerfile. Most Dockerfiles start from a parent image, and often the parent image was downloaded from a public registry.
It is incredibly difficult and time-consuming to manually track all the files, packages, libraries, and so on, included in an image along with the vulnerabilities that they may possess. Having a security breach is one of the costliest things an organization can endure. It takes years to build up a reputation and only seconds to tear it down.
One way to prevent breaches is to regularly scan your images and compare the dependencies to a known list of common vulnerabilities and exposures (CVEs). Public CVE lists contain an identification number, description, and at least one public reference for known cybersecurity vulnerabilities. The automatic detection of vulnerabilities helps increase awareness and best security practices across developer and operations teams. It encourages action to patch and address the vulnerabilities.
This post walks you through the process of setting up an automated vulnerability scanning pipeline. You use AWS CodePipeline to scan your container images for known security vulnerabilities and deploy the container only if the vulnerabilities are within the defined threshold.
This solution uses CoresOS Clair for static analysis of vulnerabilities in container images. Clair is an API-driven analysis engine that inspects containers layer-by-layer for known security flaws. Clair scans each container layer and provides a notification of vulnerabilities that may be a threat, based on the CVE database and similar data feeds from Red Hat, Ubuntu, and Debian.
Deploying Clair
Here’s how to install Clair on AWS. The following diagram shows the high-level architecture of Clair.
Clair uses PostgreSQL, so use Aurora PostgreSQL to host the Clair database. You deploy Clair as an ECS service with the Fargate launch type behind an Application Load Balancer. The Clair container is deployed in a private subnet behind the Application Load Balancer that is hosted in the public subnets. The private subnets must have a route to the internet using the NAT gateway, as Clair fetches the latest vulnerability information from multiple online sources.
Prerequisites
Ensure that the following are installed or configured on your workstation before you deploy Clair:
- Docker
- Git
- AWS CLI installed
- AWS CLI is configured with your access key ID and secret access key, and the default region as us-east-1
Download the AWS CloudFormation template for deploying Clair
To help you quickly deploy Clair on AWS and set up CodePipeline with automatic vulnerability detection, use AWS CloudFormation templates that can be downloaded from the aws-codepipeline-docker-vulnerability-scan GitHub repository. The repository also includes a simple, containerized NGINX website for testing your pipeline.
VPC requirements
We recommend a VPC with the following specification for deploying CoreOS Clair:
- Two public subnets
- Two private subnets
- NAT gateways to allow internet access for services in private subnets
You can create such a VPC using the AWS CloudFormation template networking-template.yaml that is included in the sample code you cloned from GitHub.
Build the Clair Docker image
First, create an Amazon Elastic Container Registry (Amazon ECR) repository to host your Clair Docker image. Then, build the Clair Docker image on your workstation and push it to the ECR repository that you created.
Deploy Clair using AWS CloudFormation
Now that the Clair Docker image has been built and pushed to ECR, deploy Clair as an ECS service with the Fargate launch type. The following AWS CloudFormation stack creates an ECS cluster named clair-demo-cluster
and deploys the Clair service.
Deploying the sample website
Deploy a simple static website running on NGINX as a container. An AWS CloudFormation template is included in the sample code that you cloned from GitHub.
Create a CodeCommit repository for the NGINX website
You create an AWS CodeCommit repository to host the sample NGINX website code. This repository is the source of the pipeline that you create later. Before you proceed with the following steps, ensure SSH authentication to CodeCommit.
Build the NGINX Docker image
Create an ECR repository to host your NGINX website Docker image. Build the image on your workstation using the file Dockerfile-amznlinux, where Amazon Linux is the parent image. After the image is built, push it to the ECR repository that you created.
Deploy the NGINX website using AWS CloudFormation
Now deploy the NGINX website. The following stack deploys the NGINX website onto the same ECS cluster (clair-demo-cluster) as Clair.
Note the AWS CloudFormation stack outputs. The stack output contains the Application Load Balancer URL for the NGINX website and the ECS service name of the NGINX website. You need the ECS service name for the pipeline.
Building the pipeline
In this section, you build a pipeline to automate vulnerability scanning for the nginx-website Docker image builds. Every time that a code change is made, the Docker image is rebuilt and scanned for vulnerabilities. Only if vulnerabilities are within the defined threshold is the container is deployed onto ECS. For more information, see Tutorial: Continuous Deployment with AWS CodePipeline.
The sample code includes an AWS CloudFormation template to create the pipeline. The buildspec.yml file is used by AWS CodeBuild to build the nginx-website Docker image and scan the image using Clair.
CodeBuild build spec
A build spec is a collection of build commands and related settings, in YAML format, that AWS CodeBuild uses to run a build. You can include a build spec in the root directory of your application source code, or you can define a build spec when you create a build project.
In this sample app, you include the build spec in the root directory of your sample application source code. The buildspec.yml file is located in the /aws-codepipeline-docker-vulnerability-scan/nginx-website folder.
Use Klar, a simple tool to analyze images stored in a private or public Docker registry for security vulnerabilities using Clair. Klar serves as a client which coordinates the image checks between ECR and Clair.
In the buildspec.yml file, you set the variable CLAIR_OUTPUT=Critical
. CLAIR_OUTPUT
defines the severity level threshold. Vulnerabilities with severity levels higher than or equal to this threshold are outputted. The supported levels are:
- Unknown
- Negligible
- Low
- Medium
- High
- Critical
- Defcon1
You can configure Klar to your requirements by setting the variables as defined in https://github.com/optiopay/klar.
The build spec does the following:
Pre-build stage:
- Log in to ECR.
- Download the Clair client Klar.
Build stage:
- Build the Docker image and tag it as latest and with the Git commit ID.
Post-build stage:
- Push the image to your ECR repository with both tags.
- Trigger Klar to scan the image that you pushed to ECR for security vulnerabilities using Clair.
- Write a file called imagedefinitions.json in the build root that has your Amazon ECS service’s container name and the image and tag. The deployment stage of your CD pipeline uses this information to create a new revision of your service’s task definition. It then updates the service to use the new task definition. The imagedefinitions.json file is required for the AWS CodeDeploy ECS job worker.
Deploy the pipeline
Deploy the pipeline using the AWS CloudFormation template provided with the sample code. The following template creates the CodeBuild project, CodePipeline pipeline, Amazon CloudWatch Events rule, and necessary IAM permissions.
The pipeline is triggered after the AWS CloudFormation stack creation is complete. You can log in to the AWS Management Console to monitor the status of the pipeline. The vulnerability scan information is available in CloudWatch Logs.
You can also modify the CLAIR_OUTPUT
value from Critical
to High
in the buildspec.yml file in the /cores-clair-ecs-cicd/nginx-website-repo folder and then check the status of the build.
Summary
I’ve described how to deploy Clair on AWS and set up a release pipeline for the automated vulnerability scanning of container images. The Clair instance can be used as a centralized Docker image vulnerability scanner and used by other CodeBuild projects. To meet your organization’s security requirements, define your vulnerability threshold in Klar by setting the variables, as defined in https://github.com/optiopay/klar.
- CoreOS Clair GitHub repository – https://github.com/coreos/clair
- Klar GitHub repository – https://github.com/optiopay/klar