Container Vulnerability Scans with Azure Pipelines

Overview

There is no doubt that containers are playing a huge part in migrating applications to public clouds across all industries and organisation sizes. According to Research And Markets, the global application container market is forecast to grow at a CAGR of 29% from 2021 to 2026, with the most preferred public cloud container hosting services being AWS, Microsoft Azure and Google.

Container technologies encapsulate applications and their dependencies inside easily manageable units that can be orchestrated across different cloud environments in a homogeneous manner through tools such as Kubernetes and Spinnaker. This is important for public cloud adoption in enterprises, especially those in highly regulated industries, which usually need to adopt a multi-cloud strategy. Containers help these enterprises level out the environmental differences between cloud platforms.

However, the ease of sharing and using container images lets vulnerabilities spread like never before. One strategy hackers have used for years is to share Docker images containing malicious code in public repositories such as Docker Hub. The malicious code could leave unsecured containers, or even the whole Kubernetes cluster, compromised. To defend against these attacks, static vulnerability analysis of container images often acts as the first line of defence.

There are typically two models of vulnerability scanning for container images: the centralised model and the standalone model.

The centralised model is usually adopted by enterprises that have a large cloud or application modernisation programme. For example, this could be a shared service deployed by a Cloud Center of Excellence to enforce common security standards, tooling and automation across the hundreds or even thousands of containers deployed on their secured public cloud platforms.

 
An example of an image vulnerability scanning service that Airwalk helps to deliver for a banking client

The standalone model is more suitable for smaller teams whose deployment scale, resources and time constraints do not justify adopting a centralised scanning service. This model is usually integrated into the CI/CD pipelines that deploy container images (yes, you should ALWAYS build production images with a pipeline instead of manually with the Docker CLI). The scan is performed after the image is built and before it is pushed to a repository for deployment.

My recent client is deploying one of their applications to Azure Kubernetes Service, but the cybersecurity team asked to review the vulnerability reports for the application containers before the service could go live. Given the urgency, standalone image scans were the better fit for this use case. The scanning mechanisms, first with Clair and later with Trivy, were added to the existing image build and deployment pipelines, which are implemented with Azure Pipelines.

The first iteration of the image scans was implemented with CoreOS Clair (CoreOS was later acquired by Red Hat). Clair is an open source scanner with an API-driven analysis engine that performs image scans.

Clair was not originally designed to be used within CI/CD pipelines, so it needs several workarounds to make it work within Azure Pipelines. First of all, Clair offers only APIs, with no native CLI client, which makes scripting within pipelines difficult. We also want the scanning tools to run only during pipeline executions, which saves the resources required to maintain a scanning platform. The issue with this arrangement is that starting Clair from scratch within a pipeline takes about 20 to 30 minutes because the database needs to be populated with CVEs.

Fortunately, arminc solves both problems by offering a Clair CLI client named clair-scanner and a "local" version of Clair named clair-local-scan. The bread and butter of the solution is the pre-populated Clair database, which is updated and published as a Docker image on a daily basis. The pre-populated database image can be run within pipelines and lets Clair start without waiting for the CVEs to be pulled. The following code snippets provide sample Azure Pipelines tasks to perform the Clair scan.

variables:
  image_name: openjdk
  image_tag: 17-jdk-slim
  clair_db: arminc/clair-db:latest
  clair_scan: arminc/clair-local-scan:v2.1.7_5125fde67edee46cb058a3feee7164af9645e07d

jobs:
- job: ClairScanContainerImage
  displayName: Scan container image by ClairV2

  steps:
  - task: Docker@2
    displayName: 'Build an image'
    inputs:
      command: build
      repository: $(image_name)
      tags: $(image_tag)
      dockerFile: '**/Dockerfile'

  - script: |
      mkdir report
      docker run -d --name clair-db $(clair_db)
      docker run -p 6060:6060 --link clair-db:postgres -d --name clair $(clair_scan)
      MY_IP=$(ifconfig eth0 | grep -Eo 'inet (addr:)?([0-9]*\.){3}[0-9]*' | grep -Eo '([0-9]*\.){3}[0-9]*' | grep -v '127.0.0.1')
      sleep 10
      CLAIR_IP=$(docker inspect -f '{{range.NetworkSettings.Networks}}{{.IPAddress}}{{end}}' clair)
      clair-scanner --ip $MY_IP --clair http://$CLAIR_IP:6060 -t High --reportAll=false $(image_name):$(image_tag) | tee ./report/clair-image-scan-report.txt
      docker stop clair
      docker rm clair
      docker stop clair-db
      docker rm clair-db
      docker image rm $(clair_db)
    displayName: "Image scan by Clair"
    continueOnError: true

For demo purposes, the Dockerfile simply pulls the openjdk:17-jdk-slim image from Docker Hub without doing anything else, so the above is equivalent to scanning the openjdk:17-jdk-slim image with Clair. The Azure Pipeline runs both the pre-populated Clair DB and the Clair server containers on the same host that runs the self-hosted agent. The clair-scanner CLI is already built into the self-hosted agent image, so it can be used directly without additional installation at runtime.
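The fixed sleep 10 in the script above is a guess at how long the Clair container needs before it accepts requests. A possible alternative, sketched below, is to poll until a readiness check succeeds. The helper name retry_until is hypothetical, and the endpoint in the usage comment is an assumption about what the Clair V2 API exposes, not something taken from the original pipeline.

```shell
# retry_until MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND repeatedly until it succeeds or MAX_ATTEMPTS is reached.
# Returns 0 on the first success, 1 if every attempt failed.
retry_until() {
  _max=$1
  _delay=$2
  _attempt=1
  shift 2
  until "$@"; do
    if [ "$_attempt" -ge "$_max" ]; then
      echo "gave up after $_attempt attempts: $*" >&2
      return 1
    fi
    _attempt=$((_attempt + 1))
    sleep "$_delay"
  done
}

# Hypothetical usage in place of `sleep 10` (the URL path is an assumption):
#   retry_until 30 2 curl -fsS -o /dev/null "http://$CLAIR_IP:6060/v1/namespaces"
```

Polling keeps the step as short as the actual start-up time and fails loudly if Clair never comes up, instead of letting clair-scanner run against a server that is not ready.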

 
Clair Scan Vulnerability Report within Azure Pipeline

The solution works, but not without issues. The daily scheduled job that publishes the pre-populated Clair DB image fails from time to time, so images could be scanned without the latest CVEs. Most importantly, the solution is based on Clair V2, which has already been deprecated by Red Hat and is only maintained by the GitHub community on a best-effort basis. The newest version of Clair also performs poorly at identifying vulnerabilities and their severities for Debian-based images. It is clear that alternative tooling is required as a long-term solution.

Trivy is another open source vulnerability scanner, developed by Teppei Fukuda and recently acquired by Aqua Security. In contrast to Clair, Trivy has a standalone mode designed to integrate with any DevSecOps pipeline. In standalone mode, Trivy acts like a simple CLI that automatically fetches the required CVE database on the fly during the scans.

 
Trivy in Standalone Mode

Since Trivy behaves like a command-line tool, the Trivy scan task within Azure Pipelines is much more straightforward and needs none of the Docker and bash scripting manoeuvres required by Clair.

variables:
  image_name: openjdk
  image_tag: 17-jdk-slim

jobs:

- job: TrivyScanContainerImage
  displayName: Scan container image by Trivy
  steps:
  - task: Docker@2
    displayName: 'Build an image'
    inputs:
      command: build
      repository: $(image_name)
      tags: $(image_tag)
      dockerFile: '**/Dockerfile'

  - script: |
      mkdir report
      trivy image -s HIGH,CRITICAL $(image_name):$(image_tag) | tee ./report/trivy-image-scan-report.txt
    displayName: "Image scan by Trivy"
    continueOnError: true

Trivy solves the two problems associated with Clair in the solution mentioned above. It downloads its trivy-db from GitHub at execution time, so images are always scanned with the latest CVEs. It is also actively developed and maintained by Aqua Security, and the vendor plans to eventually retire its existing product, Microscanner, in favour of Trivy.

 
Trivy Scan Vulnerability Report within Azure Pipeline

Trivy also has a useful --exit-code 1 flag that forces the command to return exit code 1 if any of the targeted vulnerabilities are found during the scan. The exit code will therefore purposely fail the pipeline and prevent unsecured images from being further processed or deployed.
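One detail worth noting when combining this flag with a saved report: in a plain shell pipeline, the exit status is that of the last command, so piping Trivy into tee would report tee's status rather than Trivy's. The sketch below works around this by writing the report to a file first and printing it afterwards. The function name scan_gate and the TRIVY_BIN override are hypothetical and exist only to keep the example self-contained; the trivy flags are the ones discussed in this article.

```shell
# scan_gate IMAGE REPORT_FILE
# Scans IMAGE for HIGH and CRITICAL findings, saves the report, prints it,
# and returns Trivy's own exit status (1 when findings exist).
scan_gate() {
  _image=$1
  _report=$2
  _status=0
  mkdir -p "$(dirname "$_report")"
  # TRIVY_BIN is a hypothetical override used for testing; it defaults to trivy.
  "${TRIVY_BIN:-trivy}" image --exit-code 1 --severity HIGH,CRITICAL "$_image" \
    > "$_report" || _status=$?
  # Show the report in the pipeline log without masking the scan result.
  cat "$_report"
  return "$_status"
}

# Hypothetical usage inside a pipeline script step:
#   scan_gate "openjdk:17-jdk-slim" ./report/trivy-image-scan-report.txt
```

For the non-zero status to actually stop later steps, the step would also need continueOnError left at its default of false; with continueOnError: true, as in the sample tasks above, the failure is recorded but the pipeline carries on.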

Overall, Trivy is a more suitable scanning tool for Azure Pipelines (or for any CI/CD pipeline, for that matter) because its standalone mode integrates better with scripting.

One important aspect of picking the right image vulnerability scanner is the accuracy of the findings it generates. We can briefly compare the Clair and Trivy reports using our example, which is based on scans of the openjdk:17-jdk-slim image. For simplicity, we only consider findings with a severity of High or Critical, as shown in the screenshots of the Clair and Trivy reports above. Clair identifies only 1 High vulnerability in the image, while Trivy identifies 2 Critical and 21 High vulnerabilities. This might be partly because Trivy can detect vulnerabilities in application dependencies in addition to those in OS packages.
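Counts like these can be pulled out of a saved report rather than read off a screenshot. The sketch below assumes the report is Trivy's default table output, where each scanned target is followed by a summary line of the form "Total: 23 (HIGH: 21, CRITICAL: 2)"; that layout is an assumption about the report format, and the function name count_severity is hypothetical.

```shell
# count_severity LABEL REPORT_FILE
# Sums the per-target counts for LABEL (e.g. HIGH) across every "Total:" line.
# Assumes summary lines such as: Total: 23 (HIGH: 21, CRITICAL: 2)
count_severity() {
  grep '^Total:' "$2" \
    | grep -Eo "$1: [0-9]+" \
    | awk '{ sum += $2 } END { print sum + 0 }'
}

# Hypothetical usage against the report written by the Trivy task:
#   count_severity HIGH ./report/trivy-image-scan-report.txt
#   count_severity CRITICAL ./report/trivy-image-scan-report.txt
```

Summing across all summary lines matters because a single image can produce several targets, for example one for the OS packages and one for application dependencies.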

Another interesting observation is that the only CVE in the Clair report (CVE-2019-25013) does not appear anywhere in the Trivy report. Further investigation shows that Trivy actually classifies the same CVE as Medium severity instead of High. If we check the NVD entry for this vulnerability, its base score is currently 5.9 (Medium), which is more aligned with the finding from Trivy.
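To see why a base score of 5.9 lands in the Medium band, the CVSS v3.x qualitative rating scale can be written out as a small helper. This is only an illustration of the published bands (None 0.0, Low 0.1 to 3.9, Medium 4.0 to 6.9, High 7.0 to 8.9, Critical 9.0 to 10.0); the function name cvss_severity is hypothetical and is not part of Clair or Trivy.

```shell
# cvss_severity SCORE
# Maps a CVSS v3.x base score (one decimal place) to its qualitative rating.
cvss_severity() {
  awk -v score="$1" 'BEGIN {
    s = score + 0
    if (s == 0)       print "NONE"
    else if (s < 4.0) print "LOW"
    else if (s < 7.0) print "MEDIUM"
    else if (s < 9.0) print "HIGH"
    else              print "CRITICAL"
  }'
}

# The score quoted above for CVE-2019-25013:
#   cvss_severity 5.9
```

A scanner that reports this CVE as High is therefore either using a different scoring source or an older rating, which is one plausible reason two tools can disagree on the same CVE.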

CVE-2019-25013 as reported by Trivy

In our example targeting openjdk:17-jdk-slim, Trivy outperforms Clair (at least version 2 of Clair) not only in the number of vulnerabilities detected but also in the accuracy of the CVE details in the reports.

In this article, we have discussed the different patterns for securing containers through static vulnerability analysis. One way is to build a scanning platform with automation that stops insecure images from being published. The other way is to incorporate scanning into DevSecOps pipelines before images are pushed to the repositories.

The second part of the article provides a deep dive into implementing image scans in Azure Pipelines using Clair V2 and Trivy, with a brief comparison of both tools' capabilities. The Azure Pipelines YAML files used in the example for both Clair and Trivy scans can be accessed on GitHub here.

With the container ecosystem expanding now and for the foreseeable future, security must be considered at every level of the container deployment process, from the base images and the applications built into them to the platforms running the containers and the ingresses used to access them. Static vulnerability scanning is just one of the many security gates required to ensure a secure container environment.

Airwalk Reply has been helping many clients such as major financial and public institutions to establish secured container platforms on public clouds including AWS and Azure with the capabilities to scale to thousands of pods or containers. If you need help in this space, get in touch with us!