2020-06-08 00:00:00
Kubernetes is a popular and powerful tool for running jobs in containers. These containers are normally unique to a specific application and are created and stored ‘in-house’ by organisations, so it is important to make sure that the correct container is used to run each job.
After all, what would happen if the container used to run a job introduced issues or was compromised?
During a recent client engagement to deliver a strategic digital container platform on a global scale, exactly this question surfaced: how do we identify, report and manage container-level vulnerabilities within the applications deployed on the new platform?
As this was the first platform of its type in the customer organisation, no guidelines or patterns existed for the use of application containers, so the solution would also form the basis for the organisation’s wider AWS container security patterns. Because the customer is highly regulated, the solution also had to provide segregation of platform roles and responsibilities, including enforced controls around the deployment, promotion and acceptance of images.
Aside from the technical challenges of delivering such a tool, one of its main goals was to enable wider adoption of, and accelerate migration onto, the new strategic container platform through developer-focused workflows and early feedback in the development lifecycle, increasing developer productivity, application quality and the security of the platform as a whole.
Working with the client’s security, compliance and development teams, we designed, developed and implemented an AWS-based, cloud-first solution, known affectionately as ‘Maria’.
The solution operates as a centralised container image storage and scanning service, extending the standard AWS ECR service to provide a multi-tenanted, application-agnostic platform.
Key features of the solution include:
From a consumer perspective, the solution offers the following interactions:
We also needed to develop and codify the AWS ECR security patterns to ensure the solution was compliant from the outset, and that the standards and processes defined for the centralised solution were also implemented consistently across the global developer community.
The solution operates from a central AWS ‘solution’ account that application teams interact with. Within the account, workflow services are provided such that pushing an image automatically triggers the required actions. Segregation of images is provided so that consumers may only push to “unscanned” repositories, with the system automatically promoting any approved images to “scanned” repositories, which can only ever be read from.
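The segregation rule above can be sketched as a small naming convention. This is only an illustration: the `unscanned/` and `scanned/` prefixes and both function names are assumptions, as the actual repository layout is not described here.

```python
# Hypothetical naming convention; the real repository names are not given.
UNSCANNED_PREFIX = "unscanned/"
SCANNED_PREFIX = "scanned/"


def consumer_may_push(repository: str) -> bool:
    """Consumers may only push to "unscanned" repositories."""
    return repository.startswith(UNSCANNED_PREFIX)


def promotion_target(repository: str) -> str:
    """Map an "unscanned" repository to its read-only "scanned" counterpart."""
    if not repository.startswith(UNSCANNED_PREFIX):
        raise ValueError(f"not an unscanned repository: {repository!r}")
    return SCANNED_PREFIX + repository[len(UNSCANNED_PREFIX):]
```

In practice the push and read restrictions would be enforced through repository and IAM policies rather than application code; the functions only make the intended mapping explicit.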

Workflow orchestration is provided through AWS Lambda functions, in combination with CloudWatch Event Rules, SNS and SQS for message queueing.
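As a rough sketch of the entry point to such a workflow, a Lambda handler might filter ECR push events and turn them into scan requests. The field names below assume the ECR image action event shape delivered by CloudWatch Events; the request format and the `handler` name are hypothetical, and the queueing step is left out so the example has no AWS dependency.

```python
import json


def handler(event, context=None):
    """Turn a successful ECR push event into a scan request message body.

    Returns None for events that should not trigger a scan.
    """
    detail = event.get("detail", {})
    if event.get("source") != "aws.ecr":
        return None
    if detail.get("action-type") != "PUSH" or detail.get("result") != "SUCCESS":
        return None
    request = {
        "repository": detail["repository-name"],
        "digest": detail["image-digest"],
        "tag": detail.get("image-tag"),  # absent for untagged pushes
    }
    # A real workflow would send this body to an SQS queue at this point;
    # the sketch only serialises it.
    return json.dumps(request, sort_keys=True)
```

Queueing the request rather than scanning inline keeps the handler short-lived and lets scans be retried or throttled independently of image pushes.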
Scanning services are provided through the deployment of Aqua Cloud Native Security Platform (CSP). This tool is deployed within the solution account inside a private EKS cluster. The dashboard interface provided by Aqua CSP allows development teams to search, report on and export the state of the recorded image scans, and allows security teams to manage vulnerability policies.
In the event of an image push, the scanning workflows orchestrate ad-hoc Kubernetes jobs within the EKS cluster, scanning the image within an ephemeral scan container.
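An ad-hoc scan job of this kind could be described by a Kubernetes `batch/v1` Job manifest built per image. The sketch below is an assumption throughout: the scanner image, the `IMAGE_REF` environment variable, the namespace and the annotation key are placeholders and do not reflect how Aqua CSP is actually invoked.

```python
def scan_job_manifest(repository: str, digest: str, namespace: str = "scanning") -> dict:
    """Build a one-shot Kubernetes Job manifest for scanning a single image."""
    # Job names must be DNS-compatible, so use a short slice of the digest hex.
    suffix = digest.split(":")[-1][:12]
    image_ref = f"{repository}@{digest}"
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {
            "name": f"scan-{suffix}",
            "namespace": namespace,
            # Annotation rather than label: image references contain "/" and "@".
            "annotations": {"example.invalid/image-ref": image_ref},
        },
        "spec": {
            "backoffLimit": 0,               # a failed scan is not retried
            "ttlSecondsAfterFinished": 600,  # finished jobs are cleaned up
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "scanner",
                            "image": "example.invalid/scanner:latest",  # placeholder
                            "env": [{"name": "IMAGE_REF", "value": image_ref}],
                        }
                    ],
                }
            },
        },
    }
```

Setting `ttlSecondsAfterFinished` is one way to keep scan containers ephemeral, since the cluster removes the job and its pod once the result has been collected.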
These scan jobs are monitored by the workflow processes, which then orchestrate ephemeral image promotion jobs to move the image from “unscanned” to “scanned” if the scan succeeds, or reject the image if the scan fails.
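The monitoring step reduces to a decision on the job's status. As a minimal sketch, assuming the status is available as a dictionary with the `succeeded` and `failed` pod counts that Kubernetes reports for a Job, and assuming scan jobs are never retried so a single failure is final:

```python
def next_action(job_status: dict) -> str:
    """Decide what the workflow should do next for a monitored scan job."""
    # Counts may be missing or None while the job is still starting.
    if (job_status.get("succeeded") or 0) >= 1:
        return "promote"  # move the image from "unscanned" to "scanned"
    if (job_status.get("failed") or 0) >= 1:
        return "reject"   # scan failed, so the image is not promoted
    return "wait"         # job still pending or running
```

The function name and the three action strings are hypothetical; they stand in for whatever the workflow uses to trigger the promotion or rejection path.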
Throughout the process, SNS notifications are generated at key points, informing the consuming team and pipeline processes of scan acceptance, completion, promotion or rejection.
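One way to keep these notifications consistent is to build them from a fixed set of event types. In the sketch below the event names and the message body are assumptions; the top-level keys mirror the `Subject`, `Message` and `MessageAttributes` parameters of the SNS Publish call, with the topic left out.

```python
import json
from enum import Enum


class ScanEvent(str, Enum):
    # Hypothetical event names for the four notifications described above.
    ACCEPTED = "scan_accepted"
    COMPLETED = "scan_completed"
    PROMOTED = "image_promoted"
    REJECTED = "image_rejected"


def build_notification(event: ScanEvent, repository: str, digest: str) -> dict:
    """Build the arguments for an SNS publish call, minus the topic."""
    body = {"event": event.value, "repository": repository, "digest": digest}
    return {
        "Subject": f"{event.value}: {repository}"[:100],  # SNS subjects are limited to 100 characters
        "Message": json.dumps(body, sort_keys=True),
        # Exposing the event type as an attribute lets subscribers filter on it.
        "MessageAttributes": {
            "event": {"DataType": "String", "StringValue": event.value}
        },
    }
```

A pipeline that only cares about the final outcome could then subscribe with a filter on the `event` attribute instead of parsing every message.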
Scan results are written to HTML reports and persisted as S3 objects, which are shared with the originating consumers and made available in the Aqua CSP dashboard for security and business users to investigate. All events, actions, scan results and responses are also logged to DynamoDB audit tables, which are used for integration with pre-existing customer security tools.
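To illustrate how a report and its audit entry might be tied together, the sketch below derives an S3 object key from the image reference and builds a matching audit record. The key layout, attribute names and the choice of partition and sort key are all assumptions, as the actual table schema is not described here.

```python
from datetime import datetime, timezone


def report_key(repository: str, digest: str) -> str:
    """Hypothetical S3 key layout for the HTML scan report of one image."""
    return f"reports/{repository}/{digest.replace(':', '-')}.html"


def audit_item(repository, digest, action, outcome, when=None) -> dict:
    """Build an audit record suitable for writing to a DynamoDB table."""
    when = when or datetime.now(timezone.utc)
    return {
        "image": f"{repository}@{digest}",  # assumed partition key
        "timestamp": when.isoformat(),      # assumed sort key
        "action": action,
        "outcome": outcome,
        "report_key": report_key(repository, digest),
    }
```

Storing the report key on the audit record means downstream security tools can go from an audit entry straight to the full report.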
