OPA Gatekeeper + EKS

Overview

Open Policy Agent (OPA) is recommended in the EKS Best Practices Guide for Security to detect policy violations before deployment. OPA has also been introduced in Using Open Policy Agent on Amazon EKS.

This article is part of our EKS best practices series. I will explain what OPA Gatekeeper and Kubernetes policies are, and how they can help us manage and secure an EKS cluster.

Why OPA Gatekeeper?

Kubernetes policies define what end users can do on the cluster, ensuring that clusters comply with organizational policies, enforce best practices, and meet legal requirements.

But there is no single, simple way to configure security or policy in Kubernetes. We have to implement policies through several mechanisms; for example, some important ones are:

1. Role Based Access Control Policies:

RBAC allows cluster admins to configure and control access to Kubernetes resources, as well as the operations allowed on them.

2. Resource Consumption Policies:

Resource Quotas are defined by ResourceQuota objects to provide constraints that limit aggregate resource consumption per namespace (see the sketch after this list).

Limit Ranges are policies that constrain resource allocations to Pods or Containers in a namespace.

3. Pod Security Policies:

PodSecurityPolicy objects define a set of conditions that a pod must run with in order to be accepted into the system.
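
For instance, a minimal ResourceQuota capping aggregate consumption in a namespace might look like the following sketch (the name, namespace, and limits are hypothetical):

apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota   # hypothetical name
  namespace: team-a     # hypothetical namespace
spec:
  hard:
    requests.cpu: "4"      # total CPU requests allowed in the namespace
    requests.memory: 8Gi   # total memory requests allowed in the namespace
    pods: "10"             # maximum number of Pods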

Due to the lack of a single point of policy management, ensuring compliance manually can be error-prone. K8s.af lists some public failure stories related to Kubernetes.

What is OPA Gatekeeper?

OPA is an open source policy engine that is part of the CNCF. It is used for making policy decisions and can be run in a variety of ways, e.g. as a library or as a standalone service. OPA policies are written in a Domain Specific Language (DSL) called Rego. In Kubernetes, OPA is often run as part of a dynamic admission controller.
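
As a small taste of Rego, here is a minimal standalone rule; the package name and input shape are hypothetical. It denies by default and allows only requests from users in an admins group:

package example.authz

# Deny by default; a request is allowed only if a rule below succeeds.
default allow = false

# Allow when the (hypothetical) input lists "admins" among the user's groups.
allow {
  input.user.groups[_] == "admins"
}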

OPA Gatekeeper enforces policies and strengthens governance on a Kubernetes cluster. It introduces native Kubernetes CRDs for defining policies, namely constraint templates and the constraints instantiated from them, as well as audit functionality. The figure below shows the Gatekeeper components:

Open Policy Agent Gatekeeper Components / Source — Open Policy Agent Gatekeeper Documentation

Kubernetes provides admission controller webhooks (HTTP callbacks) to intercept admission requests before they are persisted as objects, and OPA Gatekeeper uses this mechanism to make policy decisions on requests from the API server. Once all object modifications are complete and the incoming object has been validated by the API server, the validating admission webhooks are invoked, and they can either accept or reject the request to enforce policies.
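
To make this concrete, the Gatekeeper manifest registers a validating webhook roughly like the following simplified sketch (exact names, paths, and fields vary by Gatekeeper version):

apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingWebhookConfiguration
metadata:
  name: gatekeeper-validating-webhook-configuration
webhooks:
- name: validation.gatekeeper.sh
  clientConfig:
    service:
      name: gatekeeper-webhook-service   # Gatekeeper's in-cluster service
      namespace: gatekeeper-system
      path: /v1/admit                    # admission requests are sent here for a decision
  rules:
  - apiGroups: ["*"]
    apiVersions: ["*"]
    operations: ["CREATE", "UPDATE"]
    resources: ["*"]
  failurePolicy: Ignore                  # fail open if the webhook is unreachable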

Deploying OPA Gatekeeper on EKS running an Istio service mesh

We will create the EKS cluster using an eksctl deployment file.
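
A minimal deployment file might look like this sketch (the cluster name, region, and node group sizing are hypothetical):

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: opa-gatekeeper-demo   # hypothetical cluster name
  region: us-west-2           # hypothetical region
nodeGroups:
  - name: ng-1
    instanceType: m5.large
    desiredCapacity: 2

We then create the cluster with:

eksctl create cluster -f cluster.yaml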

Next, we will install Istio by following the installation guides.
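
One common approach, depending on your Istio version, is to download Istio and apply a profile with istioctl (a sketch, not a pinned installation):

curl -L https://istio.io/downloadIstio | sh -
cd istio-*
export PATH=$PWD/bin:$PATH
istioctl manifest apply --set profile=demo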

And then, we install OPA Gatekeeper using Helm v2 by following the instructions. We will use k as a shorthand for kubectl (e.g. alias k=kubectl).

k apply --namespace gatekeeper-system -f manifests/gatekeeper.yaml

We can now see that the Gatekeeper pods are running in the gatekeeper-system namespace:

k get pods -n gatekeeper-system

Use case: we have a corporate policy that mTLS must be enabled for all services in a particular namespace.

In Istio, there are three levels of granularity through which we can define our mTLS settings. For each service, Istio applies the narrowest matching policy. The order is: service-specific, namespace-wide, mesh-wide.
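
For example, in the legacy authentication.istio.io/v1alpha1 API used throughout this article, a mesh-wide policy turning on mTLS for all services is a MeshPolicy, conventionally named default (a sketch):

apiVersion: authentication.istio.io/v1alpha1
kind: MeshPolicy
metadata:
  name: default   # the mesh-wide policy must be named "default"
spec:
  peers:
  - mtls: {}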

Let’s get hands-on. We will create two Kubernetes custom resources: a Gatekeeper ConstraintTemplate and a Constraint.

First, let’s go through an example ConstraintTemplate .yaml file:

apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: policystrictonly
spec:
  crd:
    spec:
      names:
        kind: PolicyStrictOnly
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package istio.policystrictonly
        # VIOLATION spec.peers does not exist
        violation[{"msg": msg}] {          
          p := input.review.object
          speckeys := { k | p.spec[k]}
          not speckeys["peers"]
          
          msg := sprintf("%v %v.%v spec.peers does not exist", 
            [p.kind, p.metadata.name, p.metadata.namespace])
        } 
        # VIOLATION spec.peers is []
        violation[{"msg": msg}] {
          p := input.review.object          
          k := "peers"
          p.spec[k] == []
          
          msg := sprintf("%v %v.%v spec.peers cannot be empty", 
            [p.kind, p.metadata.name, p.metadata.namespace])
        } 
        # VIOLATION peer authentication is set to permissive
        violation[{"msg": msg}] {
          p := input.review.object
          kp := "peers"
          km := "mode"
          
          peermethod := p.spec[kp][_]
          peermethod[km] != "STRICT"
          
          msg := sprintf("%v %v.%v spec.peers must include [{mtls: {}, mode: STRICT}]", 
            [p.kind, p.metadata.name, p.metadata.namespace])
        }

and the Constraint .yaml file:

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: PolicyStrictOnly
metadata:
  name: policy-strict-constraint
spec:
  enforcementAction: deny
  match:
    kinds:
      - apiGroups: ["authentication.istio.io"]
        kinds: ["Policy"]
    namespaces: ["default"]
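
Note that enforcementAction: deny rejects violating requests outright. Depending on your Gatekeeper version, you can instead set a dryrun enforcement action, which only records violations in the constraint's status (useful for trialing a new constraint before enforcing it):

spec:
  enforcementAction: dryrun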

And then, we deploy the ConstraintTemplate and the Constraint to the cluster.

k apply -f templates/policy-strict-template.yaml
k apply -f constraints/policy-strict-constraint.yaml
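
To confirm that both objects were created, we can list them; the constraint kind below matches the PolicyStrictOnly kind defined in the template:

k get constrainttemplates
k get policystrictonly.constraints.gatekeeper.sh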

Let’s take a look at the sample bad-app-policy object below. It is a service-specific policy, which has higher precedence than the mesh-wide policy, so it overrides any previously defined mTLS MeshPolicy and thereby violates the corporate policy.

apiVersion: authentication.istio.io/v1alpha1
kind: Policy
metadata:
  name: app-policy-demo
spec:
  targets:
  - name: app-demo
  peers:
  - mtls: {}
    mode: PERMISSIVE

Let’s test the Constraint by deploying the bad-app-policy object, which sets the service-specific mTLS policy to PERMISSIVE.

k apply -f sample/bad-app-policy.yaml

and we will get an error message showing that the request was denied:

Error from server ([denied by policy-strict-constraint] Policy app-policy-demo.default spec.peers must include [{mtls: {}, mode: STRICT}]): error when creating "sample/bad-app-policy.yaml": admission webhook "validation.gatekeeper.sh" denied the request: [denied by policy-strict-constraint] Policy app-policy-demo.default spec.peers must include [{mtls: {}, mode: STRICT}]
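
By contrast, a compliant counterpart to bad-app-policy.yaml sets the mode to STRICT, matching the shape the constraint checks for, and would be admitted:

apiVersion: authentication.istio.io/v1alpha1
kind: Policy
metadata:
  name: app-policy-demo
spec:
  targets:
  - name: app-demo
  peers:
  - mtls: {}
    mode: STRICT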

Audit: another great feature of OPA Gatekeeper is its audit functionality, which periodically evaluates replicated resources against the policies enforced in the cluster to detect pre-existing misconfigurations.

Audit results are stored as violations listed in the status field of the failed constraint.

k describe policystrictonly.constraints.gatekeeper.sh policy-strict-constraint
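
Fetching the constraint as YAML shows the same information in its status field, roughly like this abbreviated sketch (the timestamp is hypothetical):

status:
  auditTimestamp: "2020-05-11T01:46:13Z"   # hypothetical timestamp
  totalViolations: 1
  violations:
  - enforcementAction: deny
    kind: Policy
    message: "Policy app-policy-demo.default spec.peers must include [{mtls: {}, mode: STRICT}]"
    name: app-policy-demo
    namespace: default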

Personally, I found it very helpful to view the cluster’s overall constraints and violation status using the Gatekeeper Policy Manager (GPM) web interface.

Congrats! We have successfully enforced policies with OPA Gatekeeper in the EKS cluster. Thank you for reading!
