Mark Young

Software. Microcontrollers. Beer.

Playing Around With Gatekeeper V3 in K8S

Enforce allow/deny Policies in kubernetes at a base controller level

I was at kubecon the in mid-November with an old roommate (in the same industry as me) and some coworkers but got to see some cool kubernetes stuff.

Since my job crosses the Security boundary, and I’m fairly new to k8s, most of my talks were somewhere on the basic level or opa level. With OPA what you get is a lightweight language to write policies your kubernetes cluster should adhere to.

Typically (I think) this is traditionally run by building your policies into a sidecar and forcing it to be an admission controller.

Gatekeeper simplifies this by removing the side-car for native CRD’s and adds a few things like audit logs.

I’m going to build up an example of this using EKS.

I assume we have a working EKS cluster:

1
2
3
$ aws-vault exec home -- kubectl get all
NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   172.20.0.1   <none>        443/TCP   6h15m

Before we deploy anything, let’s set up GateKeeper v3:

1
$ aws-vault exec home -- kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper/master/deploy/gatekeeper.yaml

Gatekeeper should now be running (albeit with no policies):

1
2
3
4
5
6
7
8
9
$ aws-vault exec home -- kubectl get all -n gatekeeper-system
NAME                                  READY   STATUS    RESTARTS   AGE
pod/gatekeeper-controller-manager-0   1/1     Running   1          144m

NAME                                            TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
service/gatekeeper-controller-manager-service   ClusterIP   172.20.41.90   <none>        443/TCP   144m

NAME                                             READY   AGE
statefulset.apps/gatekeeper-controller-manager   1/1     144m

Next, before we move on, I’m going to set up policykit so that we can use real REGO (the language for OPA) instead of rego embedded inside yaml (allows policy testing etc later on):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
$ pip install policykit
$ cat <<EOF >k8sallowedrepos.rego
package k8sallowedrepos

violation[{"msg": msg}] {
  container := input.review.object.spec.containers[_]
  satisfied := [good | repo = input.parameters.repos[_] ; good = startswith(container.image, repo)]
  not any(satisfied)
  msg := sprintf("container <%v> has an invalid image repo <%v>, allowed repos are %v", [container.name, container.image, input.parameters.repos])
}

violation[{"msg": msg}] {
  container := input.review.object.spec.initContainers[_]
  satisfied := [good | repo = input.parameters.repos[_] ; good = startswith(container.image, repo)]
  not any(satisfied)
  msg := sprintf("container <%v> has an invalid image repo <%v>, allowed repos are %v", [container.name, container.image, input.parameters.repos])
}
EOF

What we have now is a rego file that will validate base images (parameterized). Let’s write some tests.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
$ cat <<EOF >k8sallowedrepos_test.rego
package k8sallowedrepos

test_image_safety_positive {
    count(violation) == 1 with input.parameters.repos as ["hooli.com/"]
        with input.review.object.spec.containers as [
            {"name": "ok", "image": "hooli.com/web"},
            {"name": "bad", "image": "badrepo.com/web"},
        ]
}

test_image_safety_negative {
    count(violation) == 0 with input.parameters.repos as ["hooli.com/"]
        with input.review.object.spec.containers as [
            {"name": "ok", "image": "hooli.com/web"},
        ]
}

test_image_safety_init_container_positive {
    count(violation) == 1 with input.parameters.repos as ["hooli.com/"]
        with input.review.object.spec.initContainers as [
            {"name": "ok", "image": "hooli.com/web"},
            {"name": "bad", "image": "badrepo.com/web"},
        ]
}

test_image_safety_init_container_negative {
    count(violation) == 0 with input.parameters.repos as ["hooli.com/"]
        with input.review.object.spec.initContainers as [
            {"name": "ok", "image": "hooli.com/web"},
        ]
}
EOF

$ curl -Ls https://github.com/open-policy-agent/opa/releases/download/v0.15.1/opa_linux_amd64 -o opa && chmod +x opa
$ opa test . -v
data.k8sallowedrepos.test_image_safety_positive: PASS (622.8µs)
data.k8sallowedrepos.test_image_safety_negative: PASS (376µs)
data.k8sallowedrepos.test_image_safety_init_container_positive: PASS (550.6µs)
data.k8sallowedrepos.test_image_safety_init_container_negative: PASS (389.3µs)
--------------------------------------------------------------------------------
PASS: 4/4

Now let’s generate our ConstraintTemplate (the basic template that accepts parameters):

1
2
3
$ pk build *.rego
[k8sallowedrepos] Generating a ConstraintTemplate from "k8sallowedrepos.rego"
[k8sallowedrepos] Saving to "k8sallowedrepos.yaml"

Let’s see what that ConstraintTemplate looks like:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
$ cat k8sallowedrepos.yaml
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: k8sallowedrepos
spec:
  crd:
    spec:
      names:
        kind: k8sallowedrepos
        listKind: k8sallowedreposList
        plural: k8sallowedrepos
        singular: k8sallowedrepo
  targets:
  - rego: |
      package k8sallowedrepos

      violation[{"msg": msg}] {
        container := input.review.object.spec.containers[_]
        satisfied := [good | repo = input.parameters.repos[_] ; good = startswith(container.image, repo)]
        not any(satisfied)
        msg := sprintf("container <%v> has an invalid image repo <%v>, allowed repos are %v", [container.name, container.image, input.parameters.repos])
      }

      violation[{"msg": msg}] {
        container := input.review.object.spec.initContainers[_]
        satisfied := [good | repo = input.parameters.repos[_] ; good = startswith(container.image, repo)]
        not any(satisfied)
        msg := sprintf("container <%v> has an invalid image repo <%v>, allowed repos are %v", [container.name, container.image, input.parameters.repos])
      }
    target: admission.k8s.gatekeeper.sh

As you can see that rego is now embedded inside a ConstraintTemplate yaml. Let’s apply it:

1
2
$ aws-vault exec home -- kubectl apply -f k8sallowedrepos.yaml
constrainttemplate.templates.gatekeeper.sh/k8sallowedrepos configured

We have a template but no OPA rules yet.

1
2
3
4
$ aws-vault exec home -- kubectl logs pod/gatekeeper-controller-manager-0 -n gatekeeper-system
{"level":"info","ts":1575315679.2535567,"logger":"controller","msg":"Audit opa.Audit() audit results","metaKind":"audit","violations":0}
{"level":"info","ts":1575315679.255231,"logger":"controller","msg":"constraint","metaKind":"audit","resource kind":"k8sallowedrepos"}
{"level":"info","ts":1575315679.259529,"logger":"controller","msg":"constraint","metaKind":"audit","count of constraints":0}

Let’s add a rule to deny anything that’s not from alpine (ironically the opposite of secure):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
$ cat <<EOF >check_repo.yaml
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: k8sallowedrepos
metadata:
  name: default-repo-is-wrong
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    namespaces:
      - "default"
  parameters:
    repos:
      - "alpine"
EOF
$ aws-vault exec home -- kubectl apply -f check_repo.yaml
k8sallowedrepos.constraints.gatekeeper.sh/default-repo-is-wrong created

We can now see Gatekeeper has it:

1
2
3
...snip noise...
{"level":"info","ts":1575315769.4250739,"logger":"controller","msg":"constraint","metaKind":"audit","count of constraints":1}
...snip noise...

Let’s push up a Helm chart that violates it:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
$ cat charts/cloud-custodian/values.yaml  | grep repos
  repository: 11111111111.dkr.ecr.us-east-1.amazonaws.com/cloud-custodian
$ helm package charts/cloud-custodian/
Successfully packaged chart and saved it to: charts/cloud-custodian-0.1.2.tgz
$ aws-vault exec home -- helm install --name cloud-custodian charts/cloud-custodian-0.1.2.tgz
NAME:   cloud-custodian
LAST DEPLOYED: Mon Dec  2 13:44:47 2019
NAMESPACE: default
STATUS: DEPLOYED

RESOURCES:
==> v1/Deployment
NAME             READY  UP-TO-DATE  AVAILABLE  AGE
cloud-custodian  0/1    0           0          0s

==> v1/ServiceAccount
NAME             SECRETS  AGE
cloud-custodian  1        0s

Did it work?

1
2
3
4
5
6
7
8
9
$ aws-vault exec home -- kubectl get all
NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   172.20.0.1   <none>        443/TCP   6h25m

NAME                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/cloud-custodian   0/1     0            0           18s

NAME                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/cloud-custodian-744bd4768d   1         0         0       18s

It’s not running. let’s look:

1
2
3
4
5
$ aws-vault exec home -- kubectl describe replicaset.apps/cloud-custodian-744bd4768d | tail -n 4
Events:
  Type     Reason        Age                From                   Message
  ----     ------        ----               ----                   -------
  Warning  FailedCreate  3s (x14 over 44s)  replicaset-controller  Error creating: admission webhook "validation.gatekeeper.sh" denied the request: [denied by default-repo-is-wrong] container <cloud-custodian> has an invalid image repo <11111111111.dkr.ecr.us-east-1.amazonaws.com/cloud-custodian:latest>, allowed repos are ["alpine"]

And just like that we enforced a policy that prevented this from running!

More to come from OPA as I get more comfortable with it and learn how to test the policies before they go into K8S!