GCP: Enable Policy Controller on a GKE cluster

Anthos Policy Controller enables enforcement of compliance, security, and organizational policies on GKE clusters.

These might be best-practice policies derived from internal architectural standards, technical policies used to define or constrain resources, or audit requirements stemming from legal regulation.

Anthos Policy Controller is built upon the open-source Open Policy Agent (OPA) Gatekeeper, which uses a Kubernetes validating webhook to enforce/audit your workloads according to the defined policies.
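Under the hood, every policy is a pairing of a ConstraintTemplate (which carries the Rego validation logic) and a constraint object that parameterizes it. As a purely illustrative sketch, a hypothetical minimal template that flags every matched resource would look something like this (the k8sdenyall/K8sDenyAll names are made up for illustration):

# constrainttemplate-sketch.yaml (hypothetical, for illustration only)
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: k8sdenyall
spec:
  crd:
    spec:
      names:
        kind: K8sDenyAll
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8sdenyall
        # unconditionally produce a violation for every matched resource
        violation[{"msg": msg}] {
          msg := "resource denied by k8sdenyall"
        }

In practice, the pre-built template library installed later in this article provides ready-made templates (such as K8sRequiredLabels), so you rarely need to write Rego yourself.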

Prerequisites

To follow this article you need a standard GKE cluster, a valid KUBECONFIG environment variable, and the kubectl binary installed.
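Assuming gcloud is already installed and authenticated, a quick sanity check of the tooling might look like:

# verify tooling versions
gcloud version
kubectl version --client

# kubectl should answer using the cluster from KUBECONFIG
echo $KUBECONFIG
kubectl cluster-info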

Enable ACM on GKE cluster

In order to deploy Policy Controller, you need to enable the Anthos Config Management feature on the GKE cluster. This means enabling the required API services and registering the cluster.

Setup variables

# check your current permissions, must respond 'yes'
kubectl auth can-i '*' '*' --all-namespaces

# get cluster list and region
gcloud container clusters list

# set variables
project_id=$(gcloud config get-value project)
cluster_name="<thecluster>"
cluster_location="<theregion>"
# use '--zone' instead if this is a zonal cluster
location_flag="--region $cluster_location"

Enable relevant Google API

# enable required API services
gcloud services enable anthos.googleapis.com \
  container.googleapis.com \
  gkeconnect.googleapis.com \
  gkehub.googleapis.com \
  cloudresourcemanager.googleapis.com \
  iam.googleapis.com \
  krmapihosting.googleapis.com

# enable config management feature
gcloud beta container hub config-management enable
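To double-check that the services took effect, list the enabled services (a simple grep sketch; adjust the pattern as needed):

# verify relevant services are enabled
gcloud services list --enabled | grep -E 'anthos|gkehub|gkeconnect|krmapihosting'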

Register cluster

# list of clusters already registered
gcloud container hub memberships list

# register cluster
KUBECONFIG=/tmp/kubeconfig-membership gcloud container hub memberships register $cluster_name \
  --gke-cluster=$cluster_location/$cluster_name \
  --enable-workload-identity
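To verify the registration, describe the membership (this assumes the membership name matches the cluster name):

# confirm membership details
gcloud container hub memberships describe $cluster_name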

Install Policy Controller

As described in the documentation, we need to tailor a yaml file describing our Policy Controller settings, which is then applied via gcloud (not kubectl!).

# apply-spec.yaml
applySpecVersion: 1
spec:
  configSync:
    enabled: false # would need to be true if also using Config Sync
  policyController:
    enabled: true
    # set to false to prevent the template library from being installed
    templateLibraryInstalled: true
    # uncomment to enable support for referential constraints
    # referentialRulesEnabled: true
    # uncomment and set to 0 to disable audit, or to another value to adjust the audit interval
    # auditIntervalSeconds: 0
    # log all denies and dryrun failures
    logDeniesEnabled: true
    # namespaces exempt from policy enforcement
    exemptableNamespaces: ["no-mans-land"]

Run the ‘gcloud beta container hub config-management apply’ command to apply the yaml configuration.

# view membership names, expect match with cluster name
gcloud container hub memberships list

# apply configuration, assuming membership name matches cluster name
gcloud beta container hub config-management apply --membership=$cluster_name --config=apply-spec.yaml --project=$project_id
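You can then poll the rollout with the status subcommand; expect the cluster to eventually report Policy Controller as installed (exact column names may vary by gcloud version):

# check rollout status of the config management feature
gcloud beta container hub config-management status --project=$project_id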

If you want to also configure ACM Config Sync, you can read my previous article.

Validate Policy Controller installation

The Policy Controller supporting objects should now be created.

# check namespace
$ kubectl get ns gatekeeper-system --show-labels
NAME                STATUS   AGE   LABELS
gatekeeper-system   Active   21m   admission.gatekeeper.sh/ignore=no-self-managing,configmanagement.gke.io/configmanagement=config-management,control-plane=controller-manager,gatekeeper.sh/system=yes,k8s-app=kubernetes-config-management,kubernetes.io/metadata.name=gatekeeper-system,policycontroller.configmanagement.gke.io=true

# validate controller deployment
$ kubectl get deployment gatekeeper-controller-manager -n gatekeeper-system
NAME                            READY   UP-TO-DATE   AVAILABLE   AGE
gatekeeper-controller-manager   1/1     1            1           21m

# check the gatekeeper version
$ kubectl get deployments -n gatekeeper-system gatekeeper-controller-manager -o="jsonpath={.spec.template.spec.containers[0].image}{'\n'}"
gcr.io/config-management-release/gatekeeper:anthos1.10.2-8393e15.g0

# check for validating webhook
$ kubectl get validatingwebhookconfigurations gatekeeper-validating-webhook-configuration
NAME                                          WEBHOOKS   AGE
gatekeeper-validating-webhook-configuration   2          18m

# list policy controller templates
$ kubectl get constrainttemplates
NAME                                    AGE
allowedserviceportname                  11m
asmauthzpolicydisallowedprefix          11m
asmauthzpolicyenforcesourceprincipals   11m
..
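Each of these is a ConstraintTemplate from the pre-built library. Inspecting one shows its accepted parameters and underlying Rego, which is helpful before writing a constraint against it:

# view schema and logic of the template used in the next section
kubectl get constrainttemplate k8srequiredlabels -o yaml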

Define your first constraint policy

Straight from the official documentation, create a namespace constraint policy that forces all newly created namespaces to have a label named “geo”.

# ns-must-have-geo.yaml
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: ns-must-have-geo
spec:
  enforcementAction: deny # 'dryrun' for warning
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Namespace"]
  parameters:
    labels:
      - key: "geo"

Apply and check enforcement status

# apply constraint policy
$ kubectl apply -f ns-must-have-geo.yaml 
k8srequiredlabels.constraints.gatekeeper.sh/ns-must-have-geo created

# 'enforced' should be set to true
$ kubectl get K8sRequiredLabels ns-must-have-geo -o="jsonpath={.status.byPod[*]}" | jq
{
  "constraintUID": "cc4ac4d5-93ea-4acd-8bd6-05fbb60c603f",
  "enforced": true,
  "id": "gatekeeper-audit-7cdffcd49d-zwr6q",
  "observedGeneration": 1,
  "operations": [
    "audit",
    "status"
  ]
}
{
  "constraintUID": "cc4ac4d5-93ea-4acd-8bd6-05fbb60c603f",
  "enforced": true,
  "id": "gatekeeper-controller-manager-54876d8bcf-v8s5w",
  "observedGeneration": 1,
  "operations": [
    "webhook"
  ]
}

Test enforcement of namespace policy

Let’s test this constraint by attempting to create a namespace that violates the policy, then one that conforms.

# no label provided, violates constraint
$ kubectl create ns my-bad-ns
Error from server (Forbidden): admission webhook "validation.gatekeeper.sh" denied the request: [ns-must-have-geo] you must provide labels: {"geo"}

# namespace that properly conforms to policy
$ kubectl create namespace my-good-ns --dry-run=client -o yaml | sed '/^metadata:/a\ \ labels: {"geo":"true"}' | kubectl apply -f -
namespace/my-good-ns created
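The sed trick above injects the label into the generated manifest. If you find that brittle, an equivalent inline manifest works just as well (the namespace name and label value below are arbitrary examples):

# same result using a heredoc manifest
kubectl apply -f - <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: my-good-ns2
  labels:
    geo: "us-east1"
EOF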

Audit of violations

You can view the violations on this constraint policy.

# number of violations
$ kubectl get K8sRequiredLabels ns-must-have-geo -o jsonpath="{.status.totalViolations}{'\n'}"
1

# show each violation in detail
$ kubectl get K8sRequiredLabels ns-must-have-geo -o jsonpath="{.status.violations}" | jq
[
  {
    "enforcementAction": "dryrun",
    "kind": "Namespace",
    "message": "you must provide labels: {\"geo\"}",
    "name": "kube-node-lease"
  },
...
]

# and to view the audit reports in the logs
$ kubectl logs -n gatekeeper-system -l gatekeeper.sh/system=yes

Reporting constraint without deny

It is also possible to have constraints that are evaluated but only warn of violations instead of denying the action. In the “ns-must-have-geo.yaml” file created earlier, change the enforcementAction attribute to ‘dryrun’ and reapply.

# change action
sed -i 's/enforcementAction: .*/enforcementAction: dryrun/' ns-must-have-geo.yaml

# reapply to warn, but not deny violations
kubectl apply -f ns-must-have-geo.yaml

Then test this change by creating a namespace that violates the constraint.

# create namespace without labels
$ kubectl create ns my-bad-ns
namespace/my-bad-ns created

# prove namespace was created
$ kubectl get ns my-bad-ns
NAME        STATUS   AGE
my-bad-ns   Active   2m45s

# violation warning is present in logs
$ kubectl logs -n gatekeeper-system -l gatekeeper.sh/system=yes | grep my-bad-ns | jq
{
  "severity": "INFO",
  "ts": 1650981374.9480886,
  "logger": "webhook",
  "msg": "denied admission",
  "process": "admission",
  "event_type": "violation",
  "constraint_name": "ns-must-have-geo",
  "constraint_group": "constraints.gatekeeper.sh",
  "constraint_api_version": "v1beta1",
  "constraint_kind": "K8sRequiredLabels",
  "constraint_action": "dryrun",
  "resource_group": "",
  "resource_api_version": "v1",
  "resource_kind": "Namespace",
  "resource_namespace": "",
  "resource_name": "my-bad-ns",
  "request_username": "me@my-gkeproj1-10942.iam.gserviceaccount.com"
}

As you can see, this constraint is evaluated and audited, but is not denied.
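If you want to tear down the test objects created in this article, remove the constraint and the namespaces:

# optional cleanup
kubectl delete -f ns-must-have-geo.yaml
kubectl delete ns my-bad-ns my-good-ns
# (plus my-good-ns2 if you created it via the heredoc example)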

REFERENCES

google ref, Anthos Policy Controller overview

google ref, Policy Controller auditing using constraints

google ref, constraint template library list

google ref, stopping Policy Controller

Open Policy Agent, OPA Gatekeeper

OPA, introduction and goals

google ref, ACM installation