Prometheus: installing kube-prometheus-stack on K3s cluster

The kube-prometheus-stack bundles the Prometheus Operator, monitors/rules, Grafana dashboards, and AlertManager needed to monitor a Kubernetes cluster.

But there are customizations necessary to tailor the Helm installation for K3s, a lightweight Kubernetes installation.

In this article, I will detail the necessary modifications to deploy a healthy monitoring stack on a K3s cluster.

Prerequisites

K3s cluster

You must have a healthy Rancher K3s cluster.

If you followed the instructions in my K3S cluster on Ubuntu using Terraform article, you will have a cluster with 192.168.122.213 as the master running the management components.
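
As a quick sanity check, you can list the nodes from a host that has kubectl configured against the cluster; the node names and count will vary with how you built it.

# sanity check that the cluster is reachable and all nodes are Ready
kubectl get nodes -o wide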

Helm

If you do not have Helm 3 installed, follow my article on Helm installation here.
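
To confirm the Helm client is in place before proceeding:

# should report a v3.x client version
helm version --short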

External Kubernetes Storage Class

To support persistence and scheduling of the monitoring components on any node in the cluster, you need to provide an external Kubernetes storageclass: one that is accessible from every node and persists independently of the nodes.

You could use a full-fledged cluster storage solution like Longhorn, but I prefer the nfs-subdir-external-provisioner from my article here. This creates a new ‘nfs-client’ storageclass backed by an NFS export from your host machine.
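
For reference, here is a minimal sketch of installing that provisioner with Helm; the NFS server IP and export path below are assumptions that you should adjust to match your host.

# sketch only: nfs.server and nfs.path are assumptions, adjust for your host
helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
  --set nfs.server=192.168.122.1 \
  --set nfs.path=/data/nfs \
  --set storageClass.name=nfs-client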

K3s cluster preparation

By default, K3s binds several of its management components to the localhost 127.0.0.1 address of the VM, specifically: Kube Controller Manager, Kube Proxy, and Kube Scheduler.

However, for monitoring we need these endpoints exposed so that Prometheus can scrape their metrics.  Therefore we bind these components to 0.0.0.0 instead.

You can change these settings on the K3s master by placing a file with the content below at “/etc/rancher/k3s/config.yaml”.

kube-controller-manager-arg:
- "address=0.0.0.0"
- "bind-address=0.0.0.0"
kube-proxy-arg:
- "metrics-bind-address=0.0.0.0"
kube-scheduler-arg:
- "address=0.0.0.0"
- "bind-address=0.0.0.0"
# Controller Manager exposes etcd SQLite metrics
etcd-expose-metrics: true
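
One way to get that file onto the master from your workstation is sketched below; the ‘ubuntu’ ssh user is an assumption, substitute whatever user your VMs were built with.

# sketch: copy the config to the K3s master (ssh user 'ubuntu' is an assumption)
scp config.yaml ubuntu@192.168.122.213:/tmp/config.yaml
ssh ubuntu@192.168.122.213 "sudo mkdir -p /etc/rancher/k3s && sudo mv /tmp/config.yaml /etc/rancher/k3s/config.yaml"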

Then restart the k3s service.

sudo systemctl restart k3s

# confirm the service is active again
sudo systemctl status k3s

There will now be listeners for each of these services, which makes them scrapable by Prometheus.

# kubeControllerManager is on port 10252
# kubeScheduler is on port 10251
# kubeProxy is on port 10249

$ sudo netstat -tulnp | grep -e 10252 -e 10251 -e 10249
tcp6 0 0 :::10251 :::* LISTEN 13122/k3s server 
tcp6 0 0 :::10252 :::* LISTEN 13122/k3s server 
tcp6 0 0 :::10249 :::* LISTEN 13122/k3s server

# sanity test that metrics are available
curl -k http://localhost:10252/metrics

Helm custom values file

Before we install the kube-prometheus-stack with Helm, we need to create a custom values file to adjust the default chart values to the K3s cluster.

Disable etcd

K3s uses a SQLite-backed datastore instead of a standard etcd deployment.  This means the default etcd monitoring will not work, so we need to disable those components.

defaultRules:
  rules:
    etcd: false

kubeEtcd:
  enabled: false

Override management component endpoints and ports

The config.yaml we placed on the master K3s node (192.168.122.213) exposed the Controller Manager metrics on port 10252, and we need to specify that endpoint address and port so Prometheus knows where to scrape.

Similarly, we must specify the endpoint IP and port for the Scheduler and Proxy.  Without these explicit settings, the monitoring stack would not be able to find these metrics.

Using the loopback address 127.0.0.1 results in a Helm installation error and cannot be used.  Hostnames are also rejected; only IP addresses are allowed for the endpoint.  Notice that ‘endpoints’ is an array, so if you had 3 HA masters, you would specify all 3 IP addresses.  A quick way to verify the resulting services and endpoints after installation is shown after the values below.

# matched to service port 'prom-stack-kube-prometheus-kube-controller-manager' -n kube-system
kubeControllerManager:
  enabled: true
  endpoints: ['192.168.122.213']
  service:
    enabled: true
    port: 10252
    targetPort: 10252
  serviceMonitor:
    enabled: true
    https: false

# matched to service port 'prom-stack-kube-prometheus-kube-scheduler' -n kube-system
kubeScheduler:
  enabled: true
  endpoints: ['192.168.122.213']
  service:
    enabled: true
    port: 10251
    targetPort: 10251
  serviceMonitor:
    enabled: true
    https: false

# matched to service port 'prom-stack-kube-prometheus-kube-proxy' -n kube-system
kubeProxy:
  enabled: true
  endpoints: ['192.168.122.213']
  service:
    enabled: true
    port: 10249
    targetPort: 10249
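
Once the chart is installed later in this article, you can verify that the kube-system services and endpoints created for these components point at the master IP.  The service name prefix follows the ‘prom-stack’ release name used in the install step.

# check the services and endpoints created for the management components
kubectl get svc,endpoints -n kube-system | grep prom-stack-kube-prometheus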

External storage

Be sure to use external storage for AlertManager and Prometheus.  If you do not, an emptyDir is used, which only lasts for the lifetime of the pod.

Even the K3s local-path storage class only provides node-level durability, which is not sufficient on a multi-node cluster.

The storage can be set at alertmanager.alertmanagerSpec.storage and prometheus.prometheusSpec.storageSpec, as sketched below.
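
Here is a minimal sketch of those values using the ‘nfs-client’ storageclass from earlier; the volume sizes are assumptions, size them for your retention needs.

alertmanager:
  alertmanagerSpec:
    storage:
      volumeClaimTemplate:
        spec:
          storageClassName: nfs-client
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 5Gi

prometheus:
  prometheusSpec:
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: nfs-client
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 20Gi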

Full custom values file

I have a full example values file for a K3s cluster at prom-sparse.expanded.yaml.

I have included settings for ingress, exposing AlertManager, Grafana, and Prometheus at subdomains, but these are not specific to K3s.
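
For illustration, exposing Grafana through the chart’s ingress settings looks roughly like the sketch below, where the hostname is a placeholder for your own subdomain.

grafana:
  ingress:
    enabled: true
    hosts:
      - grafana.k3s.example.com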

Run Helm installation

Add the Helm repository.

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts

# validate helm repo was added
helm repo list

# make sure we have the latest chart
helm repo update prometheus-community

Install the chart using Helm.

# create the namespace
kubectl create ns prom

# install
helm install \
  --namespace prom \
  -f prom-sparse.yaml \
  prom-stack prometheus-community/kube-prometheus-stack

If you modify the custom values file and need to update the release, change ‘install’ to ‘upgrade’.

helm upgrade \
  --namespace prom \
  -f prom-sparse.yaml \
  prom-stack prometheus-community/kube-prometheus-stack

Validate status of installation.

# releases in all namespaces
helm list -A
# releases just in 'prom' namespace
helm list -n prom

# check status of prometheus stack release
helm status prom-stack -n prom
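
To confirm Prometheus is actually scraping the controller manager, scheduler, and proxy, port-forward the Prometheus service and check its targets page.  The service name below follows the ‘prom-stack’ release naming used above; adjust it if your release name differs.

# port-forward the Prometheus UI, then browse http://localhost:9090/targets
kubectl port-forward -n prom svc/prom-stack-kube-prometheus-prometheus 9090:9090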

 

REFERENCES

rancher.com, k3s docs

rancher.com, settings for master

rancher.com, settings for workers

longhorn.io, alternative storageclass for cluster

github, fully populated values.yaml for kube-prometheus-stack

kruschecompany, prometheus operator with helm

digitalocean, prometheus operator setup

fosstechnix, prometheus and grafana without operator

github, docs for prometheus operator, describes ‘externalLabels’

joekreager, prometheus operator and blackbox

github, kube-state-metrics values.yaml, subchart of kube-prometheus-stack

sysdig, monitoring of etcd with scrape config

 

NOTES

show values available to chart

helm show values prometheus-community/kube-prometheus-stack

show port values being used by K3s embedded components

sudo journalctl -u k3s --no-pager | grep 'Running kube-scheduler ' | tail -n1 | grep -Po '\-\-bind-address=([^ ]*)|\-\-secure-port=(\d*)'
sudo journalctl -u k3s --no-pager | grep 'Running kube-controller-manager ' | tail -n1 | grep -Po '\-\-bind-address=([^ ]*)|\-\-secure-port=(\d*)'
sudo journalctl -u k3s --no-pager | grep 'Running kubelet ' | tail -n1 | grep -Po '\-\-address=(.*?) '
sudo journalctl -u k3s --no-pager | grep 'Running cloud-controller-manager ' | tail -n1 | grep -Po '\-\-bind-address=([^ ]*)'
sudo journalctl -u k3s --no-pager | grep 'Running kube-apiserver ' | tail -n1 | grep -Po '\-\-advertise-address=([^ ]*) |\-\-advertise-port=(\d*)'