Prometheus: installing kube-prometheus-stack on a kubeadm cluster

The kube-prometheus-stack bundles the Prometheus Operator, monitors/rules, Grafana dashboards, and AlertManager needed to monitor a Kubernetes cluster. But there are customizations necessary to tailor the Helm installation for a Kubernetes cluster built using kubeadm. In this article, I will detail the necessary modifications to deploy a healthy monitoring stack on a kubeadm cluster.
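
As a quick orientation (release name, namespace, and values file name here are illustrative), the chart is installed from the prometheus-community Helm repository and pointed at a custom values file carrying the kubeadm-specific overrides:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace \
  --values custom-values.yaml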

Prometheus: monitoring services using additional scrape config for Prometheus Operator

If you are running the Prometheus Operator (e.g. with kube-prometheus-stack) then you can specify additional scrape config jobs to monitor your custom services. An additional scrape config uses regex evaluation to find matching services en masse, and targets a set of services based on label, annotation, namespace, or name. Note that adding an additional scrape …
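
As a minimal sketch of what such an additional scrape job might look like in the kube-prometheus-stack values file (the job name is illustrative and the annotation follows the common prometheus.io/scrape convention):

prometheus:
  prometheusSpec:
    additionalScrapeConfigs:
      - job_name: custom-services            # illustrative job name
        kubernetes_sd_configs:
          - role: endpoints
        relabel_configs:
          # keep only endpoints whose Service carries the prometheus.io/scrape=true annotation
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
            action: keep
            regex: "true"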

Prometheus: monitoring a custom Service using ServiceMonitor and PrometheusRule

If you are running the Prometheus Operator as part of your monitoring stack (e.g. kube-prometheus-stack) then you can have your custom Service monitored by defining a ServiceMonitor CRD. The ServiceMonitor is an object that defines the service endpoints that should be scraped by Prometheus and at what interval. In this article, we will deploy a …
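
A minimal ServiceMonitor sketch is shown below (names, labels, and the port name are hypothetical; the 'release' label assumes the Operator was installed as a Helm release named kube-prometheus-stack). A PrometheusRule for alerting follows a similar pattern.

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-service-monitor                 # hypothetical name
  namespace: monitoring
  labels:
    release: kube-prometheus-stack         # must match the Operator's serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: my-service                      # hypothetical label on the target Service
  namespaceSelector:
    matchNames: [default]
  endpoints:
    - port: metrics                        # named port on the Service
      interval: 30s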

Prometheus: adding a Grafana dashboard using a ConfigMap

If your Grafana deployment is using a sidecar to watch for new dashboards defined as a ConfigMap, then adding a dashboard is a dynamic operation that can be done without even restarting the pod. If you have deployed the Prometheus/Grafana stack with kube-prometheus-stack, then you can check for the existence of the ‘grafana-sc-dashboard’ sidecar using: …
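
One way to perform such a check, and a skeleton of a dashboard ConfigMap the sidecar would pick up (namespace, names, and the dashboard JSON are placeholders; 'grafana_dashboard' is the default label the sidecar watches for):

kubectl get pods -n monitoring -l app.kubernetes.io/name=grafana \
  -o jsonpath='{.items[*].spec.containers[*].name}'

apiVersion: v1
kind: ConfigMap
metadata:
  name: my-dashboard                 # placeholder name
  namespace: monitoring
  labels:
    grafana_dashboard: "1"           # default label watched by the sidecar
data:
  my-dashboard.json: |
    { "title": "My Dashboard", "panels": [] }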

Java: build OCI compatible image for Spring Boot web app using jib

While working on your Spring Boot web application locally, gradle provides the ‘bootRun’ for a quick development lifecycle and ‘bootJar’ for packaging all the dependencies as a single jar deliverable. But for most applications these days, you will need this packaged into an OCI compatible (i.e. Docker) image for its ultimate deployment to an orchestrator …
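
Assuming the com.google.cloud.tools.jib Gradle plugin has been applied to the project (image names below are illustrative), building an image looks roughly like:

# push the image straight to a registry
./gradlew jib --image=registry.example.com/myapp:1.0.0

# or build into the local Docker daemon instead
./gradlew jibDockerBuild --image=myapp:1.0.0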

Prometheus: external template for AlertManager html email with kube-prometheus-stack

The kube-prometheus-stack bundles AlertManager for taking action on Prometheus alerts. And if you are customizing the Helm custom values file to configure email alerting, there are multiple options available. The simplest is to allow the system to fall back to using the default subject and html templates. But if you need to tailor the email content …
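
As a rough, abridged sketch of the external-template approach (receiver, addresses, and template name are hypothetical; SMTP settings are omitted), the custom values file can ship the template via templateFiles and reference it from the email receiver:

alertmanager:
  templateFiles:
    custom-email.tmpl: |
      {{ define "email.custom.html" }}
      <h2>{{ .CommonLabels.alertname }}</h2>
      {{ end }}
  config:
    receivers:
      - name: ops-email                              # hypothetical receiver, SMTP details omitted
        email_configs:
          - to: ops@example.com
            html: '{{ template "email.custom.html" . }}'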

Prometheus: exposing Prometheus/Grafana as Ingress for kube-prometheus-stack

The kube-prometheus-stack bundles Prometheus, Grafana, and AlertManager for monitoring a Kubernetes cluster. By default, the Ingress of these services is disabled. In this article I will show you how to expose these services with NGINX Ingress either via subdomain (e.g. prometheus.my.domain) or web context (e.g. my.domain/prometheus). You would not want to expose these monitoring applications …
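
For the subdomain style, a hedged sketch of the relevant values (hostnames and the ingress class are illustrative; the web-context style needs additional path and prefix settings covered in the article):

grafana:
  ingress:
    enabled: true
    ingressClassName: nginx
    hosts: [grafana.my.domain]
prometheus:
  ingress:
    enabled: true
    ingressClassName: nginx
    hosts: [prometheus.my.domain]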

Prometheus: installing kube-prometheus-stack on K3s cluster

The kube-prometheus-stack bundles the Prometheus Operator, monitors/rules, Grafana dashboards, and AlertManager needed to monitor a Kubernetes cluster. But there are customizations necessary to tailor the Helm installation for K3s, a lightweight Kubernetes installation. In this article, I will detail the necessary modifications to deploy a healthy monitoring stack on a K3s cluster.
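
One typical adjustment on K3s, where the control-plane components expose their metrics on the node itself, is pointing the component scrape endpoints at the node IP. A hedged values sketch (the address is a placeholder, and the full set of changes is covered in the article):

kubeControllerManager:
  endpoints: [192.168.1.10]      # placeholder K3s node IP
kubeScheduler:
  endpoints: [192.168.1.10]
kubeProxy:
  endpoints: [192.168.1.10]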

Kubernetes: targeting the addition of array items to a multi-document yaml manifest

If you have a Kubernetes yaml manifest that contains multiple documents, targeting a single document for modification while still outputting the other documents untouched can be a challenge. As an example, consider the simple example below where you have a single yaml file that contains: a Namespace, Deployment, and DaemonSet. And we want to add …
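
As one hedged illustration using yq v4 (the field being appended is hypothetical): an update expression wrapped around select() modifies only the matching document, while the other documents pass through unchanged.

# append a toleration only to the DaemonSet document; the Namespace and Deployment are emitted untouched
yq e '(select(.kind == "DaemonSet") | .spec.template.spec.tolerations) += [{"key": "node-role.kubernetes.io/control-plane", "operator": "Exists", "effect": "NoSchedule"}]' manifests.yaml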

Kubernetes: liveness probe for Spring Boot with custom Actuator health check

A Kubernetes liveness and readiness probe is how the kubelet determines the health of a pod. This is often as simple as checking the ability to reach the main service port over TCP or HTTP. But if you are using Spring Boot and have enabled the Actuator dependency, you have the ability to create even …
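
A liveness probe pointed at the Actuator health endpoint might look like the fragment below inside the container spec (port and timings are illustrative):

livenessProbe:
  httpGet:
    path: /actuator/health
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
  failureThreshold: 3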

Java: Creating Docker image for Spring Boot web app using gradle

While working on your Spring Boot web application locally, gradle provides the ‘bootRun’ for a quick development lifecycle and ‘bootJar’ for packaging all the dependencies as a single jar deliverable. But for most applications these days, you will need this packaged into an OCI compatible (i.e. Docker) image for its ultimate deployment to an orchestrator …
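
One common approach (not necessarily the exact Dockerfile used in the article; base image and tags are illustrative) is to build the fat jar with bootJar and copy it into a JRE base image:

# Dockerfile (placed at the project root)
FROM eclipse-temurin:17-jre
COPY build/libs/*.jar /app.jar
ENTRYPOINT ["java","-jar","/app.jar"]

# build the jar, then the image
./gradlew bootJar
docker build -t myapp:1.0.0 .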

Java: adding custom health indicator to Spring Boot Actuator

If you have enabled Actuator in your Spring Boot application, you can add custom status metrics to the standard health check at ‘/actuator/health’. Additionally, your custom health indicator can signal an UP/DOWN status that propagates to the main level status and can then be used by external monitoring/alerting solutions or as an indicator to …
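
For illustration, a hypothetical indicator contributing a "myService" component to /actuator/health could look roughly like:

import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

// hypothetical example; the bean name "myService" becomes the component key under /actuator/health
@Component("myService")
public class MyServiceHealthIndicator implements HealthIndicator {

    @Override
    public Health health() {
        boolean reachable = checkDownstreamDependency();
        return reachable
                ? Health.up().withDetail("downstream", "reachable").build()
                : Health.down().withDetail("downstream", "unreachable").build();
    }

    // placeholder for a real connectivity or resource check
    private boolean checkDownstreamDependency() {
        return true;
    }
}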

Java: Adding custom metrics to Spring Boot Micrometer Prometheus endpoint

If you have enabled Actuator and the ‘micrometer-registry-prometheus’ dependency in your Spring Boot application, then you will have a new ‘/actuator/prometheus’ web endpoint that returns general information about threads, garbage collection, disk, and memory. This information is delivered in standard prometheus formatting as plaintext, with one metric per line. This is exactly the type of …
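
A hypothetical sketch of registering a custom counter with Micrometer (metric and class names are illustrative; the Prometheus registry renders the metric as orders_processed_total):

import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.stereotype.Component;

// hypothetical component that owns a custom counter exposed at /actuator/prometheus
@Component
public class OrderMetrics {

    private final Counter ordersProcessed;

    public OrderMetrics(MeterRegistry registry) {
        this.ordersProcessed = Counter.builder("orders.processed")
                .description("Number of orders processed")
                .register(registry);
    }

    public void recordOrder() {
        ordersProcessed.increment();
    }
}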

GCP: Enable HttpLoadBalancing feature on Cluster to avoid errors when applying BackEndConfig

If you are configuring Istio/ASM ingress gateways with a BackendConfig for specifying health checks, timeouts, or Cloud Armor policies, then you need to ensure that your GKE cluster has the HttpLoadBalancing feature enabled. If this feature is not enabled, you will see an error message like below when attempting to apply the BackendConfig manifest: unable …
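
Enabling the addon and verifying its state looks like this (cluster name and zone are placeholders):

gcloud container clusters update my-cluster --zone us-central1-a \
  --update-addons=HttpLoadBalancing=ENABLED

gcloud container clusters describe my-cluster --zone us-central1-a \
  --format="yaml(addonsConfig.httpLoadBalancing)"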

GCP: running a container on a GKE cluster using Workload Identity

With Workload Identity enabled on a GKE cluster, your container can access Google Cloud API services (Compute Engine, Storage, etc.) using a Kubernetes Service Account (KSA). This is done by having the container run as the KSA, where the KSA has been bound to the Google Service Account (GSA). This is the recommended way of …
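
The binding between the two accounts looks roughly like this (project, namespace, and account names are placeholders):

# allow the KSA to impersonate the GSA
gcloud iam service-accounts add-iam-policy-binding my-gsa@my-project.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:my-project.svc.id.goog[my-namespace/my-ksa]"

# annotate the KSA so pods running as it use the GSA identity
kubectl annotate serviceaccount my-ksa -n my-namespace \
  iam.gke.io/gcp-service-account=my-gsa@my-project.iam.gserviceaccount.com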

Kubernetes: retrieving services and pods network CIDR block from cluster

When configuring networks and loadbalancers, sometimes you need the network CIDR block used by Services of a Kubernetes cluster. There are various ways to pull this information from different Kubernetes implementations, but one trick that works across implementations is looking at the error message from kubectl if you attempt to create a service at an …
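
A sketch of the trick (the service name is arbitrary, and the CIDR shown is just the kubeadm default):

# request an obviously invalid ClusterIP; the rejection message reveals the service CIDR
kubectl create service clusterip cidr-probe --tcp=80:80 --clusterip=1.1.1.1
# ... error similar to: "The range of valid IPs is 10.96.0.0/12"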

GCP: Enabling autoUpgrade for node-pools to reduce manual maintenance

GKE cluster upgrades do not need to be a manual process. GKE clusters can be auto upgraded by subscribing the cluster to an appropriate release channel and assigning a sensible maintenance window. As long as adequate pod disruption budgets, replicas, and ingress are configured, these upgrades can happen without interrupting availability. To check the current …
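
The two relevant knobs, sketched with placeholder names and zone:

# subscribe the cluster to a release channel
gcloud container clusters update my-cluster --zone us-central1-a --release-channel regular

# make sure auto-upgrade is enabled on the node pool
gcloud container node-pools update default-pool --cluster my-cluster --zone us-central1-a --enable-autoupgrade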

Kubernetes: Anthos GKE on-prem 1.11 on nested VMware environment

Anthos GKE on-prem is a managed platform that brings GKE clusters to on-premise datacenters. This product offering brings best practice security measures, tested paths for upgrades, basic monitoring, platform logging, and full enterprise support. Setting up a platform this extensive requires many steps as officially documented here. However, if you want to practice in a …

Kubernetes: major version upgrade of Anthos GKE on-prem from 1.10 to 1.11

Anthos GKE on-prem is a managed platform that brings GKE clusters to on-premise datacenters. In this article, I will be following the steps required to upgrade from Anthos 1.10 to 1.11 on VMware. The instructions provided here assume you have used the Ansible scripts and Seed VM described in my previous Anthos 1.10 installation …

Python: New Relic Agent for Gunicorn app deployed on Kubernetes

Gunicorn is a WSGI HTTP server commonly used to run Flask applications in production. If you are running these types of workloads on a production Kubernetes cluster, you should consider an observability platform such as New Relic to ensure availability, service levels, and visibility into transactions and logging. In a series of previous articles, we …

Python: New Relic instrumentation for Flask app deployed with Gunicorn

Gunicorn is a WSGI HTTP server commonly used to run Flask applications in production. If you are running these types of workloads in production, you should consider an observability platform such as New Relic to ensure availability, service levels, and visibility into transactions and logging. In a previous article, we created a Docker image of …
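
At its simplest, the instrumentation wraps the gunicorn invocation with the New Relic admin script (license key, app name, and the app:app module path are placeholders, supplied here via environment variables):

pip install newrelic
export NEW_RELIC_LICENSE_KEY=0123456789abcdef          # placeholder
export NEW_RELIC_APP_NAME="my-flask-app"               # placeholder
newrelic-admin run-program gunicorn --bind 0.0.0.0:8000 app:app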

Python: Building an image for a Flask app served from Gunicorn

Gunicorn is a WSGI HTTP server commonly used to run Flask applications in production. Running Flask applications directly is great for development and testing of the basic request/response flow, but you need gunicorn to handle production level loads, concurrency, logging, and timeouts. In this article, I will show you how to build a Docker image …
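
A minimal sketch of serving the app with gunicorn (the app:app module path, port, worker count, and timeout are illustrative):

pip install flask gunicorn
# serve the Flask object "app" defined in app.py
gunicorn --bind 0.0.0.0:8000 --workers 2 --timeout 60 app:app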

GCP: Moving a VM instance to a different region using snapshots

The ‘gcloud compute instances move‘ command is convenient for moving VM instances from one region to another, but only works within a narrow scope of OS image types and disks. For example, only older non-UEFI OS images can be moved with this command. Trying to move even the simplest Ubuntu bionic/focal or Debian bullseye/buster VM …
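
The snapshot-based route looks roughly like this (instance, disk, and snapshot names plus zones are placeholders; the full procedure in the article covers the remaining instance recreation details):

# snapshot the boot disk of the source VM
gcloud compute disks snapshot my-vm --zone us-central1-a --snapshot-names my-vm-snap

# recreate the disk from the snapshot in the target zone, then build a VM on top of it
gcloud compute disks create my-vm-disk --source-snapshot my-vm-snap --zone europe-west1-b
gcloud compute instances create my-vm-copy --zone europe-west1-b --disk name=my-vm-disk,boot=yes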

GCP: Enable Policy Controller on a GKE cluster

Anthos Policy Controller enables enforcement of compliance, security, and organizational policies on GKE clusters. These might be best-practice policies coming from internal architectural standards, technical policies used to define/constrain resources, or audit requirements stemming from legal regulation. Anthos Policy Controller is built upon the open-source Open Policy Agent (OPA) Gatekeeper, which uses a Kubernetes …

Ubuntu: install latest git client from PPA to fix ‘unsafe repository’ errors

Since the announcement of CVE-2022-24765, newer git clients from the Ubuntu security and archive package repositories may throw errors about “unsafe repository … is owned by someone else” if directories are not owned by your personal user id. First, try to resolve the issue by running the command suggested in the error message.
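
The two remediation paths look like this (the repository path is a placeholder):

# per-repository exception, as suggested by the git error message
git config --global --add safe.directory /path/to/repo

# or install the latest upstream git client from the PPA
sudo add-apt-repository ppa:git-core/ppa
sudo apt-get update && sudo apt-get install -y git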

Kubernetes: kustomize with Helm charts

kustomize is typically used to overlay a base set of yaml, but it also has the ability to leverage existing Helm charts, and overlay a set of custom values with HelmChartInflationGenerator. In this article, I will use kustomize to deploy the Bitnami NGINX Helm chart with overridden values that provide a customized nginx.conf and custom …
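
A hedged sketch of a kustomization.yaml using the helmCharts field (which is backed by the HelmChartInflationGenerator builtin); the release name, chart version, and values file are placeholders, and the build requires the --enable-helm flag:

# kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
helmCharts:
  - name: nginx
    repo: https://charts.bitnami.com/bitnami
    releaseName: my-nginx            # placeholder release name
    version: 13.2.0                  # placeholder chart version
    valuesFile: nginx-values.yaml    # overridden values (custom nginx.conf, etc.)

# render with helm support enabled
kustomize build --enable-helm .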

Kubernetes: kustomize transformations with patchesStrategicMerge

The power of kustomize lies in its ability to transform yaml, and one of the methods for this is patchesStrategicMerge. Where the strategic merge patch excels is in inserting elements and replacing values, allowing you to specify the desired patch using the same indentation level as the target, which makes the intended result very intuitive.
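
For illustration (resource and patch file names are hypothetical), a strategic merge patch only needs enough fields to identify the target plus the values being inserted or replaced:

# kustomization.yaml
resources:
  - deployment.yaml
patchesStrategicMerge:
  - patch-replicas.yaml

# patch-replicas.yaml: matched to the target by apiVersion/kind/name, replaces replicas
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment
spec:
  replicas: 3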