Kubernetes: deleting a GKE node from a managed instance node pool

If you need to delete a GKE instance from a node pool, you cannot simply treat the node as a raw VM instance. A VM deleted directly is recreated by its managed instance group, so you must instead delete the VM instance from the managed instance group of which it is a member.

A cluster's node pool will have an instance group for each zone. In a regional cluster, the challenge is determining which instance group your suspect node is a member of. Below are the commands for showing all the instance groups belonging to a cluster.

# get list of clusters
gcloud container clusters list

# set variables for cluster
cluster_name=cluster-2
location_flag="--region=us-central1"

# get managed instance groups for specific cluster
gcloud container clusters describe "$cluster_name" $location_flag --format=json | jq '.instanceGroupUrls' | tr -d ' ",' | rev | cut -d/ -f1 | rev

[
gke-cluster-2-default-pool-484b01f9-grp
gke-cluster-2-default-pool-ba82bf6c-grp
gke-cluster-2-default-pool-168508f7-grp
]
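The `[` and `]` lines in the output above are leftovers from the JSON array. As a minimal sketch, the short names can also be produced by a small filter that reads one URL per line and keeps only the last path segment. The function name `ig_short_names` and the project `my-project` in the sample URL are placeholders for illustration, not values from a real cluster.

```shell
# Reduce instance group URLs (one per line on stdin) to their short names.
# ig_short_names is a hypothetical helper, not part of gcloud.
ig_short_names() {
  while IFS= read -r ig_url; do
    # skip blank lines
    [ -n "$ig_url" ] || continue
    # strip everything up to and including the last '/'
    printf '%s\n' "${ig_url##*/}"
  done
}

# Sample URL; "my-project" is a placeholder project id.
sample_url="https://www.googleapis.com/compute/v1/projects/my-project/zones/us-central1-f/instanceGroupManagers/gke-cluster-2-default-pool-168508f7-grp"
printf '%s\n' "$sample_url" | ig_short_names
```

With real data the filter could be fed by `jq -r '.instanceGroupUrls[]'`, which emits one unquoted URL per line without the surrounding brackets.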

The list alone does not tell you which instance group contains your suspect node. To find out, query the membership of each managed instance group.

# extracts full instance group URI
igs=$(gcloud container clusters describe $cluster_name $location_flag --format=json | jq '.instanceGroupUrls' | tr -d ' ",' | tail -n+2 | head -n-1)

# list GKE nodes in each instance group
for igline in $igs; do
  the_zone=$(echo "$igline" | grep -Po "/zones/\K(.*?)(?=/)")
  the_ig=${igline##*/}
  echo "== $the_zone,$the_ig =="
  gcloud compute instance-groups managed list-instances --zone=$the_zone $the_ig --format="value(instance)"
done
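The loop above relies on `grep -P`, which is a GNU extension and is missing from the BSD grep shipped with macOS. A possible portable alternative, sketched here with plain POSIX parameter expansion, extracts the zone from the URL without grep. The helper name `ig_zone` and the `my-project` project id are assumptions made for this example.

```shell
# Print the zone segment of an instance group URL using only
# POSIX parameter expansion. ig_zone is a hypothetical helper.
ig_zone() {
  case "$1" in
    */zones/*/*) ;;            # URL must contain "/zones/<zone>/"
    *) echo "no zone in: $1" >&2; return 1 ;;
  esac
  ig_rest=${1#*/zones/}        # drop everything through "/zones/"
  printf '%s\n' "${ig_rest%%/*}"   # keep the text before the next "/"
}

# Sample URL; "my-project" is a placeholder project id.
sample_url="https://www.googleapis.com/compute/v1/projects/my-project/zones/us-central1-f/instanceGroupManagers/gke-cluster-2-default-pool-168508f7-grp"
ig_zone "$sample_url"
```

Inside the loop, `the_zone=$(ig_zone "$igline")` would then stand in for the grep pipeline.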

This prints each zone and instance group, followed by the GKE nodes that belong to it.

== us-central1-b,gke-cluster-2-default-pool-484b01f9-grp ==
gke-cluster-2-default-pool-484b01f9-4jdw
== us-central1-a,gke-cluster-2-default-pool-ba82bf6c-grp ==
gke-cluster-2-default-pool-ba82bf6c-r729
== us-central1-f,gke-cluster-2-default-pool-168508f7-grp ==
gke-cluster-2-default-pool-168508f7-8kj3
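In the sample output, each node name is the instance group name with the trailing `-grp` replaced by a short random suffix. Assuming that pattern holds for your cluster (it is only an observation from this output, not a documented guarantee), a quick first guess at the group for a given node could look like the sketch below. The function name `guess_ig_for_node` is hypothetical, and the result should still be confirmed with `list-instances` before deleting anything.

```shell
# Guess the managed instance group name from a GKE node name by
# swapping the final "-xxxx" suffix for "-grp".
# ASSUMPTION: relies on the naming seen in the sample output only.
guess_ig_for_node() {
  # ${1%-*} removes the shortest trailing "-something"
  printf '%s-grp\n' "${1%-*}"
}

guess_ig_for_node gke-cluster-2-default-pool-168508f7-8kj3
```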

Now armed with all the necessary pieces of information, you can delete the GKE node instance.

gke_instance_group=gke-cluster-2-default-pool-168508f7-grp
gke_vm_instance=gke-cluster-2-default-pool-168508f7-8kj3
gke_vm_instance_zone=us-central1-f

# delete node instance from managed instance group
gcloud compute instance-groups managed delete-instances "$gke_instance_group" --instances="$gke_vm_instance" --zone="$gke_vm_instance_zone"
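Because the delete is destructive, it may be worth wrapping it in a small guard. The sketch below is one possible approach, not part of gcloud: the function `delete_gke_node` and the `RUN` variable are hypothetical names. It refuses to proceed when any argument is empty (for example, from a misspelled variable name) and only prints the command unless `RUN=1` is set.

```shell
# Hypothetical wrapper around the delete command.
# Prints the command by default; set RUN=1 to actually execute it.
delete_gke_node() {
  if [ "$#" -ne 3 ] || [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ]; then
    echo "usage: delete_gke_node GROUP INSTANCE ZONE" >&2
    return 1
  fi
  # rebuild the positional parameters as the full gcloud command line
  set -- gcloud compute instance-groups managed delete-instances "$1" \
    --instances="$2" --zone="$3"
  if [ "${RUN:-0}" = "1" ]; then
    "$@"          # execute for real
  else
    echo "$*"     # dry run: show what would be executed
  fi
}

# Dry run with the values from this walkthrough.
delete_gke_node gke-cluster-2-default-pool-168508f7-grp \
  gke-cluster-2-default-pool-168508f7-8kj3 us-central1-f
```

Review the printed command first, then repeat the call prefixed with `RUN=1` to perform the deletion.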

REFERENCES

Google documentation, gcloud compute instance-groups managed

Petko’s coding blog, deleting a GKE node from instance group

NOTES
Show zone info for each GKE node

# take note of the zone of your suspect node
# show cluster GKE nodes and their zone
# (on Kubernetes 1.17+ the label topology.kubernetes.io/zone replaces the deprecated failure-domain.beta.kubernetes.io/zone)
kubectl get nodes -o custom-columns='NAME:.metadata.name,ZONE:.metadata.labels.failure-domain\.beta\.kubernetes\.io/zone,VER:.status.nodeInfo.kubeletVersion'