
Understanding Kubernetes health probes


One of the most common pitfalls I see when people start working with K8s is a misconfigured health probe that prevents the container from stabilising.

It often looks like this:

• The container works fine locally.

• It keeps restarting and failing on K8s, with no error messages in the logs.

Understanding what's happening

When you deploy your container to K8s, the kubelet (the agent on each node that runs your containers) needs a way to verify that the container is actually doing what you’ve defined. When you define a health probe for a container in your pod, the kubelet uses that probe to ensure the container is running as intended.

Why Health Checks Matter

Besides that, when you deploy an application in Kubernetes, you want to be sure it’s ready to serve requests and that it stays alive. You might need to do some pre-start work before serving requests, like loading an ML model or opening a database connection.

How K8s uses health probes

When you deploy a container on K8s, it goes through a sequence of states that looks something like this:

State flow of the K8s health check for a scheduled pod

Startup Probe

If your container takes a long time to start, you’ll need a Startup Probe. It makes Kubernetes wait until the application is fully up: the liveness and readiness probes do not run until the startup probe has succeeded. For example, if you are running a JVM-based application or need time to load an ML model, the startup probe gives the container enough time without it being restarted prematurely. At the same time, it tells the kubelet clearly if and when the container has failed to start.
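To make "enough time" concrete, here is a small sketch of the arithmetic. It assumes the usual rule of thumb that a startup probe gives the container roughly initialDelaySeconds + periodSeconds × failureThreshold seconds before the kubelet gives up. The function names are my own, and the result is an approximation that ignores timeoutSeconds and scheduling jitter.

```python
import math


def startup_budget_seconds(initial_delay_seconds: int,
                           period_seconds: int,
                           failure_threshold: int) -> int:
    """Approximate time a container has to pass its startup probe."""
    return initial_delay_seconds + period_seconds * failure_threshold


def min_failure_threshold(startup_seconds: float,
                          period_seconds: int,
                          initial_delay_seconds: int = 0) -> int:
    """Smallest failureThreshold that covers a given worst-case startup time."""
    remaining = max(startup_seconds - initial_delay_seconds, 0)
    return max(1, math.ceil(remaining / period_seconds))


# With the Kubernetes defaults (periodSeconds=10, failureThreshold=3)
# the budget is only about 30 seconds.
default_budget = startup_budget_seconds(0, 10, 3)

# An app that may need two minutes to boot, checked every 10 seconds.
needed_threshold = min_failure_threshold(120, 10)
```

In practice I would add some headroom on top of the computed threshold, since startup times on a busy cluster tend to be slower than on a laptop.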

Liveness Probe

The Liveness Probe checks if your container is still alive. If it fails, Kubernetes restarts the container. This is the check that K8s keeps running for the whole life of the container to make sure your application is still healthy, so that stuck or faulty states are cleared by a restart.
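A single failed check does not trigger a restart on its own: the kubelet waits for failureThreshold consecutive failures (see the attribute table later in this post). The sketch below is a simplified model of that counting, not the kubelet's actual implementation. The function name and the list-of-booleans input are assumptions made for illustration.

```python
def restart_points(results, failure_threshold=3):
    """Return the indexes of the checks after which a restart would happen.

    results: iterable of booleans, True meaning the liveness check passed.
    """
    restarts = []
    consecutive_failures = 0
    for index, passed in enumerate(results):
        if passed:
            # Any success resets the failure streak.
            consecutive_failures = 0
            continue
        consecutive_failures += 1
        if consecutive_failures >= failure_threshold:
            restarts.append(index)
            # The container is restarted, so counting starts over.
            consecutive_failures = 0
    return restarts


# Two failures followed by a success do not restart the container;
# three failures in a row do.
checks = [True, False, False, True, False, False, False, True]
restarts = restart_points(checks)
```

This is why a flaky endpoint with a low failureThreshold can restart a perfectly healthy application.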

Readiness Probe

Even if your container is up, it might not be ready to serve requests. The Readiness Probe detects when your application is ready for traffic. If it’s not ready, Kubernetes stops routing Service traffic to the Pod until it passes the check; it does not restart the container.

So if your container reads some state, loads a model or runs a DB migration at startup, you need to set up proper startup and readiness probes that give your pod the time it needs. Otherwise, K8s may interrupt it and restart it before it has finished.
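On the application side, the split between "alive" and "ready" can be as simple as two endpoints. Below is a minimal sketch using only the Python standard library. The paths /healthz and /ready, the port 8080 and the function names are assumptions for illustration, not anything Kubernetes requires. It relies on the fact that an HTTP probe treats any status from 200 to 399 as success.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Set this once the slow startup work (model load, DB connection, ...) is done.
ready = threading.Event()


def probe_status(path: str, is_ready: bool) -> int:
    """Return the HTTP status code to send for a probe request."""
    if path == "/healthz":
        # Liveness: the process is up and able to answer.
        return 200
    if path == "/ready":
        # Readiness: only report success after the warm-up has finished.
        return 200 if is_ready else 503
    return 404


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(probe_status(self.path, ready.is_set()))
        self.end_headers()

    def log_message(self, format, *args):
        # Keep periodic probe requests out of the application logs.
        pass


def serve(port: int = 8080) -> None:
    """Blocking call that serves the probe endpoints on the given port."""
    HTTPServer(("", port), ProbeHandler).serve_forever()
```

A real application would run `serve()` in a background thread (or expose the same endpoints from its web framework) and call `ready.set()` when the startup work completes.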


Anatomy of a health probe

Each health probe consists of a few attributes that tell K8s how, when and how many times it should run the check.

| Attribute | Description |
| --- | --- |
| initialDelaySeconds | Seconds to wait after the container starts before performing the first check. |
| periodSeconds | Interval (in seconds) between checks. |
| timeoutSeconds | Maximum time to wait for the probe response before counting it as a failure. |
| failureThreshold | Number of consecutive failures before marking the container unhealthy/unready. |
| successThreshold | Number of consecutive successful checks required to mark a failing container as healthy again. |
| exec | Command executed inside the container to verify health; exit code 0 means success. |
| httpGet | HTTP GET request to a specified path and port to check health. |
| tcpSocket | TCP connection attempt on a specified port to verify container availability. |
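To show where these attributes sit in a pod definition, here is a sketch that builds a manifest as a plain Python dictionary and serialises it to JSON. The pod name, image, port, paths and timing values are all hypothetical and only there to make the example complete. The field names (startupProbe, livenessProbe, readinessProbe and the attributes from the table) are the real ones.

```python
import json


def http_probe(path, port, **timing):
    """Build an httpGet probe; timing keys use the attribute names from the table."""
    return {"httpGet": {"path": path, "port": port}, **timing}


container = {
    "name": "app",                 # hypothetical container name
    "image": "example/app:1.0",    # hypothetical image
    "ports": [{"containerPort": 8080}],
    # Up to 30 x 10s = 300s to finish starting.
    "startupProbe": http_probe("/healthz", 8080,
                               periodSeconds=10, failureThreshold=30),
    # Restart after 3 consecutive failed checks, 10s apart.
    "livenessProbe": http_probe("/healthz", 8080,
                                periodSeconds=10, timeoutSeconds=2,
                                failureThreshold=3),
    # Stop routing traffic after 3 failed checks, resume after 1 success.
    "readinessProbe": http_probe("/ready", 8080,
                                 periodSeconds=5, failureThreshold=3,
                                 successThreshold=1),
}

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "probe-demo"},  # hypothetical pod name
    "spec": {"containers": [container]},
}

manifest_json = json.dumps(pod, indent=2)
```

Writing `manifest_json` to a file gives you something `kubectl apply -f` can consume, since kubectl accepts JSON as well as YAML. Note that successThreshold must be 1 for startup and liveness probes, which is why it is only set explicitly on the readiness probe here.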

Finally

Misconfigured health probes usually lead to frustrating, endless restarts and headaches when deploying to Kubernetes. Choosing the correct probe types and setting the correct thresholds ensures your pods have the time they need to start and become ready.