Mission: Health Check
ID | Short Name |
---|---|
|
|
The purpose of this use case is to demonstrate how Kubernetes health checks determine whether a container is still alive (= liveness) and ready to serve traffic (= readiness) for the application’s HTTP endpoints.
To demonstrate this behavior, we will configure a /health HTTP endpoint that Kubernetes probes with HTTP requests. If the container is still alive, the /health endpoint is able to reply, the management platform receives 200 as return code, and no further action is required.
If the HTTP endpoint doesn’t return a response (JVM no longer running, thread blocked, etc.), the platform kills the pod and creates a new container to restart the application.
Because the pod is down for a certain period of time, we can show that the endpoint exposing the service is no longer available: in this case, an HTTP 503 response is returned. The user gets this return code from the Kubernetes proxy, because the management platform has detected that the endpoint used to check whether the container is ready to serve traffic can’t reply. As a consequence, the IP address and port of the server exposing the service are removed from the Kubernetes proxy.
When an application is deployed on top of OpenShift/Kubernetes, it is important to determine whether each container is available and able to serve incoming requests. By implementing the health-check pattern, it becomes possible to monitor the health of the container and whether it is able to serve traffic.
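To make the liveness/readiness decision more concrete, the following Python sketch models how a series of probe results could be turned into a healthy/unhealthy verdict. It is a simplified illustration, not the kubelet's actual implementation: the class name `ProbeTracker` and its `record` method are hypothetical, and the two threshold fields only mirror the Kubernetes probe settings `failureThreshold` and `successThreshold`.

```python
from dataclasses import dataclass


@dataclass
class ProbeTracker:
    """Simplified model of one probe (hypothetical helper, not a Kubernetes API)."""

    failure_threshold: int = 3   # mirrors failureThreshold
    success_threshold: int = 1   # mirrors successThreshold
    healthy: bool = True         # assumed starting state for this sketch
    _failures: int = 0
    _successes: int = 0

    def record(self, ok: bool) -> bool:
        """Record one probe result and return the current verdict."""
        if ok:
            # A success resets the run of consecutive failures.
            self._failures = 0
            self._successes += 1
            if self._successes >= self.success_threshold:
                self.healthy = True
        else:
            # A failure resets the run of consecutive successes.
            self._successes = 0
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self.healthy = False
        return self.healthy
```

With the default thresholds, a single failed probe does not change the verdict; only the third consecutive failure does. For a liveness probe that verdict would lead to a restart, for a readiness probe to the pod being taken out of the service endpoints.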
- Health Check using Liveness (= process is alive, JVM started) and Readiness (= ready to serve traffic) probes
- Fail-over
- Resilience
The runtime (Spring Boot, Swarm, Vert.x) provides the code or the jar file containing the /health endpoint.
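As an illustration of what such an application exposes, here is a minimal sketch of the /health endpoint next to the /api/greeting and /api/killme endpoints used in this use case. It is written in Python with the standard library only, as a language-neutral stand-in for the JVM runtimes, and every name in it (`AppState`, `route`, `Handler`, `main`) is hypothetical. One deliberate simplification: a real runtime stops answering after "killme", whereas this sketch flips a flag and answers 503 itself, which an HTTP probe also counts as a failure.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


class AppState:
    """Holds the single flag that /api/killme flips (hypothetical)."""

    def __init__(self):
        self.alive = True


def route(path, state):
    """Map a request path to (status_code, body_dict)."""
    if path == "/api/killme":
        state.alive = False
        return 200, {"status": "stopping"}  # response body is an assumption
    if not state.alive:
        # Stand-in for a hung or stopped server: every request now fails.
        return 503, {"status": "DOWN"}
    if path == "/api/greeting":
        return 200, {"content": "Hello, World!"}
    if path == "/health":
        return 200, {"status": "UP"}
    return 404, {"error": "not found"}


class Handler(BaseHTTPRequestHandler):
    state = AppState()  # shared by all requests in this sketch

    def do_GET(self):
        status, body = route(self.path, self.state)
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def main():
    # Port 8080 matches the port the probes in this use case target.
    HTTPServer(("0.0.0.0", 8080), Handler).serve_forever()
```

Calling `main()` starts the server locally; keeping the routing in a plain function makes the behavior easy to check without opening a socket.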
The use case starts when the application has been deployed into OpenShift. The user can access the application using a web page provided by the application where the following scenario will be proposed:
1. Click on the greeting service button to call /api/greeting.
2. Verify that a JSON response message is received: {"content": "Hello, World!"}
3. Click on the /api/killme button and wait until a response timeout message is displayed.
4. Click the greeting service button again.
5. Verify that you now get an HTTP 503 response, which means that Kubernetes has removed the service because the pod has been killed and the readiness probe can’t reply.
6. Wait long enough for Kubernetes to detect that the pod has been killed and to recreate a new one. This delay depends on the probe parameter “periodSeconds” (together with “failureThreshold”).
7. Click the /api/greeting button again.
8. Verify that a JSON response message is received as expected: {"content": "Hello, World!"}
Open a Unix/Windows Terminal
-
Retrieve the URL address of the route exposing the service /api/greeting from the OpenShift web console, or by using the OpenShift
oc
client and the commandoc get route/${artifactId}
-
Call the greeting service using the curl client with the command
curl http://<HOST_PORT_ADDRESS>/api/greeting
-
Verify that t a JSON response message is received
{"content": "Hello, World!"}
-
Issue another curl request in order to call the HTTP endpoint responsible to kill the server (or make the response time of the server longer than the probe value expected).
curl http://<HOST_PORT_ADDRESS>/api/killme
-
Call the REST endpoint exposing the greeting service to verify that you will now get a HTTP
503
response which means that the service has been removed by Kubernetes as the pod is killed and readiness probe can’t reply. -
Wait a sufficient amount of time to let the time to Kubernetes to detect that the pod is killed to recreate a new one. This value corresponds to the parameter “periodSeconds”
-
Call the greeting service using the curl client and the following request
curl http://<HOST_PORT_ADDRESS>/api/greeting
-
Verify that a JSON response message is received as expected
{"content": "Hello, World!"}
-
Call the
/health
endpoint to get a HTTP200
response but also the status of the health endpoint {"status":"UP"}curl http://<HOST_PORT_ADDRESS>/health
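The "wait, then call the greeting service again" part of the terminal scenario can also be scripted. The sketch below polls /api/greeting until it answers 200 again; it is only an illustration under stated assumptions: the function names `http_get` and `wait_for_greeting`, the polling period and the attempt limit are all hypothetical, and the base URL must be replaced by the route address retrieved earlier.

```python
import json
import time
import urllib.error
import urllib.request


def http_get(url, timeout=5.0):
    """Return (status, body_text); HTTP errors are returned instead of raised."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # e.g. the 503 returned while the pod is out of rotation
        return err.code, err.read().decode("utf-8")
    except OSError:
        # connection refused, DNS failure or socket timeout
        return None, ""


def wait_for_greeting(base_url, get=http_get, sleep=time.sleep,
                      period=10.0, max_attempts=30):
    """Poll /api/greeting until it returns 200; return (attempts, parsed JSON)."""
    for attempt in range(1, max_attempts + 1):
        status, body = get(base_url + "/api/greeting")
        if status == 200:
            return attempt, json.loads(body)
        sleep(period)  # assumed polling period, chosen to match periodSeconds
    raise TimeoutError("greeting service did not recover in time")
```

The `get` and `sleep` parameters exist only so the loop can be exercised without a cluster, for example by passing a function that returns a scripted sequence of responses.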
Note: Steps 1 to 10 do not show visually what happens behind the scenes when Kubernetes checks whether the pod is ready/alive, removes the endpoint from the Kubernetes API gateway and recreates it.
A more dynamic approach could be developed that combines a video like this one: https://www.dropbox.com/s/j5747pwkzfj5o7m/kube-liveness-readiness.mov?dl=0
with the step-by-step instructions described previously.
During nominal operation, a curl or HTTP request issued against the following service
$protocol://$hostname:$port/api/greeting
returns {"content": "Hello, World!"}
If the pod is killed (and for a period of x seconds), the same request gets an HTTP 503 (Service Unavailable) response.
Swarm uses its internal feature to “suspend” the server as a means to simulate a non-responsive service.
The use case will consist of:
- Develop an HTTP application which exposes 3 endpoints: /api/greeting, /health and /api/killme. The greeting endpoint will return a JSON Hello World message, while the killme endpoint will be used to stop the server.
- Create a deployment.yaml file under the directory src/main/fabric8. It will contain the definition of the readiness and liveness probes. Both will be set up against the endpoint /health on port 8080. The initial delay, period and threshold will be defined as follows:
...
livenessProbe:
  failureThreshold: 3
  httpGet:
    path: /health
    port: 8080
    scheme: HTTP
  initialDelaySeconds: 180
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 1
...
readinessProbe:
  failureThreshold: 3
  httpGet:
    path: /health
    port: 8080
    scheme: HTTP
  initialDelaySeconds: 10
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 1
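To get a feel for how long the "wait" steps of the scenario may take with these values, the arithmetic can be sketched as follows. This is a back-of-the-envelope estimate, not a statement of exact kubelet behavior: it assumes probes fire exactly every periodSeconds, that a hung container makes each probe fail after timeoutSeconds, and it ignores the time needed to restart the container. The `Probe` class and its method are hypothetical helpers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    """Probe settings copied from the configuration above (helper is hypothetical)."""

    initial_delay_seconds: int
    period_seconds: int
    failure_threshold: int
    timeout_seconds: int

    def detection_window_seconds(self):
        """Rough (fastest, slowest) delay between a hang and the probe being marked failed."""
        # Fastest: the hang happens just before a probe fires.
        fastest = (self.failure_threshold - 1) * self.period_seconds + self.timeout_seconds
        # Slowest: the hang happens just after a successful probe.
        slowest = self.failure_threshold * self.period_seconds + self.timeout_seconds
        return fastest, slowest


liveness = Probe(initial_delay_seconds=180, period_seconds=10,
                 failure_threshold=3, timeout_seconds=1)
readiness = Probe(initial_delay_seconds=10, period_seconds=10,
                  failure_threshold=3, timeout_seconds=1)

print(liveness.detection_window_seconds())
print(readiness.detection_window_seconds())
```

Under these assumptions both probes need roughly 21 to 31 seconds to report a failure, and the liveness probe does not start checking until 180 seconds after the container starts, whereas the readiness probe starts after 10 seconds.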
PM | ☑ | |
DevExp | ☑ | |
Vert.x | Clement Escoffier (Pending tuning of the right times to make user experience acceptable) | ☑ |
WildFly Swarm | Heiko Braun (Pending my comment on the /health protocol) | ☑ |
Spring Boot | ☑ | |
QE | ☑ | |
Docs | ☑ | |
DevExp | ☑ | |
Architect | ☑ |