Good practices

This document describes the different properties a NAIS application should have.

Handles termination gracefully

The application should listen for the SIGTERM signal and prepare for shutdown (closing connections etc.) when it receives it.

When running on NAIS (or on Kubernetes in general), your application must be able to handle being shut down at any time. The platform might have to reboot the node your application is running on (e.g. because of an OS patch that requires a restart), in which case it will reschedule your application on a different node.

To best be able to handle this in your application, it helps to be aware of the relevant parts of the termination lifecycle.

  1. Application (pod) gets status TERMINATING, and grace period starts (default 30s)
  2. (simultaneous with 1) If the pod has a preStop hook defined, this is invoked
  3. (simultaneous with 1) The pod is removed from the list of endpoints i.e. taken out of load balancing
  4. (simultaneous with 1, but after preStop if defined) Container receives SIGTERM, and should prepare for shutdown
  5. Grace period ends, and container receives SIGKILL
  6. Pod disappears from the API, and is no longer visible to clients.

The platform automatically adds a preStop hook that delays termination long enough for e.g. the ingress controller to update its list of endpoints (and thus avoid sending traffic to an application that is terminating).
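
As a rough illustration of step 4 in the lifecycle, the sketch below shows one way a Python process could react to SIGTERM. The names GracefulShutdown, run, do_work and cleanup are hypothetical and not part of NAIS. The handler only sets a flag, and the main loop finishes its current unit of work before cleaning up.

```python
import signal
import time


class GracefulShutdown:
    """Remembers that SIGTERM was received (hypothetical helper)."""

    def __init__(self):
        self.requested = False

    def install(self):
        # Must be called from the main thread; replaces the default SIGTERM action.
        signal.signal(signal.SIGTERM, self._handle)

    def _handle(self, signum, frame):
        # Keep the handler tiny: just record that shutdown was requested.
        self.requested = True


def run(do_work, cleanup, poll_interval=0.1):
    """Call do_work() repeatedly until SIGTERM arrives, then call cleanup() once."""
    shutdown = GracefulShutdown()
    shutdown.install()
    try:
        while not shutdown.requested:
            do_work()
            time.sleep(poll_interval)
    finally:
        cleanup()  # e.g. close connections, flush buffers
```

Whatever cleanup does has to finish within the grace period (30s by default), since the container receives SIGKILL when the grace period ends.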

Exposes relevant application metrics

The application should be instrumented using Prometheus, exposing the relevant application metrics. See the metrics documentation for more info.
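
To give an idea of what an exposed metric looks like, here is a minimal hand-rolled sketch of a counter rendered in the Prometheus text exposition format. It is for illustration only: the Counter class and the metric name myapp_requests_total are made up, and in practice you would normally use a Prometheus client library for your language rather than write this yourself.

```python
import threading


class Counter:
    """A monotonically increasing counter (illustrative, not a real client library)."""

    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        self._value = 0
        self._lock = threading.Lock()

    def inc(self, amount=1):
        if amount < 0:
            raise ValueError("counters can only increase")
        with self._lock:
            self._value += amount

    def render(self):
        """Return the metric as Prometheus text format: HELP, TYPE, then the sample."""
        with self._lock:
            value = self._value
        return (
            f"# HELP {self.name} {self.help_text}\n"
            f"# TYPE {self.name} counter\n"
            f"{self.name} {value}\n"
        )


requests_total = Counter("myapp_requests_total", "Total requests handled.")
requests_total.inc()
print(requests_total.render(), end="")
```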

Writes structured logs to stdout

The application should emit JSON-formatted logs by writing directly to standard output. This makes it easier to index, view and search the logs later. See more details in the logs documentation.
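
A minimal sketch of JSON logging to stdout with Python's standard logging module follows. The field names (timestamp, level, logger, message, stack_trace) and the helper build_logger are assumptions chosen for the example; check the logs documentation for the fields the platform actually expects.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Formats each log record as a single line of JSON."""

    def format(self, record):
        entry = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            entry["stack_trace"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def build_logger(name="app", stream=None):
    """Return a logger that writes one JSON object per line (stdout by default)."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate, non-JSON output via the root logger
    return logger


log = build_logger()
log.info("application started on port %d", 8080)
```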

Crashes on fatal errors

If the application reaches an unrecoverable error, you should let it crash. Rather than trying to keep running in a broken state, exit the process immediately and let the kubelet restart the container.

Restarting the container also gives any dependencies that were unavailable time to become ready before the application tries again.
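
One way to sketch this in Python is shown below. FatalError and main are hypothetical names for the example: the idea is simply that an unrecoverable error is reported once and the process exits with a non-zero status instead of retrying forever.

```python
import sys


class FatalError(Exception):
    """Hypothetical marker for errors the application cannot recover from."""


def main(run_app):
    """Run the application; on a fatal error, report it and exit non-zero."""
    try:
        run_app()
    except FatalError as err:
        print(f"fatal: {err}", file=sys.stderr)
        sys.exit(1)  # non-zero exit code marks the container as failed
```

With a non-zero exit code the container is reported as failed, and the kubelet restarts it according to the pod's restart policy.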

Implements readiness and liveness endpoints

The readiness-probe is used by Kubernetes to determine whether the application should receive traffic, while the liveness-probe lets Kubernetes know whether your application is alive. If the liveness-probe fails, Kubernetes restarts the container. The two probes should be implemented as separate endpoints, as they usually have different characteristics.

  • liveness-probe should simply return HTTP 200 OK if the main loop is running, and HTTP 5xx if not.
  • readiness-probe should return HTTP 200 OK if the application is able to process requests, and HTTP 5xx if not. If the application depends on e.g. a database to serve traffic, it's a good idea to check that the database is available in the readiness-probe.
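
The two probes described above could be sketched with Python's built-in HTTP server as follows. The paths /internal/isalive and /internal/isready, and the functions probe_status and make_handler, are assumptions made for this example; use whatever paths your application's configuration points the probes at.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def probe_status(path, is_alive, is_ready):
    """Map a probe path to an HTTP status code, or None for unknown paths."""
    checks = {"/internal/isalive": is_alive, "/internal/isready": is_ready}
    check = checks.get(path)
    if check is None:
        return None
    return 200 if check() else 503


def make_handler(is_alive, is_ready):
    """Build a request handler that answers the two probe endpoints."""

    class ProbeHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            status = probe_status(self.path, is_alive, is_ready)
            if status is None:
                self.send_error(404)
                return
            body = b"ok" if status == 200 else b"unavailable"
            self.send_response(status)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep probe traffic out of the application logs

    return ProbeHandler


if __name__ == "__main__":
    # Hypothetical checks: always alive, ready only once the database responds.
    database_available = lambda: True
    handler = make_handler(is_alive=lambda: True, is_ready=database_available)
    HTTPServer(("0.0.0.0", 8080), handler).serve_forever()
```

Keeping the liveness check trivial and putting dependency checks only in the readiness check means a database outage takes the pod out of load balancing without triggering restarts.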

Useful resources on the topic: