Alert on vulnerabilities

This guide shows you how to set up alerts when new critical vulnerabilities are detected in your workloads.

Nais exposes Prometheus metrics for vulnerability status and risk per workload. You can use these to alert your team via Slack or Grafana.

Prerequisites

Alert on critical vulnerabilities

Create a PrometheusRule in your namespace that triggers an alert when the number of critical vulnerabilities exceeds zero.

.nais/alert-vulnerabilities.yaml

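The manifest itself is not reproduced here, so the following is only a minimal sketch of what such a rule might look like, not the canonical Nais example. The metric name nais_workload_vulnerabilities comes from this guide. The label names `workload_namespace`, `workload`, and `severity`, the value `CRITICAL`, the `my-team` placeholder, and the rule and alert names are all assumptions; check them against the labels the metric actually exposes before deploying.

```yaml
# Sketch only: metric label names and values below are assumptions.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: vulnerability-alerts          # hypothetical name
  namespace: my-team                  # replace with your team namespace
spec:
  groups:
    - name: vulnerability-alerts
      rules:
        - alert: CriticalVulnerabilitiesDetected
          # Fires when a workload reports more than zero critical vulnerabilities.
          # `workload_namespace` and `severity="CRITICAL"` are assumed label names/values.
          expr: nais_workload_vulnerabilities{workload_namespace="my-team", severity="CRITICAL"} > 0
          # Require the condition to hold for a while to avoid flapping.
          for: 5m
          labels:
            namespace: my-team        # assumed to be used for routing to the team
            severity: critical
          annotations:
            # `workload` is an assumed label on the metric.
            summary: "Critical vulnerabilities detected in {{ $labels.workload }}"
            description: "{{ $value }} critical vulnerabilities are currently reported."
```

The `for` duration is a judgment call: a longer window reduces noise, at the cost of a later notification.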

Alert on high risk score

You can also alert based on nais_workload_risk_score if you prefer a single aggregated alert per workload instead of one per severity level.

The risk score is a weighted sum of the workload's vulnerability counts, with more severe vulnerabilities weighted more heavily.

A workload with 1 critical vulnerability scores 10, while one with 20 critical vulnerabilities scores 200. Choose a threshold that matches your team's risk tolerance. A threshold of 200 is a reasonable starting point; it corresponds roughly to 20 critical or 40 high-severity vulnerabilities.

.nais/alert-risk-score.yaml

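As a rough illustration, a risk score rule could follow the same shape as the per-severity rule. This is a sketch under assumptions: nais_workload_risk_score and the threshold of 200 come from this guide, while the `workload_namespace` and `workload` labels, the `my-team` placeholder, and the rule and alert names are hypothetical.

```yaml
# Sketch only: metric label names below are assumptions.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: risk-score-alerts             # hypothetical name
  namespace: my-team                  # replace with your team namespace
spec:
  groups:
    - name: risk-score-alerts
      rules:
        - alert: HighWorkloadRiskScore
          # One aggregated alert per workload instead of one per severity level.
          # `>=` is used so that exactly 20 critical vulnerabilities (score 200) also fires.
          expr: nais_workload_risk_score{workload_namespace="my-team"} >= 200
          for: 10m
          labels:
            namespace: my-team        # assumed to be used for routing to the team
            severity: warning
          annotations:
            # `workload` is an assumed label on the metric.
            summary: "High risk score for {{ $labels.workload }}"
            description: "Risk score is {{ $value }} (threshold: 200)."
```

Lowering the threshold makes the alert more sensitive; raising it limits notifications to workloads with a large backlog of vulnerabilities.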

Activate the alert

Add the file to your application repository and deploy it with the Nais GitHub Action.

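The original snippet is not shown here. One way this step might look is a GitHub Actions workflow that passes the alert files to the Nais deploy action. This is a sketch, not the official workflow: the workflow file name, trigger, and `dev-gcp` cluster name are placeholders, and the `CLUSTER` and `RESOURCE` inputs should be verified against the Nais deploy action documentation.

```yaml
# .github/workflows/deploy-alerts.yaml (hypothetical file name)
name: Deploy vulnerability alerts
on:
  push:
    branches: [main]
    paths:
      - ".nais/alert-*.yaml"          # only run when an alert file changes
jobs:
  deploy:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      id-token: write                 # assumed to be needed for authenticating the deploy
    steps:
      - uses: actions/checkout@v4
      - uses: nais/deploy/actions/deploy@v2
        env:
          CLUSTER: dev-gcp            # placeholder: your target cluster
          # Comma-separated list of resource files to apply.
          RESOURCE: .nais/alert-vulnerabilities.yaml,.nais/alert-risk-score.yaml
```

If the alerts already live alongside your application manifest, adding the files to the `RESOURCE` list of your existing deploy workflow is likely simpler than a separate workflow.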

Alert to a dedicated Slack channel

By default, alerts are sent to your team's standard Slack channel. If you want to send vulnerability alerts to a dedicated channel, create an AlertmanagerConfig.

See Advanced Prometheus alerting for a complete example with a custom Slack channel and webhook.
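For orientation before following that link, an AlertmanagerConfig for a dedicated channel might look roughly like the sketch below. Everything specific in it is an assumption: the resource and receiver names, the `alert_type` routing label (which the PrometheusRule would also have to set on its alerts), the channel name, and the `slack-webhook` secret with key `apiUrl`. Nais may require additional labels or settings, so treat the linked guide as authoritative.

```yaml
# Sketch only: names, labels, channel, and secret are hypothetical.
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: vulnerability-slack
  namespace: my-team                  # replace with your team namespace
spec:
  route:
    receiver: vulnerability-slack
    groupBy:
      - alertname
    matchers:
      # Only alerts carrying this label are routed to the dedicated channel.
      - name: alert_type
        matchType: "="
        value: vulnerability
  receivers:
    - name: vulnerability-slack
      slackConfigs:
        - channel: "#my-team-vulnerabilities"
          sendResolved: true
          # Webhook URL is read from a Kubernetes secret in the same namespace.
          apiURL:
            name: slack-webhook
            key: apiUrl
          title: "{{ .CommonAnnotations.summary }}"
          text: "{{ .CommonAnnotations.description }}"
```

Routing on a dedicated label rather than on individual alert names means new vulnerability rules are picked up without touching the routing configuration.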

Alert in Grafana

Alternatively, you can create alerts directly in Grafana without deploying Kubernetes resources:

  1. Open Grafana and go to Alerting → Alert rules
  2. Click New alert rule
  3. Select the Prometheus/Mimir data source that contains the nais_workload_* metrics and query one of them, e.g. nais_workload_vulnerabilities or nais_workload_risk_score
  4. Set the threshold and connect to a Slack contact point

See Create alert in Grafana for a complete step-by-step guide.

Metric update delay after suppression

After you suppress a vulnerability, the database is updated immediately, but the Prometheus metrics (nais_workload_vulnerabilities, nais_workload_risk_score) are refreshed on a periodic interval (default: 5 minutes). An active alert may therefore keep firing for up to 5 minutes after suppression.

Learn more