NAIS Alert reference

This document describes all possible configuration values in the Alert spec, commonly known as the alert.yaml file.
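
For orientation, a complete alert.yaml might look like the sketch below. The apiVersion, kind, and metadata values are assumptions that this reference does not state, and myapp-alerts and myteam are hypothetical names; only the fields under spec are documented here.

```yaml
# Sketch only: apiVersion, kind, and metadata are assumptions, not taken from this reference.
apiVersion: nais.io/v1
kind: Alert
metadata:
  name: myapp-alerts      # hypothetical resource name
  namespace: myteam       # hypothetical namespace
  labels:
    team: myteam          # hypothetical team label
spec:
  receivers:
    slack:
      channel: '#alert-channel'
  alerts:
  - alert: application down
    expr: kube_deployment_status_replicas_available{deployment="<appname>"} == 0
    for: 2m
    action: Check the pod events and logs
```

The sketch includes every field marked as required below (action, alert, expr, and for on each alert, and channel on the Slack receiver) and leaves out the optional ones.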

alerts

Type: array
Required: false

Example
spec:
  alerts:
  - action: '`kubectl describe pod {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}`
      for events, and `kubectl logs {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}`
      for logs'
    alert: application down
    description: App {{ $labels.app }} is down in namespace {{ $labels.kubernetes_namespace }}
    documentation: https://doc.nais.io/observability/alerts/
    expr: kube_deployment_status_replicas_available{deployment="<appname>"} == 0
    for: 2m
    priority: "0"
    severity: danger
    sla: Between 8 and 16

alerts[].action

What human actions are needed to resolve or investigate this alert.

Type: string
Required: true

Example
spec:
  alerts:
  - action: '`kubectl describe pod {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}`
      for events, and `kubectl logs {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}`
      for logs'
    alert: application down
    description: App {{ $labels.app }} is down in namespace {{ $labels.kubernetes_namespace }}
    documentation: https://doc.nais.io/observability/alerts/
    expr: kube_deployment_status_replicas_available{deployment="<appname>"} == 0
    for: 2m
    priority: "0"
    severity: danger
    sla: Between 8 and 16

alerts[].alert

The name of the alert.

Type: string
Required: true

Example
spec:
  alerts:
  - action: '`kubectl describe pod {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}`
      for events, and `kubectl logs {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}`
      for logs'
    alert: application down
    description: App {{ $labels.app }} is down in namespace {{ $labels.kubernetes_namespace }}
    documentation: https://doc.nais.io/observability/alerts/
    expr: kube_deployment_status_replicas_available{deployment="<appname>"} == 0
    for: 2m
    priority: "0"
    severity: danger
    sla: Between 8 and 16

alerts[].description

Simple description of the triggered alert.

Type: string
Required: false

Example
spec:
  alerts:
  - action: '`kubectl describe pod {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}`
      for events, and `kubectl logs {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}`
      for logs'
    alert: application down
    description: App {{ $labels.app }} is down in namespace {{ $labels.kubernetes_namespace }}
    documentation: https://doc.nais.io/observability/alerts/
    expr: kube_deployment_status_replicas_available{deployment="<appname>"} == 0
    for: 2m
    priority: "0"
    severity: danger
    sla: Between 8 and 16

alerts[].documentation

URL for documentation for this alert.

Type: string
Required: false

Example
spec:
  alerts:
  - action: '`kubectl describe pod {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}`
      for events, and `kubectl logs {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}`
      for logs'
    alert: application down
    description: App {{ $labels.app }} is down in namespace {{ $labels.kubernetes_namespace }}
    documentation: https://doc.nais.io/observability/alerts/
    expr: kube_deployment_status_replicas_available{deployment="<appname>"} == 0
    for: 2m
    priority: "0"
    severity: danger
    sla: Between 8 and 16

alerts[].expr

Prometheus expression that triggers an alert.

Type: string
Required: true

Example
spec:
  alerts:
  - action: '`kubectl describe pod {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}`
      for events, and `kubectl logs {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}`
      for logs'
    alert: application down
    description: App {{ $labels.app }} is down in namespace {{ $labels.kubernetes_namespace }}
    documentation: https://doc.nais.io/observability/alerts/
    expr: kube_deployment_status_replicas_available{deployment="<appname>"} == 0
    for: 2m
    priority: "0"
    severity: danger
    sla: Between 8 and 16

alerts[].for

How long the expression must hold before the alert triggers.

Type: string
Required: true
Pattern: ^\d+[smhdwy]$

Example
spec:
  alerts:
  - action: '`kubectl describe pod {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}`
      for events, and `kubectl logs {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}`
      for logs'
    alert: application down
    description: App {{ $labels.app }} is down in namespace {{ $labels.kubernetes_namespace }}
    documentation: https://doc.nais.io/observability/alerts/
    expr: kube_deployment_status_replicas_available{deployment="<appname>"} == 0
    for: 2m
    priority: "0"
    severity: danger
    sla: Between 8 and 16
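
As an illustration of the pattern above, the fragment below sketches which values should and should not pass validation. The other required alert fields are omitted for brevity.

```yaml
spec:
  alerts:
  - alert: application down
    for: 2m    # matches: one or more digits followed by exactly one unit letter
    # Other values that match the pattern: 30s, 1h, 7d
    # Values that do not match: 90 (no unit), 1h30m (two units), 500ms (two unit letters)
```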

alerts[].priority

Not in use

Type: string
Required: false

Example
spec:
  alerts:
  - action: '`kubectl describe pod {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}`
      for events, and `kubectl logs {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}`
      for logs'
    alert: application down
    description: App {{ $labels.app }} is down in namespace {{ $labels.kubernetes_namespace }}
    documentation: https://doc.nais.io/observability/alerts/
    expr: kube_deployment_status_replicas_available{deployment="<appname>"} == 0
    for: 2m
    priority: "0"
    severity: danger
    sla: Between 8 and 16

alerts[].severity

Alert level for Slack messages. Accepts good, warning, danger, or a hex color code.

Type: string
Required: false
Pattern: ^$|good|warning|danger|#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})

Example
spec:
  alerts:
  - action: '`kubectl describe pod {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}`
      for events, and `kubectl logs {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}`
      for logs'
    alert: application down
    description: App {{ $labels.app }} is down in namespace {{ $labels.kubernetes_namespace }}
    documentation: https://doc.nais.io/observability/alerts/
    expr: kube_deployment_status_replicas_available{deployment="<appname>"} == 0
    for: 2m
    priority: "0"
    severity: danger
    sla: Between 8 and 16

alerts[].sla

Time before the alert should be resolved.

Type: string
Required: false

Example
spec:
  alerts:
  - action: '`kubectl describe pod {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}`
      for events, and `kubectl logs {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}`
      for logs'
    alert: application down
    description: App {{ $labels.app }} is down in namespace {{ $labels.kubernetes_namespace }}
    documentation: https://doc.nais.io/observability/alerts/
    expr: kube_deployment_status_replicas_available{deployment="<appname>"} == 0
    for: 2m
    priority: "0"
    severity: danger
    sla: Between 8 and 16

inhibitRules

A list of inhibit rules. Read more about it at prometheus.io/docs.

Type: array
Required: false

Example
spec:
  inhibitRules:
  - labels:
    - label
    - lebal
    sources:
      key: value
    sourcesRegex:
      key: value(.)?
    targets:
      key: value
    targetsRegex:
      key: value(.)+

inhibitRules[].labels

Labels that must have an equal value in the source and target alert for the inhibition to take effect. This is a list of label names.

Type: array
Required: false

Example
spec:
  inhibitRules:
  - labels:
    - label
    - lebal
    sources:
      key: value
    sourcesRegex:
      key: value(.)?
    targets:
      key: value
    targetsRegex:
      key: value(.)+

inhibitRules[].sources

Matchers for which one or more alerts have to exist for the inhibition to take effect.

Type: object
Required: false

Example
spec:
  inhibitRules:
  - labels:
    - label
    - lebal
    sources:
      key: value
    sourcesRegex:
      key: value(.)?
    targets:
      key: value
    targetsRegex:
      key: value(.)+

inhibitRules[].sourcesRegex

Regex matchers for which one or more alerts have to exist for the inhibition to take effect. These are key/value pairs, where the value can be a regex.

Type: object
Required: false

Example
spec:
  inhibitRules:
  - labels:
    - label
    - lebal
    sources:
      key: value
    sourcesRegex:
      key: value(.)?
    targets:
      key: value
    targetsRegex:
      key: value(.)+

inhibitRules[].targets

Matchers that have to be fulfilled in the alerts to be muted. These are key/value pairs.

Type: object
Required: false

Example
spec:
  inhibitRules:
  - labels:
    - label
    - lebal
    sources:
      key: value
    sourcesRegex:
      key: value(.)?
    targets:
      key: value
    targetsRegex:
      key: value(.)+

inhibitRules[].targetsRegex

Regex matchers that have to be fulfilled in the alerts to be muted. These are key/value pairs, where the value can be a regex.

Type: object
Required: false

Example
spec:
  inhibitRules:
  - labels:
    - label
    - lebal
    sources:
      key: value
    sourcesRegex:
      key: value(.)?
    targets:
      key: value
    targetsRegex:
      key: value(.)+
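
How these fields combine can be sketched as follows. The sketch assumes, hypothetically, that your alerts carry severity and app labels; substitute the labels your alerts actually have. The rule mutes warning alerts while a danger alert for the same app is firing.

```yaml
spec:
  inhibitRules:
  - sources:
      severity: danger     # an alert matching this must be firing ...
    targets:
      severity: warning    # ... for alerts matching this to be muted ...
    labels:
    - app                  # ... provided both alerts have the same value for app
```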

receivers

A list of notification receivers. You can use one or more of: e-mail, Slack, SMS. There needs to be at least one receiver.

Type: object
Required: false

Example
spec:
  receivers:
    email:
      to: myteam@nav.no
    slack:
      channel: '#alert-channel'
      icon_emoji: ':chart_with_upwards_trend:'
      icon_url: http://lorempixel.com/48/48
      prependText: Oh noes!
      send_resolved: true
      username: Alertmanager
    sms:
      recipients: "12345678"
      send_resolved: false

receivers.email

Alerts via e-mail.

Type: object
Required: false

Example
spec:
  receivers:
    email:
      to: myteam@nav.no

receivers.email.send_resolved

Whether or not to notify about resolved alerts.

Type: boolean
Required: false
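
No example is given for this field. A minimal sketch, mirroring the send_resolved examples for the other receivers:

```yaml
spec:
  receivers:
    email:
      send_resolved: true    # also notify when the alert is resolved
      to: myteam@nav.no
```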

receivers.email.to

The e-mail address to send notifications to.

Type: string
Required: true

Example
spec:
  receivers:
    email:
      to: myteam@nav.no

receivers.slack

Slack notifications are sent via Slack webhooks.

Type: object
Required: false

Example
spec:
  receivers:
    slack:
      channel: '#alert-channel'
      icon_emoji: ':chart_with_upwards_trend:'
      icon_url: http://lorempixel.com/48/48
      prependText: Oh noes!
      send_resolved: true
      username: Alertmanager

receivers.slack.channel

The channel or user to send notifications to. Can be specified with or without #.

Type: string
Required: true

Example
spec:
  receivers:
    slack:
      channel: '#alert-channel'

receivers.slack.icon_emoji

Emoji to use as the icon for this message.

Type: string
Required: false

Example
spec:
  receivers:
    slack:
      icon_emoji: ':chart_with_upwards_trend:'

receivers.slack.icon_url

URL to an image to use as the icon for this message.

Type: string
Required: false

Example
spec:
  receivers:
    slack:
      icon_url: http://lorempixel.com/48/48

receivers.slack.prependText

Text to prepend to every Slack message with severity danger.

Type: string
Required: false

Example
spec:
  receivers:
    slack:
      prependText: Oh noes!

receivers.slack.send_resolved

Whether or not to notify about resolved alerts.

Type: boolean
Required: false

Example
spec:
  receivers:
    slack:
      send_resolved: true

receivers.slack.username

Set your bot's user name.

Type: string
Required: false

Example
spec:
  receivers:
    slack:
      username: Alertmanager

receivers.sms

Alerts via SMS.

Type: object
Required: false

Example
spec:
  receivers:
    sms:
      recipients: "12345678"
      send_resolved: false

receivers.sms.recipients

Type: string
Required: true

Example
spec:
  receivers:
    sms:
      recipients: "12345678"

receivers.sms.send_resolved

Whether or not to notify about resolved alerts.

Type: boolean
Required: false

Example
spec:
  receivers:
    sms:
      send_resolved: false

route

Type: object
Required: false

Example
spec:
  route:
    group_by:
    - <label_name>
    groupInterval: 5m
    groupWait: 30s
    repeatInterval: 3h

route.groupInterval

How long to wait before sending a notification about new alerts that are added to a group of alerts for which an initial notification has already been sent. (Usually ~5m or more.)

Type: string
Required: false
Pattern: ([0-9]+(ms|[smhdwy]))?

Example
spec:
  route:
    groupInterval: 5m

route.groupWait

How long to wait initially before sending a notification for a group of alerts. This allows time for an inhibiting alert to arrive or for more initial alerts to be collected for the same group. (Usually ~0s to a few minutes.)

Type: string
Required: false
Pattern: ([0-9]+(ms|[smhdwy]))?

Example
spec:
  route:
    groupWait: 30s

route.group_by

The labels by which incoming alerts are grouped together. For example, multiple alerts coming in for cluster=A and alertname=LatencyHigh would be batched into a single group. To aggregate by all possible labels use '...' as the sole label name. This effectively disables aggregation entirely, passing through all alerts as-is. This is unlikely to be what you want, unless you have a very low alert volume or your upstream notification system performs its own grouping. Example: group_by: [...]

Type: array
Required: false

Example
spec:
  route:
    group_by:
    - <label_name>

route.repeatInterval

How long to wait before sending a notification again if it has already been sent successfully for an alert. (Usually ~3h or more.)

Type: string
Required: false
Pattern: ([0-9]+(ms|[smhdwy]))?

Example
spec:
  route:
    repeatInterval: 3h
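
How the three timing fields interact can be sketched with the example values used above. The comments describe the intended behaviour as documented for each field; exact timing in practice may vary.

```yaml
spec:
  route:
    group_by:
    - alertname
    groupWait: 30s        # first notification for a new group goes out about 30s after its first alert
    groupInterval: 5m     # alerts that join the group later are reported after waiting 5m
    repeatInterval: 3h    # a notification that was already sent is repeated after 3h while the alert persists
```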