NAIS Alert reference¶
This document describes all possible configuration values in the Alert spec, commonly known as the alert.yaml file.
alerts¶
Type: array
Required: false
Example
spec:
  alerts:
    - action: "`kubectl describe pod {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}` for events, and `kubectl logs {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}` for logs"
      alert: application down
      description: App {{ $labels.app }} is down in namespace {{ $labels.kubernetes_namespace }}
      documentation: https://doc.nais.io/observability/alerts/
      expr: kube_deployment_status_replicas_available{deployment="<appname>"} == 0
      for: 2m
      severity: danger
      sla: Between 8 and 16
alerts[].action¶
What human actions are needed to resolve or investigate this alert.
Type: string
Required: true
Example
spec:
  alerts:
    - action: "`kubectl describe pod {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}` for events, and `kubectl logs {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}` for logs"
      alert: application down
      description: App {{ $labels.app }} is down in namespace {{ $labels.kubernetes_namespace }}
      documentation: https://doc.nais.io/observability/alerts/
      expr: kube_deployment_status_replicas_available{deployment="<appname>"} == 0
      for: 2m
      severity: danger
      sla: Between 8 and 16
alerts[].alert¶
The name of the alert.
Type: string
Required: true
Example
spec:
  alerts:
    - action: "`kubectl describe pod {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}` for events, and `kubectl logs {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}` for logs"
      alert: application down
      description: App {{ $labels.app }} is down in namespace {{ $labels.kubernetes_namespace }}
      documentation: https://doc.nais.io/observability/alerts/
      expr: kube_deployment_status_replicas_available{deployment="<appname>"} == 0
      for: 2m
      severity: danger
      sla: Between 8 and 16
alerts[].description¶
Simple description of the triggered alert.
Type: string
Required: false
Example
spec:
  alerts:
    - action: "`kubectl describe pod {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}` for events, and `kubectl logs {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}` for logs"
      alert: application down
      description: App {{ $labels.app }} is down in namespace {{ $labels.kubernetes_namespace }}
      documentation: https://doc.nais.io/observability/alerts/
      expr: kube_deployment_status_replicas_available{deployment="<appname>"} == 0
      for: 2m
      severity: danger
      sla: Between 8 and 16
alerts[].documentation¶
URL for documentation for this alert.
Type: string
Required: false
Example
spec:
  alerts:
    - action: "`kubectl describe pod {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}` for events, and `kubectl logs {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}` for logs"
      alert: application down
      description: App {{ $labels.app }} is down in namespace {{ $labels.kubernetes_namespace }}
      documentation: https://doc.nais.io/observability/alerts/
      expr: kube_deployment_status_replicas_available{deployment="<appname>"} == 0
      for: 2m
      severity: danger
      sla: Between 8 and 16
alerts[].expr¶
Prometheus expression that triggers an alert. Explore expressions in the Prometheus interface.
Type: string
Required: true
Example
spec:
  alerts:
    - action: "`kubectl describe pod {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}` for events, and `kubectl logs {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}` for logs"
      alert: application down
      description: App {{ $labels.app }} is down in namespace {{ $labels.kubernetes_namespace }}
      documentation: https://doc.nais.io/observability/alerts/
      expr: kube_deployment_status_replicas_available{deployment="<appname>"} == 0
      for: 2m
      severity: danger
      sla: Between 8 and 16
alerts[].for¶
Duration before the alert should trigger.
Type: string
Required: true
Pattern: ^\d+[smhdwy]$
Example
spec:
  alerts:
    - action: "`kubectl describe pod {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}` for events, and `kubectl logs {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}` for logs"
      alert: application down
      description: App {{ $labels.app }} is down in namespace {{ $labels.kubernetes_namespace }}
      documentation: https://doc.nais.io/observability/alerts/
      expr: kube_deployment_status_replicas_available{deployment="<appname>"} == 0
      for: 2m
      severity: danger
      sla: Between 8 and 16
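The pattern for this field accepts exactly one integer followed by one unit letter, so compound durations do not match. The sketch below shows a value that fits the pattern; the alert name, expression and action are hypothetical and only there to make the fragment complete.

```yaml
spec:
  alerts:
    - alert: example-alert
      # Hypothetical expression and action, for illustration only.
      expr: up{app="my-app"} == 0
      action: Check the application logs
      # Matches ^\d+[smhdwy]$: one integer followed by one unit.
      for: 90m
      # Values such as 1h30m or 1.5h do not match the pattern.
```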
alerts[].severity¶
Alert level for Slack messages.
Type: string
Required: false
Default value: danger
Pattern: ^$|good|warning|danger|#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})
Example
spec:
  alerts:
    - action: "`kubectl describe pod {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}` for events, and `kubectl logs {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}` for logs"
      alert: application down
      description: App {{ $labels.app }} is down in namespace {{ $labels.kubernetes_namespace }}
      documentation: https://doc.nais.io/observability/alerts/
      expr: kube_deployment_status_replicas_available{deployment="<appname>"} == 0
      for: 2m
      severity: danger
      sla: Between 8 and 16
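Besides the named levels, the pattern also allows a three- or six-digit hex colour. A minimal sketch of that variant follows; the alert name, expression, action and colour are hypothetical. The value is quoted because an unquoted # after the colon would start a YAML comment and leave the field empty.

```yaml
spec:
  alerts:
    - alert: example-alert
      # Hypothetical expression and action, for illustration only.
      expr: up{app="my-app"} == 0
      action: Check the application logs
      for: 5m
      # Quoted so that YAML does not treat the value as a comment.
      severity: "#ff9900"
```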
alerts[].sla¶
Time before a human should resolve the alert.
Type: string
Required: false
Example
spec:
  alerts:
    - action: "`kubectl describe pod {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}` for events, and `kubectl logs {{ $labels.kubernetes_pod_name }} -n {{ $labels.kubernetes_namespace }}` for logs"
      alert: application down
      description: App {{ $labels.app }} is down in namespace {{ $labels.kubernetes_namespace }}
      documentation: https://doc.nais.io/observability/alerts/
      expr: kube_deployment_status_replicas_available{deployment="<appname>"} == 0
      for: 2m
      severity: danger
      sla: Between 8 and 16
inhibitRules¶
A list of inhibit rules. An inhibition rule mutes an alert (target) matching a set of matchers when an alert (source) exists that matches another set of matchers. Both target and source alerts must have the same label values for the label names in the labels list.
Type: array
Required: false
Example
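As a rough sketch of how the fields described in this section fit together, the fragment below mutes a target alert while a source alert is firing for the same app. The alert names, the use of alertname and app as label names, and the regex are assumptions made for illustration, not values taken from this reference.

```yaml
spec:
  inhibitRules:
    # Exact matching: while "application down" fires,
    # mute "high latency" for the same app.
    - labels:
        - app
      sources:
        alertname: application down
      targets:
        alertname: high latency
    # Regex matching: any source alert whose name starts with "database"
    # mutes target alerts with severity warning in the same namespace.
    - labels:
        - kubernetes_namespace
      sourcesRegex:
        alertname: database.*
      targets:
        severity: warning
```

Each rule only takes effect when the source and target alerts carry identical values for every label listed under labels.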
inhibitRules[].labels¶
Labels that must have an equal value in the source and target alert for the inhibition to take effect.
Type: array
Required: false
inhibitRules[].sources¶
Matchers for which one or more alerts have to exist for the inhibition to take effect. These are key/value pairs.
Type: object
Required: false
inhibitRules[].sourcesRegex¶
Regex matchers for which one or more alerts have to exist for the inhibition to take effect. These are key/value pairs, where the value can be a regex.
Type: object
Required: false
inhibitRules[].targets¶
Matchers that have to be fulfilled in the alerts to be muted. These are key/value pairs.
Type: object
Required: false
inhibitRules[].targetsRegex¶
Regex matchers that have to be fulfilled in the alerts to be muted. These are key/value pairs, where the value can be a regex.
Type: object
Required: false
receivers¶
A list of notification receivers. You can use one or more of: e-mail, Slack, SMS, and webhook. At least one receiver is required.
Type: object
Required: false
Example
spec:
  receivers:
    email:
      to: myteam@nav.no
    slack:
      channel: '#alert-channel'
      icon_emoji: ':chart_with_upwards_trend:'
      icon_url: http://lorempixel.com/48/48
      prependText: Oh noes!
      send_resolved: true
      username: Alertmanager
    sms:
      recipients: "12345678"
      send_resolved: false
    webhook:
      http_config:
        proxy_url: webproxy.nav
        tls_config:
          insecure_skip_verify: true
      max_alerts: 0
      send_resolved: true
      url: https://the.feature.now
receivers.email¶
Alerts via e-mail.
Type: object
Required: false
receivers.email.send_resolved¶
Whether or not to notify about resolved alerts.
Type: boolean
Required: false
Default value: false
receivers.email.to¶
Type: string
Required: true
receivers.slack¶
Slack notifications are sent via Slack webhooks.
Type: object
Required: false
Example
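A minimal Slack receiver might look like the sketch below. The channel name and prepend text are hypothetical. Only channel is required; the other fields are optional.

```yaml
spec:
  receivers:
    slack:
      # Quoted because an unquoted leading # would start a YAML comment.
      channel: '#my-team-alerts'
      # Only prepended to messages with severity danger.
      prependText: 'Needs attention: '
      send_resolved: true
```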
receivers.slack.channel¶
The channel or user to send notifications to. Can be specified with or without #.
Type: string
Required: true
receivers.slack.icon_emoji¶
Emoji to use as the icon for this message.
Type: string
Required: false
receivers.slack.icon_url¶
URL to an image to use as the icon for this message.
Type: string
Required: false
receivers.slack.prependText¶
Text to prepend to every Slack message with severity danger.
Type: string
Required: false
receivers.slack.send_resolved¶
Whether or not to notify about resolved alerts.
Type: boolean
Required: false
Default value: true
receivers.slack.username¶
Set your bot's user name.
Type: string
Required: false
receivers.sms¶
Alerts via SMS
Type: object
Required: false
receivers.sms.recipients¶
Type: string
Required: true
receivers.sms.send_resolved¶
Whether or not to notify about resolved alerts.
Type: boolean
Required: false
Default value: true
receivers.webhook¶
Alerts via a custom web application.
Type: object
Required: false
Example
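The sketch below shows a webhook receiver with only its direct fields set. The URL is a placeholder, and the limit of 10 alerts is an arbitrary illustrative value.

```yaml
spec:
  receivers:
    webhook:
      # Placeholder endpoint; alerts are delivered as HTTP POST requests.
      url: https://alert-handler.example.com/hooks/alerts
      # Include at most 10 alerts per message; 0 would include all of them.
      max_alerts: 10
      send_resolved: false
```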
receivers.webhook.http_config¶
An http_config configures the HTTP client that the receiver uses to communicate with HTTP-based API services.
Type: object
Required: false
Example
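As an illustration of the nesting, the fragment below routes webhook traffic through a proxy while leaving certificate validation enabled. The proxy address and endpoint URL are placeholders, not real services.

```yaml
spec:
  receivers:
    webhook:
      url: https://alert-handler.example.com/hooks/alerts
      max_alerts: 0
      http_config:
        # Placeholder proxy address.
        proxy_url: http://proxy.example.com:8088
        tls_config:
          # false keeps server certificate validation on (the default).
          insecure_skip_verify: false
```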
receivers.webhook.http_config.proxy_url¶
Optional proxy URL.
Type: string
Required: false
receivers.webhook.http_config.tls_config¶
Configures the TLS settings.
Type: object
Required: false
receivers.webhook.http_config.tls_config.insecure_skip_verify¶
Disable validation of the server certificate.
Type: boolean
Required: false
Default value: false
receivers.webhook.max_alerts¶
The maximum number of alerts to include in a single webhook message. Alerts above this threshold are truncated. When leaving this at its default value of 0, all alerts are included.
Type: integer
Required: true
Default value: 0
receivers.webhook.send_resolved¶
Whether or not to notify about resolved alerts.
Type: boolean
Required: false
Default value: true
receivers.webhook.url¶
The endpoint to send HTTP POST requests to.
Type: string
Required: true
route¶
Type: object
Required: false
route.groupInterval¶
How long to wait before sending a notification about new alerts that are added to a group of alerts for which an initial notification has already been sent.
Type: string
Required: false
Default value: 5m
Pattern: ((([0-9]+)y)?(([0-9]+)w)?(([0-9]+)d)?(([0-9]+)h)?(([0-9]+)m)?(([0-9]+)s)?(([0-9]+)ms)?|0)
route.groupWait¶
How long to wait before sending the first notification for a group of alerts. This gives an inhibiting alert time to arrive, or allows more initial alerts to be collected for the same group.
Type: string
Required: false
Default value: 10s
Pattern: ((([0-9]+)y)?(([0-9]+)w)?(([0-9]+)d)?(([0-9]+)h)?(([0-9]+)m)?(([0-9]+)s)?(([0-9]+)ms)?|0)
route.group_by¶
The labels by which incoming alerts are grouped together.
Type: array
Required: false
route.repeatInterval¶
How long to wait before sending a notification again if it has already been sent successfully for an alert.
Type: string
Required: false
Default value: 1h
Pattern: ((([0-9]+)y)?(([0-9]+)w)?(([0-9]+)d)?(([0-9]+)h)?(([0-9]+)m)?(([0-9]+)s)?(([0-9]+)ms)?|0)
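The route fields can be combined as in the sketch below. The durations are arbitrary illustrative values that fit the pattern, and grouping by alertname is an assumption about which label suits your alerts. Note that group_by is written in snake_case while the interval fields use camelCase, as listed in this reference.

```yaml
spec:
  route:
    # Alerts sharing the same value for these labels are batched together.
    group_by:
      - alertname
    # Wait 30 seconds before the first notification for a new group.
    groupWait: 30s
    # Wait 10 minutes before notifying about alerts added to that group.
    groupInterval: 10m
    # Re-send a notification that is still firing every 4 hours.
    repeatInterval: 4h
```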
Created: 2021-06-30