Alertmanager configuration

= Documentation =
<blockquote>
{| class="wikitable"
|-
| official docs || https://prometheus.io/docs/alerting/latest/configuration/
|-
| sample config || https://github.com/prometheus/alertmanager#example
|-
|}
</blockquote><!-- Documentation -->
= Locations =
<blockquote>
{| class="wikitable"
|-
| ${PREFIX}/etc/alertmanager/alertmanager.yml || alertmanager config
|-
| ${PREFIX}/prometheus.yml || prometheus config
|-
|}
</blockquote><!-- Locations -->
= Prometheus =
<blockquote>
Prometheus is responsible for creating the rules that issue the alerts.


== Config ==
<blockquote>
Configure prometheus to load alert-rules, and to issue alerts to alertmanager.
<syntaxhighlight lang="yaml">
# /usr/local/etc/prometheus.yml

# issue alerts to alertmanager at localhost:9093
alerting:
  alertmanagers:
    - static_configs:
      - targets: ['localhost:9093']

# load alert-rules defined in 'memory_rules.yml'
rule_files:
  - "memory_rules.yml"
</syntaxhighlight>
</blockquote><!-- Config -->
== Alert Rules ==
<blockquote>
Configure an alert-rule (see [[prometheus rules]] for more details).
<syntaxhighlight lang="yaml">
# /usr/local/etc/memory_rules.yml
groups:
  - name: available_memory
    rules:
    - alert: sustained_high_memory_usage
      expr: (node_memory_free_bytes / node_memory_size_bytes) < 0.1  # less than 10% free == at least 90% used
      for: 1h
      labels:
        severity: warning
      annotations:
        summary: "host {{ $labels.instance }} has sustained high memory usage"
        description: "{{ $labels.instance }} has had >=90% memory usage for at least an hour!"
</syntaxhighlight>
</blockquote><!-- Alert Rules -->
</blockquote><!-- Prometheus -->


= AlertManager =
<blockquote>
Alertmanager is responsible for consuming fired-alerts, and deciding who/how/if to alert.

== Overview ==
<blockquote>
AlertManager can issue notifications using various methods.<br>
See [https://prometheus.io/docs/alerting/latest/configuration/#configuration-file docs] for all options (ex. email, http, pagerduty, slack, ...)


<syntaxhighlight lang="yaml">
# /usr/local/etc/alertmanager/alertmanager.yml
global:          # general settings
templates:       # configure template locations (templates for alert messages)
route:           # root-route, where alerts enter
inhibit_rules:   # rules to mute alerts, when other alerts are already firing
receivers:       # alerts are issued to receivers
</syntaxhighlight>
</blockquote><!-- Overview -->
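== Templates ==
<blockquote>
The <code>templates</code> key simply lists globs of message-template files. A minimal sketch (the path below is an assumption, not from this page):
<syntaxhighlight lang="yaml">
# /usr/local/etc/alertmanager/alertmanager.yml
templates:
  - '/usr/local/etc/alertmanager/templates/*.tmpl'  # hypothetical template dir
</syntaxhighlight>
</blockquote><!-- Templates -->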
== Routes ==
<blockquote>
Routes determine when/how notifications are sent, and who they are sent to.
<syntaxhighlight lang="yaml">
# /usr/local/etc/alertmanager/alertmanager.yml
route:
  receiver: team-X-mails              # default receiver for all routes
  group_by: ['cluster', 'alertname']  # alerts batched by labels. one alert fired per-batch at a time.
  repeat_interval: 3h                 # re-issue alert after this time-interval if not resolved

  # optionally, you can match on alert-labels
  # and alter the alert/receiver
  # (can be nested for gradually more specific rules)
  routes:
    - matchers:
        - service=~"ha|nginx|wsgi"    # if alert's label matches regex, issue to this receiver
      receiver: team-X-mails
      routes:                         # you can recursively nest routes, for more specific label matches
        - matchers:
            - severity="critical"
          receiver: team-X-pager
    - matchers:
      # ...
</syntaxhighlight>
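Besides <code>repeat_interval</code>, routes also accept two other timing options that control batching. A hedged sketch (the values below are illustrative, not from this page):
<syntaxhighlight lang="yaml">
route:
  group_by: ['cluster', 'alertname']
  group_wait: 30s       # wait this long before sending the first notification for a new batch
  group_interval: 5m    # wait this long before notifying about new alerts added to an existing batch
  repeat_interval: 3h
</syntaxhighlight>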
</blockquote><!-- Routes -->
== Receivers ==
<blockquote>
Receivers are used in <code>routes</code>, and represent a communication-method for issuing an alert.<br>
ex: email, slack, pagerduty, ...
<syntaxhighlight lang="yaml">
# /usr/local/etc/alertmanager/alertmanager.yml
receivers:
  # send email
  - name: 'team-X-mails'
    email_configs:
      - to: 'team-X+alerts@example.org'
  # send email AND page w/ pagerduty
  - name: 'team-X-pager'
    email_configs:
      - to: 'team-X+alerts-critical@example.org'
    pagerduty_configs:
      - service_key: 'abcdefg'
</syntaxhighlight>
</blockquote><!-- Receivers -->
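== Inhibit Rules ==
<blockquote>
<code>inhibit_rules</code> mute an alert while a related, more severe alert is already firing. A hedged sketch (labels and values are illustrative):
<syntaxhighlight lang="yaml">
# /usr/local/etc/alertmanager/alertmanager.yml
# mute 'warning' alerts when a 'critical' alert
# with the same cluster+alertname is already firing
inhibit_rules:
  - source_matchers:
      - severity="critical"
    target_matchers:
      - severity="warning"
    equal: ['cluster', 'alertname']
</syntaxhighlight>
</blockquote><!-- Inhibit Rules -->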
</blockquote><!-- AlertManager -->

Latest revision as of 23:18, 29 August 2022
