Alertmanager configuration
Latest revision as of 23:18, 29 August 2022
= Documentation =
{| class="wikitable"
|-
| official docs || https://prometheus.io/docs/alerting/latest/configuration/
|-
| sample config || https://github.com/prometheus/alertmanager#example
|-
|}
= Locations =
{| class="wikitable"
|-
| ${PREFIX}/etc/alertmanager/alertmanager.yml || alertmanager config
|-
| ${PREFIX}/etc/prometheus.yml || prometheus config
|-
|}
= Prometheus =
Prometheus evaluates the alert rules and sends the resulting alerts to Alertmanager.
== Config ==
Configure Prometheus to load alert rules and to send alerts to Alertmanager.
<syntaxhighlight lang="yaml">
# /usr/local/etc/prometheus.yml

# issue alerts to alertmanager at localhost:9093
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093']

# load alert-rules defined in 'memory_rules.yml'
rule_files:
  - "memory_rules.yml"
</syntaxhighlight>

== Alert Rules ==
Configure an alert rule (see [[prometheus rules]] for more details).
<syntaxhighlight lang="yaml">
# /usr/local/etc/memory_rules.yml
groups:
  - name: available_memory
    rules:
      - alert: sustained_high_memory_usage
        # used fraction = 1 - (free / total); fires when more than 90% is in use
        expr: (1 - node_memory_free_bytes / node_memory_size_bytes) > 0.9
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "host {{ $labels.instance }} has sustained high memory usage"
          description: "{{ $labels.instance }} has had >90% memory usage for at least an hour!"
</syntaxhighlight>
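The nested route in the Routes section matches on <code>severity="critical"</code>, while the rule above only sets <code>severity: warning</code>. A minimal sketch of a companion rule that produces the critical label could look like this. The group name, alert name, 95% threshold, and 15m window are illustrative assumptions, not values from the original setup.
<syntaxhighlight lang="yaml">
# /usr/local/etc/memory_rules.yml (hypothetical additional group)
groups:
  - name: available_memory_critical     # assumed group name
    rules:
      - alert: critical_memory_usage    # assumed alert name
        # same used-fraction calculation as the warning rule, with an assumed stricter threshold
        expr: (1 - node_memory_free_bytes / node_memory_size_bytes) > 0.95
        for: 15m                        # assumed shorter window for the critical case
        labels:
          severity: critical
        annotations:
          summary: "host {{ $labels.instance }} is almost out of memory"
</syntaxhighlight>
A rule file like this can be validated before reloading Prometheus with <code>promtool check rules memory_rules.yml</code>.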
= Alertmanager =
Alertmanager consumes fired alerts and decides whether, how, and to whom notifications are sent.
== Overview ==
Alertmanager can issue notifications using various methods.<br>
See [https://prometheus.io/docs/alerting/latest/configuration/#configuration-file docs] for all options (ex. email, http, pagerduty, slack, ...)
<syntaxhighlight lang="yaml">
# /usr/local/etc/alertmanager/alertmanager.yml
global:         # general settings
route:          # root-route, where alerts enter
receivers:      # alerts are issued to receivers
templates:      # configure template locations (templates for alert messages)
inhibit_rules:  # rules to mute alerts, when other alerts are already firing
</syntaxhighlight>

== Routes ==
Routes determine when and how notifications are sent, and who receives them.
<syntaxhighlight lang="yaml">
# /usr/local/etc/alertmanager/alertmanager.yml
route:
  receiver: team-X-mails              # default receiver, used when no child route matches
  group_by: ['cluster', 'alertname']  # alerts sharing these labels are batched into one notification
  repeat_interval: 3h                 # re-send the notification after this interval if not resolved

  # optionally, you can match on alert-labels
  # and alter the alert/receiver
  # (can be nested for gradually more specific rules)
  routes:
    - matchers:
        - service=~"ha|nginx|wsgi"    # if alert's label matches regex, issue to this receiver
      receiver: team-X-mails
      routes:                         # you can recursively nest routes, for more specific label matches
        - matchers:
            - severity="critical"
          receiver: team-X-pager
    - matchers:
        # ...
</syntaxhighlight>

== Receivers ==
Receivers are used in <code>routes</code>, and represent a communication-method for issuing an alert.<br>
ex: email, slack, pagerduty, ...
<syntaxhighlight lang="yaml">
# /usr/local/etc/alertmanager/alertmanager.yml
receivers:
  # send email
  - name: 'team-X-mails'
    email_configs:
      - to: 'team-X+alerts@example.org'

  # send email AND page w/ pagerduty
  - name: 'team-X-pager'
    email_configs:
      - to: 'team-X+alerts-critical@example.org'
    pagerduty_configs:
      - service_key: 'abcdefg'
</syntaxhighlight>
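The <code>email_configs</code> entries above only name a recipient; Alertmanager also needs to know which mail server to send through, which is usually set once under <code>global</code>. A minimal sketch follows. The SMTP host, sender address, and credentials are placeholders and would need to be replaced with real values.
<syntaxhighlight lang="yaml">
# /usr/local/etc/alertmanager/alertmanager.yml
global:
  smtp_smarthost: 'smtp.example.org:587'    # placeholder host:port of the outgoing mail server
  smtp_from: 'alertmanager@example.org'     # placeholder sender address
  smtp_auth_username: 'alertmanager'        # placeholder credentials
  smtp_auth_password: 'changeme'
  smtp_require_tls: true                    # require STARTTLS when talking to the server
</syntaxhighlight>
These settings can also be overridden per receiver inside an individual <code>email_configs</code> entry. The complete file can be checked with <code>amtool check-config alertmanager.yml</code> before restarting Alertmanager.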