Prometheus rules
Revision as of 02:43, 18 February 2022
Rules can be used to create new metrics, or to trigger alerts.
Documentation
official docs https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/
Recording Rules (custom metrics)
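<syntaxhighlight lang="yaml">
# /usr/local/etc/memory_rules.yml
groups:
  - name: available_memory
    rules:
      - record: node_available_memory_percent                   # name the rule's result is exposed under
        expr: node_memory_free_bytes / node_memory_size_bytes   # PromQL query the rule evaluates (a 0-1 ratio, despite the name)
</syntaxhighlight>

<syntaxhighlight lang="yaml">
# /usr/local/etc/prometheus.yml
rule_files:
  - "memory_rules.yml"
</syntaxhighlight>

<syntaxhighlight lang="bash">
service prometheus restart
</syntaxhighlight>

After the restart, the node_available_memory_percent metric should be recorded and queryable like any other metric.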
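As a hedged sketch of one optional setting: a rule group accepts an interval field that controls how often its rules are evaluated. Without it, the group falls back to the global evaluation_interval from prometheus.yml. The 1m value below is an illustrative assumption, not something taken from this setup.

<syntaxhighlight lang="yaml">
# /usr/local/etc/memory_rules.yml
groups:
  - name: available_memory
    interval: 1m   # assumed value; omit to inherit the global evaluation_interval
    rules:
      - record: node_available_memory_percent
        expr: node_memory_free_bytes / node_memory_size_bytes
</syntaxhighlight>

A shorter interval gives the recorded series finer resolution at the cost of more evaluations, so it is usually kept in line with the scrape interval of the underlying metrics.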
Alert Rules
<syntaxhighlight lang="yaml">
# /usr/local/etc/memory_alerts.yml
groups:
  - name: available_memory
    rules:
      - alert: sustained_high_memory_usage
        # usage = 1 - free/size; fire once usage has stayed above 90% for the whole `for` window
        expr: (1 - node_memory_free_bytes / node_memory_size_bytes) > 0.9
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "host {{ $labels.instance }} has sustained high memory usage"
          description: "{{ $labels.instance }} has had >90% memory usage for at least an hour!"
</syntaxhighlight>
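Like recording rules, alert rules are only evaluated if their file is listed under rule_files, and firing alerts are only delivered somewhere if Prometheus knows about an Alertmanager. A minimal sketch, assuming both rule files sit next to prometheus.yml and that an Alertmanager is listening on localhost:9093 (its default port; the address is an assumption, not part of this setup):

<syntaxhighlight lang="yaml">
# /usr/local/etc/prometheus.yml
rule_files:
  - "memory_rules.yml"
  - "memory_alerts.yml"

alerting:
  alertmanagers:
    - static_configs:
        - targets: ["localhost:9093"]   # assumed Alertmanager address
</syntaxhighlight>

Without the alerting block, the alert still shows up as pending or firing in the Prometheus web UI, but no notification is sent.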