Prometheus rules
Rules can be used to create new metrics, or to trigger alerts.
Documentation
official docs https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/
Syntax
Recording Rules (custom metrics)
    # /usr/local/etc/memory_rules.yml
    groups:
      - name: available_memory
        rules:
          - record: node_available_memory_percent                    # name the rule's result is exposed as
            expr: node_memory_free_bytes / node_memory_size_bytes    # PromQL query the rule evaluates

    # /usr/local/etc/prometheus.yml
    rule_files:
      - "memory_rules.yml"    # path relative to prometheus.yml

    service prometheus restart

Now the node_available_memory_percent metric should be precomputed, stored as its own time series, and queryable.

Alert Rules
    # /usr/local/etc/memory_alerts.yml
    groups:
      - name: available_memory
        rules:
          - alert: sustained_high_memory_usage
            # free/size is the fraction of memory still free,
            # so high usage means this ratio is low
            expr: (node_memory_free_bytes / node_memory_size_bytes) < 0.1
            for: 1h
            labels:
              severity: warning
            annotations:
              summary: "host {{ $labels.instance }} has sustained high memory usage"
              description: "{{ $labels.instance }} has had over 90% memory usage for at least an hour!"
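
If the recording rule from the previous section is loaded, the alert can query the recorded metric instead of repeating the division. This is a minimal sketch, not the only way to wire it up. It assumes node_available_memory_percent is already being recorded. Despite its name, that metric holds a ratio between 0 and 1 (free / size), so low free memory shows up as a small value.

```yaml
# Sketch only: assumes the node_available_memory_percent recording rule
# from the Recording Rules section is loaded by the same Prometheus server.
groups:
  - name: available_memory
    rules:
      - alert: sustained_high_memory_usage
        # The recorded value is the fraction of memory that is free (0..1),
        # so "over 90% used" corresponds to "under 0.1 free".
        expr: node_available_memory_percent < 0.1
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "host {{ $labels.instance }} has sustained high memory usage"
```

Reusing the recorded metric keeps the threshold logic in one place and avoids evaluating the same expression twice.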
Best Practices
- rule-files can include multiple rules with increasing severity
- rule-files can use templating in labels and annotations (e.g. {{ $labels.instance }}), so a single rule covers every time series its expression matches.
- rule-files should pertain to a single type of test
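
The practices above could be combined roughly as follows: one file devoted to memory checks, holding two alerts of increasing severity, each with templated annotations. The alert names, thresholds, and `for` durations here are illustrative assumptions, not values from these notes.

```yaml
# /usr/local/etc/memory_alerts.yml
# Sketch only: alert names, thresholds and durations are hypothetical.
groups:
  - name: available_memory
    rules:
      # First tier: less than 10% of memory free for an hour.
      - alert: memory_usage_warning
        expr: (node_memory_free_bytes / node_memory_size_bytes) < 0.10
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "host {{ $labels.instance }} is low on memory"
          # $value is the free-memory ratio produced by the expression.
          description: "{{ $labels.instance }} has {{ $value | humanizePercentage }} of its memory free"

      # Second tier: less than 5% free, with a shorter wait before firing.
      - alert: memory_usage_critical
        expr: (node_memory_free_bytes / node_memory_size_bytes) < 0.05
        for: 15m
        labels:
          severity: critical
        annotations:
          summary: "host {{ $labels.instance }} is almost out of memory"
          description: "{{ $labels.instance }} has {{ $value | humanizePercentage }} of its memory free"
```

Because the expressions return one series per instance, each host gets its own alert, and the severity label lets Alertmanager route the two tiers differently.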