Prometheus rules

From wikinotes

Revision as of 02:43, 18 February 2022

Rules can be used to create new metrics, or to trigger alerts.

Documentation

official docs https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/

Recording Rules (custom metrics)

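<syntaxhighlight lang="yaml">
# /usr/local/etc/memory_rules.yml

groups:
  - name: available_memory
    rules:
    - record: node_available_memory_percent                    # name the rule's result is exposed under
      expr:   node_memory_free_bytes / node_memory_size_bytes  # PromQL query the rule evaluates (a 0-1 ratio, despite the name)
</syntaxhighlight>

Register the rules file in the Prometheus config:

<syntaxhighlight lang="yaml">
# /usr/local/etc/prometheus.yml

rule_files:
  - "memory_rules.yml"
</syntaxhighlight>

Then restart Prometheus so it loads the rule:

<syntaxhighlight lang="bash">
service prometheus restart
</syntaxhighlight>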

After the restart, the node_available_memory_percent metric should be precomputed on every rule evaluation and queryable like any other series.
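
One way to check the recording rule without waiting on live data is a promtool unit test. The following is a minimal sketch, not part of the original notes: the test file name, the host1 instance label, and the sample values (25 bytes free out of 100) are all made up for illustration. It assumes the test file sits next to memory_rules.yml.

<syntaxhighlight lang="yaml">
# /usr/local/etc/memory_rules_test.yml  (hypothetical file name)

rule_files:
  - memory_rules.yml            # path is relative to this test file

evaluation_interval: 1m

tests:
  - interval: 1m
    # Synthetic input samples, one per interval (illustrative values only)
    input_series:
      - series: 'node_memory_free_bytes{instance="host1"}'
        values: '25 25 25'
      - series: 'node_memory_size_bytes{instance="host1"}'
        values: '100 100 100'
    # The recorded metric should be free / size = 0.25
    promql_expr_test:
      - expr: node_available_memory_percent
        eval_time: 2m
        exp_samples:
          - labels: 'node_available_memory_percent{instance="host1"}'
            value: 0.25
</syntaxhighlight>

It can be run with promtool test rules memory_rules_test.yml. The syntax of the rules file itself can be checked separately with promtool check rules memory_rules.yml.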

Alert Rules

<syntaxhighlight lang="yaml">
# /usr/local/etc/memory_alerts.yml

groups:
  - name: available_memory
    rules:
    - alert: sustained_high_memory_usage
      # used fraction = 1 - free/size; fires when more than 90% of memory is in use
      expr:  (1 - node_memory_free_bytes / node_memory_size_bytes) > 0.9
      for: 1h
      labels:
        severity: warning
      annotations:
        summary: "host {{ $labels.instance }} has sustained high memory usage"
        description: "{{ $labels.instance }} has had more than 90% memory usage for at least an hour!"

</syntaxhighlight>
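
Like the recording rules, the alert rules file only takes effect once it is listed under rule_files, and a firing alert is only delivered somewhere if Prometheus knows about an Alertmanager. The following is a sketch of what that might look like. The Alertmanager address localhost:9093 is an assumption (it is Alertmanager's default port, but these notes do not say where one is running).

<syntaxhighlight lang="yaml">
# /usr/local/etc/prometheus.yml

rule_files:
  - "memory_rules.yml"
  - "memory_alerts.yml"

alerting:
  alertmanagers:
    - static_configs:
        - targets: ["localhost:9093"]   # assumed Alertmanager address
</syntaxhighlight>

After a restart, the rule and its state (inactive, pending, or firing) should appear on the Alerts page of the Prometheus web UI.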