Prometheus promql
Latest revision as of 17:11, 30 November 2023
PromQL is Prometheus's query language.
Its syntax is inspired by Go.
You can query Prometheus from:
- the HTTP API
- the UI table/graph view
Documentation
- official docs: https://prometheus.io/docs/prometheus/latest/querying/basics/
- official examples: https://prometheus.io/docs/prometheus/latest/querying/examples/
- builtin functions: https://prometheus.io/docs/prometheus/latest/querying/functions/
- re2 (regex engine): https://github.com/google/re2/wiki/Syntax
Comments
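    # a comment
Datatypes
Strings
    # string (double or single quotes; backslash escapes such as \n are processed)
    "foo"
    'foo\nbar'
    # raw string (backticks; escapes are not processed)
    `foo`
Floats
    23
    -2.43
    3.4e-9
    0x8f
    -Inf
    NaN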
Metric Datatypes
    # scalar: (floats)
    23
    2.43
    # instant-vector: (still floats, but with the capacity to also carry metric labels)
    my_metric 1.2
    my_metric{hostname="foobar", user="baz"} 1.2
    # range-vectors: (arrays of instant-vectors -- a window of metrics)
    my_metric[10m]
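To see how the three types relate, here is a minimal sketch that reuses the placeholder metric my_metric from above (assumed to be a gauge). Functions ending in _over_time take a range vector and return an instant vector, arithmetic with a scalar is applied to every sample, and scalar() turns a single-element instant vector into a scalar.

```promql
# range vector in, instant vector out: average of each series over the last 10 minutes
avg_over_time(my_metric[10m])

# instant vector combined with a scalar: every sample is multiplied by 100
my_metric * 100

# single-element instant vector converted to a scalar
scalar(sum(my_metric))
```

Note that an expression graphed in the UI must return an instant vector or a scalar, so a bare range vector such as my_metric[10m] can only be viewed in the table view.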
Metric-Selectors
Basics
The most basic query you can use is your_metric_name, which selects every series of that metric.
These metric selectors can be composed and filtered.
    {__name__="your_metric_name"}            # select by metric name (with =~ you can match multiple metrics this way)
    sum by(device) ({device =~ ".+"})        # query all distinct values of a label (here, 'device' is the label)
    your_metric_name                         # query all
    your_metric_name[5m]                     # range selector: samples from the last 5 minutes
    your_metric_name{job="foo",group="bar"}  # filter by metric-labels
Operators
Label matchers support the following operators:
    =   # equal
    !=  # not-equal
    =~  # regex match
    !~  # not regex match
Clustering Metrics
    your_metric_name[5m]  # range vector: all samples from the last 5 minutes
Units
    ms  # milliseconds
    s   # seconds
    m   # minutes
    h   # hours
    d   # days
    w   # weeks
    y   # years
Query Time Ranges
    your_metric_name offset 5m     # value as it was 5 minutes before the evaluation time
    your_metric_name @ 1609746000  # query at exactly '2021-01-04T07:40:00+00:00'
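The matchers, units and modifiers above can be combined in a single selector. The sketch below borrows the metric name http_requests_total from the official examples; the label names job and status and their values are assumptions made for illustration.

```promql
# series whose job starts with "api" and whose status is not a 5xx code
# (regex matchers are fully anchored, so "api.*" must match the whole value)
http_requests_total{job=~"api.*", status!~"5.."}

# the same selector as a 5-minute range vector, evaluated 1 hour in the past
http_requests_total{job=~"api.*", status!~"5.."}[5m] offset 1h

# durations can be concatenated, longest unit first
http_requests_total[1h30m]
```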
Queries
Basics
A simple query is a metric-selector with an optional filter
    your_metric_name{job="foo"}
SubQueries
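You can combine functions and metrics: a subquery evaluates an expression over a range, at a given resolution.
From official examples:
    rate(http_requests_total[5m])[30m:1m]  # 5-minute rate over the past 30 minutes, at 1-minute resolution
Metric Operators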
You can do simple math using metrics.
There is some type-trickiness here, see the docs (https://prometheus.io/docs/prometheus/latest/querying/operators/) for details.
    metric_start - metric_end
math
    +  # add
    -  # subtract
    *  # multiplication
    /  # division
    %  # modulo
    ^  # exponent
Metric Matching
You can join metrics using labels.
    metric_1 and metric_2     # only elements of metric_1 with exactly matching label-sets in metric_2
    metric_1 or metric_2      # all elements of metric_1, and elements of metric_2 with non-matching labels
    metric_1 unless metric_2  # only elements of metric_1, where there are no matching label-sets in metric_2
You can also filter which labels you want to match for an operation.
    on(label, label, ...)        # when joining metrics, only look at these labels
    ignoring(label, label, ...)  # when joining metrics, look at all labels except these
    # you can use these for any operator
    metric_1 * on(my_label) metric_2  # multiply metric_1 and metric_2, where metric_1/2's 'my_label' value is the same.
Dividing two metrics can fail when the result would contain several series with the same label set. The error is:
    vector cannot contain metrics with the same labelset
Avoid it by aggregating both sides by the same label, like this:
    sum by (city) (rate (births{}[2h]))
    /
    sum by (city) (rate (deceased{}[2h]))
Examples
Return filesystem usage above 75%,
for servers 'my-server' and 'my-other-server' only.
    (1 - node_filesystem_avail_bytes / node_filesystem_size_bytes) > 0.75  # used fraction (1 - available/size) above 75%
    and on(instance)                                                       # inner-join where instance matches
    node_uname_info{nodename=~"my-server|my-other-server"}                 # metric, scoped only to my-server, my-other-server (used to filter instances)
You can also re-use this pattern of filtering with much larger queries.
    # where filesystem-usage is > 75%
    (1 - node_filesystem_avail_bytes / node_filesystem_size_bytes) > 0.75
    # join where both instance/mountpoint match
    and on(instance, mountpoint)
    (
      # where any of:
      #   mountpoint in (/var/log, /var/audit)  and nodename="server1"
      #   mountpoint in (/usr/ports, /)         and nodename="server2"
      #   mountpoint = /var/mail                and nodename="server3"
      #
      (node_filesystem_avail_bytes{mountpoint=~"/var/log|/var/audit"}
        and on(instance) node_uname_info{nodename="server1"})
      or (node_filesystem_avail_bytes{mountpoint=~"/usr/ports|/"}
        and on(instance) node_uname_info{nodename="server2"})
      or (node_filesystem_avail_bytes{mountpoint="/var/mail"}
        and on(instance) node_uname_info{nodename="server3"})
    )
Aggregates
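    sum(your_metric_name)                      # aggregate function
    sum without (duration) (your_metric_name)  # excludes 'duration' labels from sum
    sum by (job, duration) (your_metric_name)  # group sums by label 'job' and 'duration'
If you need the total increase of a counter over a time range, you can use
    sum(increase(some_metric[10m]))  # increase over the last 10 minutes, summed across series
The available aggregation operators are:
    # simple
    sum           # sum elements
    min           # smallest of elements
    max           # largest of elements
    avg           # average of elements
    count         # num of elements
    count_values  # num elements with same value
    # complex
    group         # all values in the result are 1
    stddev        # population standard deviation
    stdvar        # population standard variance
    bottomk       # smallest k elements by value
    topk          # largest k elements by value
    quantile      # φ-quantile (0 ≤ φ ≤ 1)
Functions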
There are several builtin functions.
    # num LBAs read is very high, looks like flat line.
    # measuring the rate of change shows spikes in usage.
    rate(node_smartctl_total_lbas_read_raw[10m])
    # build new label on metric from value
    label_replace(some_metric{}, "new_label", "$1", "old_label", "^(regex-capture)-on-old-value")
    # label-replace for range-vectors ($__range is a Grafana variable)
    label_replace(some_metric{}, "new_label", "$1", "old_label", "^(regex-capture)-on-old-value")[$__range:]  # note ':' at end for range
    # label-replace for empty vectors (ex. 'foo{} or on() vector(0)' to sub in a 0 value)
    label_replace(vector(0), "my_new_label", "my_value", "", "")
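The 'or on() vector(0)' fallback mentioned in the last comment can be sketched as follows. The metric name my_errors_total is hypothetical. The idea is that an aggregation over no matching series returns an empty result rather than 0, so a constant is substituted when the left-hand side is empty.

```promql
# summed error rate across all series, or 0 when no series match
# (my_errors_total is a placeholder metric name)
sum(rate(my_errors_total[5m])) or on() vector(0)

# the same fallback, with a label attached to the substituted 0 value
sum by (job) (rate(my_errors_total[5m]))
  or on() label_replace(vector(0), "job", "none", "", "")
```

In the second query the fallback series only appears when the left-hand side is completely empty, because on() matches on an empty label set.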