Push Gateway

Why not use it?

Prometheus recommends using Pushgateway only in limited cases. Blindly using Pushgateway instead of Prometheus’s standard pull model has several pitfalls:

Main problems:

Single Point of Failure: Monitoring multiple instances through a single Pushgateway makes it a single point of failure and potential bottleneck
Loss of automatic health monitoring: You lose automatic instance health monitoring through the up metric (generated at each scrape)
Metric lifecycle problem: Pushgateway never forgets series pushed to it and will expose them to Prometheus forever unless manually deleted via API

Particularly problematic:

When multiple job instances differentiate their metrics in Pushgateway through an instance label or similar, metrics for instances will remain in Pushgateway even if the original instance is renamed or deleted. This is because the lifecycle of Pushgateway as a metric cache is fundamentally separated from the lifecycle of processes that push metrics to it.

When to use

The only use case that is usually justified is capturing the outcome of a service-level batch job: a job that is not semantically tied to a specific machine or job instance (e.g., a batch job that deletes users for the entire service).

Alternatives to push gateway

Firewall/NAT problem:

If an incoming firewall or NAT prevents scraping metrics from targets, consider:

Moving the Prometheus server behind the network barrier
Running Prometheus servers in the same network as monitored instances
Using PushProx, which allows Prometheus to traverse firewall or NAT

For batch jobs related to a machine (e.g., automatic security updates, running configuration management clients):

Use Node Exporter’s textfile collector instead of Pushgateway
This ensures proper lifecycle for metrics associated with a specific machine
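
As a rough sketch of the textfile-collector approach, a machine-level batch job can write its result into a .prom file instead of pushing it. The helper name, the metric name, and the directory in the example call are assumptions for illustration; the directory must match whatever is passed to Node Exporter’s --collector.textfile.directory flag.

```sh
# Hypothetical helper: write a single metric into a textfile-collector directory.
# Usage: write_textfile_metric DIR METRIC_NAME VALUE
write_textfile_metric() {
  dir=$1
  name=$2
  value=$3
  tmp="$dir/$name.prom.$$"        # temp name does not end in .prom, so it is ignored
  printf '%s %s\n' "$name" "$value" > "$tmp" &&
    mv "$tmp" "$dir/$name.prom"   # rename in the same directory: no partial file is ever read
}

# Example call (the path is an assumption, not a default):
# write_textfile_metric /var/lib/node_exporter/textfile apt_last_run_timestamp_seconds "$(date +%s)"
```

Because the file lives on the machine itself, the metric disappears together with the machine, which is the lifecycle property Pushgateway cannot offer.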

How to use correctly

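Set honor_labels: true in the Prometheus scrape config for the Pushgateway. Otherwise Prometheus overwrites the job and instance labels of the pushed metrics with those of the Pushgateway scrape target itself (the pushed values are renamed to exported_job and exported_instance).
Pushing: send an HTTP PUT (or POST) to http://pushgateway.example.org:9091/metrics/job/some_job/instance/some_instance
- some_job - job name. This name will be overwritten during scraping unless honor_labels: true is enabled
- some_instance - instance name
Deletion (needed because a metric otherwise lives in the gateway forever): send an HTTP DELETE to:
- For the group of one instance: http://pushgateway.example.org:9091/metrics/job/some_job/instance/some_instance
- For the group identified by the job alone: http://pushgateway.example.org:9091/metrics/job/some_job (this does not delete groups that also carry an instance label)
Remember that the Pushgateway exposes all pushed metrics together on a single /metrics endpoint, so they must not conflict with each other: metrics with the same name must have the same type, even across groups, and there must be no duplicates.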
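
The push and delete calls described above can be wrapped in a few small functions. This is a minimal sketch, not an official client: the function names and the PGW variable are made up for illustration, and the host is the placeholder used throughout this page. Job and instance names are inserted into the URL as-is, so values containing / would need the base64 form covered in the Encoding section.

```sh
# Base URL of the gateway; the host is the placeholder used in this document.
PGW=${PGW:-http://pushgateway.example.org:9091}

# Print the URL of a metric group: group_url JOB [INSTANCE]
group_url() {
  if [ -n "${2:-}" ]; then
    printf '%s/metrics/job/%s/instance/%s\n' "$PGW" "$1" "$2"
  else
    printf '%s/metrics/job/%s\n' "$PGW" "$1"
  fi
}

# Replace the group with the metrics read from stdin (text format).
push_group() {
  curl --fail --silent --show-error -X PUT --data-binary @- "$(group_url "$@")"
}

# Delete the group; needed because the gateway never expires metrics itself.
delete_group() {
  curl --fail --silent --show-error -X DELETE "$(group_url "$@")"
}
```

A batch job could then run `echo 'some_metric 3.14' | push_group some_job some_instance` when it finishes, and `delete_group some_job some_instance` once the instance is retired.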

Timestamp

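The timestamp of a sample in Prometheus is the scrape time, not the time at which the metric was pushed to the Pushgateway.
For every group, the gateway adds push_time_seconds and push_failure_time_seconds: the Unix timestamps of the last successful and the last failed push to that group.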
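
Since the scrape timestamp says nothing about when a job last ran, push_time_seconds is the value to look at for staleness. In practice this check would more likely be an alert rule in Prometheus; the following is only a shell sketch that reads the gateway's text exposition from stdin. The function name and its arguments are assumptions for illustration.

```sh
# Usage: push_age_seconds JOB NOW_EPOCH < exposition_text
# Prints, for each group of JOB, the seconds since its last successful push.
push_age_seconds() {
  awk -v job="$1" -v now="$2" '
    index($0, "push_time_seconds{") == 1 && index($0, "job=\"" job "\"") > 0 {
      printf "%d\n", now - $NF    # the sample value is the last field on the line
    }'
}

# Example (not executed here):
# curl --silent http://pushgateway.example.org:9091/metrics | push_age_seconds some_job "$(date +%s)"
```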

Encoding

Problem 1:

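We want to set the labels job="directory_cleaner",path="/var/tmp". Putting /var/tmp directly into the URL won’t work:

/metrics/job/directory_cleaner/path//var/tmp

because every / in the value is interpreted as a path separator. The value has to be base64-encoded (URL-safe alphabet, padding optional), which is signalled by the @base64 suffix on the label name: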

/metrics/job/directory_cleaner/path@base64/L3Zhci90bXA

With curl:

echo 'some_metric{foo="bar"} 3.14' | curl --data-binary @- http://pushgateway.example.org:9091/metrics/job/directory_cleaner/path@base64/$(echo -n '/var/tmp' | base64url)

Problem 2

We want to set 2 labels:

job="example",first_label="",second_label="foobar"

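The naive URL would be:

/metrics/job/example/first_label//second_label/foobar

This won’t work either, but for a different reason: the empty segment between the two slashes disappears when the HTTP router sanitizes the path. An empty value must be base64-encoded as a single = padding character: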

/metrics/job/example/first_label@base64/=/second_label/foobar
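
Both cases above can be handled by one small helper. This is a sketch that assumes a standard base64 command-line tool is installed; the function name b64url is made up for illustration.

```sh
# Print the URL-safe base64 form of a label value, without padding.
# An empty value is encoded as a single "=", as required for empty labels.
b64url() {
  if [ -z "$1" ]; then
    printf '=\n'
    return
  fi
  # Drop padding and line wraps, then map the two URL-unsafe characters.
  printf '%s' "$1" | base64 | tr -d '=\n' | tr '/+' '_-'
  printf '\n'
}

# Example (not executed here):
# echo 'some_metric{foo="bar"} 3.14' | curl --data-binary @- \
#   "http://pushgateway.example.org:9091/metrics/job/directory_cleaner/path@base64/$(b64url /var/tmp)"
```

Using printf instead of echo -n avoids differences between shells in how echo treats its options.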

Problem 3

We want to set the labels:

job="titan",name="Προμηθεύς"

The non-ASCII value can be sent either percent-encoded or base64-encoded:

/metrics/job/titan/name/%CE%A0%CF%81%CE%BF%CE%BC%CE%B7%CE%B8%CE%B5%CF%8D%CF%82

/metrics/job/titan/name@base64/zqDPgc6_zrzOt864zrXPjc-C

Problem 4 (UTF)

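The flag --push.enable-utf8-names is required
An encoded label name must be prefixed with U__
Characters that are not valid in legacy names (anything other than letters, digits, and underscores) are written as their Unicode code point in hex, surrounded by underscores, e.g. _1F60A_
Every existing _ is doubled, so _ becomes __
If the original label name itself already starts with U__, that prefix has to be encoded too, giving U___55_____ (U__, then _55_ for the letter U, then __ for each of the two underscores)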

Methods/API

PUT

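Pushes a group of metrics; all existing metrics in that group are replaced
The body is in either the protobuf or the text exposition format
Responses:
- 200 - success
- 400 - bad request or metric conflict. The reason is returned in the response body.
- 202 - returned only if the flag --push.disable-consistency-check is set.
  - Metrics are then not checked on push, but an inconsistent push will make later scrapes fail.
In rare cases the gateway may already hold inconsistent metrics. New pushes are then rejected as inconsistent even if the culprit was pushed earlier.
A successful push does not guarantee persistence: the metrics may not have been written to disk yet (a crash can lose them), or persistence may not be configured at all.
A PUT with an empty body deletes the entire metric group (the group is defined by the URL).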
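
A pushing script usually wants to react to these status codes. The following sketch maps them to the meanings listed above; the function name and the file name metrics.prom in the commented example are assumptions.

```sh
# Map the HTTP status of a PUT/POST to the meanings listed above.
explain_push_status() {
  case "$1" in
    200) echo "ok: metrics accepted" ;;
    202) echo "accepted without consistency check" ;;
    400) echo "rejected: bad request or metric conflict" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# Example (not executed here): capture only the status code of a push.
# status=$(curl --silent --output /dev/null --write-out '%{http_code}' \
#   -X PUT --data-binary @metrics.prom \
#   http://pushgateway.example.org:9091/metrics/job/some_job)
# explain_push_status "$status"
```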

POST

Works the same as PUT, but only metrics with the same name as the newly pushed ones are replaced (within the same group)
- So a POST with an empty body only updates push_time_seconds. All previously pushed metrics remain unchanged.
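
The difference between PUT and POST can be illustrated with a toy model. This is not how the Pushgateway is implemented; it only mimics the replace semantics on a plain file with one "name value" line per metric, and both function names are made up.

```sh
# Toy model only, not the Pushgateway's implementation:
# a group is a file with one "name value" line per metric.

# model_put GROUP_FILE: the metrics on stdin replace the whole group.
model_put() {
  cat > "$1"
}

# model_post GROUP_FILE: the metrics on stdin replace same-named metrics;
# all other metrics already in the (existing) group file are kept.
model_post() {
  cat > "$1.new"
  # First file: remember pushed names. Second file: keep lines with other names.
  awk 'FILENAME == ARGV[1] { pushed[$1] = 1; next } !($1 in pushed)' \
    "$1.new" "$1" > "$1.kept"
  cat "$1.kept" "$1.new" > "$1"
  rm -f "$1.new" "$1.kept"
}
```

Starting from a group holding `a 1` and `b 2`, a model_post of `b 5` leaves `a 1` and `b 5`, while a model_put of `b 5` leaves only `b 5`.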

DELETE

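Deletes all metrics in the group defined by the URL
The request body is empty
The response code on success is always 202.
Deletion doesn’t happen immediately; the request is only queued
- So there’s no guarantee that it will actually be executed (e.g., if the server crashes). The order of PUT/POST and DELETE requests is preserved, however.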

Admin API

Disabled by default
Enable through --web.enable-admin-api
URL: /api/<API_VERSION>/admin/<HANDLER>
E.g., to delete all metrics: curl -X PUT http://pushgateway.example.org:9091/api/v1/admin/wipe

Query API

URL: /api/<API_VERSION>/<HANDLER>
Handlers:
- status - information about the gateway (build information, command-line flags, start time)
- metrics - the pushed metric families

Management API

Methods
- GET /-/healthy - Returns code 200 when Pushgateway is healthy.
- GET /-/ready - Returns code 200 when Pushgateway is ready to handle traffic.
- PUT /-/quit - Triggers graceful shutdown of Pushgateway. Disabled by default; can be enabled with the flag --web.enable-lifecycle.
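
The readiness endpoint is handy in scripts that start a Pushgateway and push right afterwards. A minimal sketch, with made-up function names and the placeholder host used on this page: the retry loop takes the probe command as arguments, so it can be tried out without a running gateway.

```sh
# Usage: wait_until_ready TRIES PROBE_COMMAND [ARGS...]
# Runs the probe up to TRIES times, one second apart; succeeds on the first success.
wait_until_ready() {
  tries=$1
  shift
  while [ "$tries" -gt 0 ]; do
    "$@" && return 0
    tries=$((tries - 1))
    if [ "$tries" -gt 0 ]; then
      sleep 1
    fi
  done
  return 1
}

# Probe that succeeds only if /-/ready answers with a 2xx status.
probe_ready() {
  curl --fail --silent --output /dev/null http://pushgateway.example.org:9091/-/ready
}

# Example (not executed here):
# wait_until_ready 30 probe_ready || echo "Pushgateway did not become ready" >&2
```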
