Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an option to expose the metric timestamps from the Prometheus format #969

Open
draftcode opened this issue Oct 19, 2023 · 1 comment

Comments

@draftcode
Copy link
Contributor

draftcode commented Oct 19, 2023

Spin off from #967 and #847

Background

The Prometheus exposition format (https://github.com/prometheus/docs/blob/main/content/docs/instrumenting/exposition_formats.md#comments-help-text-and-type-information) allows exporters to specify an optional timestamp for each sample. If this is unset, the collector uses the timestamp that the sample is collected.

This timestamp is useful in a certain situation. My concrete use case is to expose GitHub API rate limit as a metric. This API rate limit is a value that decreases over time, and periodically it resets to the maximum value. We can get this API rate limit value as a HTTP response header when calling GitHub API.

Imagine a situation where there are two (physical) servers making this GitHub API call, and for the sake of illustration, let's assume that one server issues more requests and the other one does less.

  Server1 (frequently issue requests) Server2 (somehow less requests)
Observed API limit 100 4900
(Last time a server made a GH API call) 3 minutes ago 50 minutes ago

From human's point of view, because server1 made an API call more recently than server2, we can tell that server1's observed API limit is the current value. However, without exposing the sample timestamp, Prometheus cannot distinguish which one is the most recent value. This is because when Prometheus collects sample from these two servers, when there's no timestamp specified, it treats these samples as "the samples collected now".

By exposing the sample timestamp, Prometheus should be able to treat these two samples correctly, and it can tell that server1' s value is the most recent value.

Request

With #967, client_python should learn timestamps for Gauge metrics in the multiprocessing mode. This issue is asking for an option to expose those timestamps via Prometheus format.

Code pointers

Sample already can take a timestamp (

timestamp: Optional[Union[float, Timestamp]] = None
), and so as the Prometheus exposition formatter (
if line.timestamp is not None:
# Convert to milliseconds.
timestamp = f' {int(float(line.timestamp) * 1000):d}'
return f'{line.name}{labelstr} {floatToGoString(line.value)}{timestamp}\n'
).

For the multiprocessing mode, changing the metrics aggregation logic to propagate the timestamps (

metric.samples = [Sample(name_, dict(labels), value) for (name_, labels), value in samples.items()]
) would suffice (based on a flag or something).

For the non-multiprocessing mode, need to change this _child_samples (

def _child_samples(self) -> Iterable[Sample]:
return (Sample('', {}, self._value.get(), None, None),)
) to take a timestamp.

@baryluk
Copy link

baryluk commented Oct 21, 2023

I would vote no for this.

If you need sample timestamps, then implement Prometheus exposition yourself. It is very easy (I did this multiple times in Python with very little code).

Generic code and servers should never set timestamps, and as Prometheus documentation says this should not be supported in libraries by default.

The Gauge aggregation from multiple processes, is a different store. It is required there internally to have timestamp, to be able to find what was the most recent Set of the gauge. But this value is purely due to Python multi processing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants