Histogram without label creates histogram_*.db even without data #1022

Open
JinLisek opened this issue Apr 9, 2024 · 3 comments

JinLisek commented Apr 9, 2024

I tried this with versions: 0.19.0 and 0.20.0, both seem to have this bug.

I have a script prom_example.py:

from prometheus_client import Histogram

Example = Histogram("example", "Example", ["lol"])

When I run it:

prometheus_multiproc_dir=/tmp/metrics python prom_example.py

I see nothing in /tmp/metrics (as expected).

But when I edit the script and remove the label from histogram:

from prometheus_client import Histogram

Example = Histogram("example", "Example")

A database is created in /tmp/metrics, named for example histogram_37373.db. I would expect no database to be created, since no data is pushed into the metric.

Edit: it seems the behaviour is the same for Counter and Gauge.
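The two runs described in the report above can be compared in one script. This is a minimal sketch, not part of the original report: the helper name `files_created` is hypothetical, each snippet runs in a fresh interpreter with a temporary directory as `PROMETHEUS_MULTIPROC_DIR` (the upper-case spelling of the variable), and the label is passed as a list.

```python
import os
import subprocess
import sys
import tempfile


def files_created(snippet):
    """Run `snippet` in a fresh interpreter in multiprocess mode and
    return the names of the files it leaves in the metrics directory."""
    with tempfile.TemporaryDirectory() as multiproc_dir:
        env = dict(os.environ, PROMETHEUS_MULTIPROC_DIR=multiproc_dir)
        subprocess.run([sys.executable, "-c", snippet], env=env, check=True)
        return sorted(os.listdir(multiproc_dir))


IMPORT = "from prometheus_client import Histogram\n"

# Histogram declared with one label name, never used.
with_label = files_created(IMPORT + 'Histogram("example", "Example", ["lol"])')

# Histogram declared without label names, never used.
without_label = files_created(IMPORT + 'Histogram("example", "Example")')

print("with label:   ", with_label)
print("without label:", without_label)
```

Only the second run should leave a histogram_*.db file behind, matching the behaviour described in the report.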

@csmarchbanks
Member

Thank you for opening this discussion!

I believe the current behavior is correct. Specifically, the file should contain a histogram with 0 values for all of the buckets, the sum, and the count. This lets a user initialize the histogram at service startup, before any requests are received, so that graphs show continuous 0 values instead of missing data. See https://prometheus.io/docs/practices/instrumentation/#avoid-missing-metrics for more information.

@JinLisek
Author

Hmm... It's still weird to me why the behaviour is different when there are labels vs no labels.

My main issue is with this:
A cron job runs every minute and (indirectly) imports the metric without using it. The disk is then flooded with empty databases...
After adding a label the problem disappears.

But I can imagine someone in the future, not knowing about this tricky behaviour, creating a new metric without a label, and then the problem comes back. It's difficult to prevent that from happening.

I don't see a good solution though. :(

@csmarchbanks
Member

When no labels are specified, the client already knows which time series to create without anyone needing to call .labels() on the metric, so it automatically initializes it to zero to avoid the missing-metrics issue.
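The difference can be observed without multiprocess mode by looking at what each metric exposes. The following is a rough sketch under assumptions: the metric names, the label name `kind`, and the helper `has_samples` are made up for illustration, and a private `CollectorRegistry` is used so the example does not touch the default registry.

```python
from prometheus_client import CollectorRegistry, Histogram, generate_latest

registry = CollectorRegistry()

# No label names: the metric is fully defined at construction time,
# so its buckets, sum, and count are initialized to zero right away.
plain = Histogram("example_plain", "No label names", registry=registry)

# One label name: no time series exists until .labels() names a value.
labeled = Histogram("example_labeled", "One label name", ["kind"], registry=registry)


def has_samples(exposition, metric_name):
    """Return True if the text exposition contains a _count sample for the metric."""
    prefix = metric_name + "_count"
    return any(line.startswith(prefix) for line in exposition.splitlines())


before = generate_latest(registry).decode()

# Calling .labels() creates the child series, initialized to zero.
labeled.labels(kind="import")

after = generate_latest(registry).decode()

print("plain before:  ", has_samples(before, "example_plain"))
print("labeled before:", has_samples(before, "example_labeled"))
print("labeled after: ", has_samples(after, "example_labeled"))
```

In multiprocess mode each initialized value is backed by a file, so the same initialization step is presumably what creates the database for unlabeled metrics.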

Does the cron job need to export metrics via multiprocess mode? It might make more sense to use something like the Pushgateway and keep the metrics in memory only.
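A minimal sketch of the Pushgateway approach, assuming the job runs with the multiprocess directory variable unset so that values stay in memory. The function name `run_job`, the metric name, the job name `example_cron_job`, and the gateway address in the usage note are all placeholders, not taken from this thread.

```python
from typing import Optional

from prometheus_client import CollectorRegistry, Gauge, push_to_gateway


def run_job(gateway: Optional[str] = None) -> CollectorRegistry:
    """Do the cron job's work, record metrics in memory, and optionally push them."""
    # A per-run registry keeps the pushed metrics separate from the default registry.
    registry = CollectorRegistry()
    last_success = Gauge(
        "job_last_success_unixtime",
        "Last time the cron job finished successfully",
        registry=registry,
    )

    # ... the actual work of the job would happen here ...

    last_success.set_to_current_time()

    if gateway is not None:
        # Sends the registry's current values to the Pushgateway over HTTP.
        push_to_gateway(gateway, job="example_cron_job", registry=registry)
    return registry


# Without a gateway address nothing leaves the process.
registry = run_job()
```

With a Pushgateway running, the call might look like `run_job("localhost:9091")`, where the address is whatever the gateway actually listens on.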
