-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merge sparsehistogram branch into main #1150
Conversation
Printf the structure of it instead of actually encoding it. Signed-off-by: beorn7 <beorn@grafana.com>
Signed-off-by: beorn7 <beorn@grafana.com>
Signed-off-by: beorn7 <beorn@grafana.com>
Signed-off-by: beorn7 <beorn@grafana.com>
Signed-off-by: beorn7 <beorn@grafana.com>
(now using sparsehistogram branch) Signed-off-by: beorn7 <beorn@grafana.com>
Signed-off-by: beorn7 <beorn@grafana.com>
Signed-off-by: beorn7 <beorn@grafana.com>
This seem what OTel is converging towards, see open-telemetry/oteps#149 . I see pros and cons with base-10 vs base-2. They are discussed in detail in that OTel PR, and the gist of the discussion is pretty much in line with my design doc. Since the balance is easy to tip here, I think we should go with base-2 if OTel picks base-2. This also seems to be in agreement with several proprietary solution (see again the discussion on that OTel PR.) The idea to make the number of buckets per power of 2 (or formerly 10) a power of 2 itself was also sketched out in the design doc already. It guarantees mergeability of different resolutions. I was undecided between making it a recommendation or mandatory. Now I think it should be mandatory as it has the additional benefit of playing well with OTel's plans. This commit also addresses a number of outstanding TODOs. Signed-off-by: beorn7 <beorn@grafana.com>
Signed-off-by: beorn7 <beorn@grafana.com>
Signed-off-by: beorn7 <beorn@grafana.com>
Switch sparse histograms to base-2 buckets
Signed-off-by: beorn7 <beorn@grafana.com>
Signed-off-by: beorn7 <beorn@grafana.com>
Signed-off-by: beorn7 <beorn@grafana.com>
Signed-off-by: beorn7 <beorn@grafana.com>
Implement strategy to limit the sparse bucket count
Signed-off-by: beorn7 <beorn@grafana.com>
Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>
* Histogram: Expose bug with negative schema Signed-off-by: beorn7 <beorn@grafana.com> * Histogram: Fix bug with negative schemas Signed-off-by: beorn7 <beorn@grafana.com>
Note that this is an incompatible change. To scrape this new format, the Prometheus server needs to be updated at the same time. PR incoming. Signed-off-by: beorn7 <beorn@grafana.com>
histograms: Move to new exposition protobuf format
NaN observations now go to no bucket, but increment count (and effectively set sum to NaN, too). ±Inf observations now go to the bucket following the bucket that would have received math.MaxFloat64. The former is now the last bucket that can be created. The getLe is modified to return math.MaxFloat64 for the penultimate possible bucket. Also add a test for getLe. Signed-off-by: beorn7 <beorn@grafana.com>
sparse buckets: Fix handling of +Inf/-Inf/NaN observations
Native histograms are now in a tagged version (v0.3.0). Signed-off-by: beorn7 <beorn@grafana.com>
Signed-off-by: beorn7 <beorn@grafana.com>
Signed-off-by: beorn7 <beorn@grafana.com>
@codesome pinged me about this during KubeCon. I would love to look it through and get it merged this week or so. I can agree with |
I am also in favor of the term "native histogram". The problem is that the term "native bucket" doesn't make sense. |
For same names, "Native Histograms" without "Buckets" makes sense. I'll add another commit, so you can see the effect. |
40a70dc
to
e91b762
Compare
Added a new commit. Please check it out and let me know how you like it. |
e91b762
to
1bad934
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amazing work! I carefully reviewed public APIs and a bit less detailed review for implementation - I think it all makes sense. LGTM, happy with this, modulo naming confusion -> sparse vs native vs exponential histograms. Wonder what we can do here.
@RichiH also mentioned this problem in OM talk - and ppl from community were confused as well. Let's try to pick one name ;p
Ok - so "sparse buckets" for "native histograms" is something I can see as a good naming 👍🏽 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After you explanation I think it makes sense - happy with naming 👍🏽
Just commented code then.
Addressed all comments. Will now solve the merge conflict and then merge. 🎉 |
This intends to avoid confusing users by the subtle difference between a native histogram and a sparse bucket. Signed-off-by: beorn7 <beorn@grafana.com>
1bad934
to
fffb76c
Compare
We plan to soon merge prometheus/prometheus#11447, which will add Native Histogram support to the Prometheus server (behind a feature flag).
To use Native Histograms, instrumentation libraries need to support them. While client_golang doesn't have a notion of feature flags, I think it is the most convenient way to merge the experimental features into main anyway, but clearly mark the respective options as experimental. Note that the changes are fairly transparent to the user. Nothing changes as long as they do not touch any
SparseBuckets...
options inHistogramOpts
. And most of the time, they only have to setSparseBucketsFactor
to activate the feature. (In production, setting aSparseBucketsMaxNumber
is highly recommended, though. See doc comment.)IMPORTANT REQUEST: This has been a long-lived feature branch, from which presumably other developers have branched off again. I beg you to properly merge this branch and not squash or rebase it, so that the commits stay compatible with those in other branches. We tried to keep the commits meaningful and not too spammy.
As a final note a bit of reasoning why I kept the nomenclature of "sparse buckets" here: On the Prometheus side, "sparse histograms" turned out to be a misleading word. While the native histograms indeed have sparse buckets, that's only describing one of their many features. It was better to use a name that describes the underlying property that unlocked all those features, namely that histograms aren't emulated anymore by a number of separate float series (for each bucket, count, and sum) but now histograms are 1st class citizens, where the sample of a single series can be a fully fledged (on "native") histogram. client_golang stayed faithful to the original data model of the Prometheus exposition format, which (not by coincidence) always treated histograms as first class citizens. Calling the new histograms "native" is confusing on the client_golang side, because they were indeed always native. That's also the reason why we only need to add a few more fields and "new style" buckets to the existing protobuf exposition format. For the nomenclature, we essentially only need a new word for the "new style" buckets. "Native buckets" doesn't really nail it. "Native histogram buckets" would be correct (as in: those new-style buckets that enable Prometheus to ingest the histogram natively), but it's too long. "Exponential buckets" and "high-res buckets" are not really expressing the important difference because you could configure the conventional buckets to be exponential or high-res already. "Dynamic buckets" and "sparse buckets" are the terms that remain, and I simply opted for the term we are already familiar with.