
promql: Fix limiting of extrapolation to negative values #13725

Merged · 1 commit merged into main from beorn7/promql2 on Mar 7, 2024
Conversation

@beorn7 (Member) commented Mar 7, 2024

This is a bit tough to explain, but I'll try:

`rate` & friends have a sophisticated extrapolation algorithm. Usually, we extrapolate the result to the total interval specified in the range selector. However, if the first sample within the range is too far away from the beginning of the interval, or if the last sample within the range is too far away from the end of the interval, we assume the series has just started half a sampling interval before the first sample or ended half a sampling interval after the last sample, respectively, and shorten the extrapolation interval correspondingly. We calculate the sampling interval by looking at the average time between samples within the range, and we define "too far away" as "more than 110% of that sampling interval".
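For illustration only, here is a minimal sketch of that window-selection rule (the function and variable names are made up for this example and simplified relative to the actual code):

```go
package main

import "fmt"

// extrapolationWindow returns how far (in seconds) to extrapolate beyond the
// first (or last) sample on one side of the range, given the gap between that
// sample and the range boundary and the average time between samples.
func extrapolationWindow(gapToBoundary, avgSampleInterval float64) float64 {
	threshold := avgSampleInterval * 1.1
	if gapToBoundary < threshold {
		// The sample is close enough to the boundary: extrapolate all the
		// way to the boundary of the range selector.
		return gapToBoundary
	}
	// The sample is "too far away": assume the series starts (or ends)
	// half a sampling interval beyond it.
	return avgSampleInterval / 2
}

func main() {
	// Samples every 10s on average; first sample 4s after the range start,
	// last sample 15s before the range end.
	fmt.Println(extrapolationWindow(4, 10))  // 4 -> extrapolate to the range start
	fmt.Println(extrapolationWindow(15, 10)) // 5 -> only half a sampling interval
}
```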

However, if this algorithm leads to an extrapolated starting value that is negative, we limit the start of the extrapolation interval to the point where the extrapolated starting value is zero.

At least that was the intention.

What we actually implemented is the following: If extrapolating all the way to the beginning of the total interval would lead to an extrapolated negative value, we would only extrapolate to the zero point as above, even if the algorithm above would have selected a starting point that is just half a sampling interval before the first sample and that starting point would not have an extrapolated negative value. In other words: what was meant as a _limitation_ of the extrapolation interval yielded a _longer_ extrapolation interval in this case.

There is an exception to the case just described: If the increase of the extrapolation interval is more than 110% of the sampling interval, we suddenly drop back to extrapolating by only half a sampling interval.

This behavior can be nicely seen in the testcounter_zero_cutoff test, where the rate goes up all the way to 0.7 and then jumps back to 0.6.

This commit changes the behavior to what was (presumably) intended from the beginning: The extension of the extrapolation interval is only limited if actually needed to prevent extrapolation to negative values, but the "limitation" never leads to _more_ extrapolation anymore.
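As a rough sketch of the before/after behavior (again with made-up names, not the literal diff), the change amounts to applying the zero cutoff after the 110% check instead of before it, so that it can only ever shorten the window:

```go
package main

import "fmt"

// startWindowOld models the behavior described above as actually implemented:
// the zero cutoff replaces the gap *before* the 110% check, so it can yield a
// longer extrapolation window than the half-interval fallback.
func startWindowOld(durationToStart, avgInterval, durationToZero float64) float64 {
	if durationToZero < durationToStart {
		durationToStart = durationToZero
	}
	if durationToStart < avgInterval*1.1 {
		return durationToStart
	}
	return avgInterval / 2
}

// startWindowFixed models the intended behavior: the 110% rule picks the
// window first, and the zero cutoff can only ever shorten it.
func startWindowFixed(durationToStart, avgInterval, durationToZero float64) float64 {
	if durationToStart >= avgInterval*1.1 {
		durationToStart = avgInterval / 2
	}
	if durationToZero < durationToStart {
		durationToStart = durationToZero
	}
	return durationToStart
}

func main() {
	// Samples every 10s on average, first sample 60s after the range start,
	// extrapolated zero point 8s before the first sample.
	fmt.Println(startWindowOld(60, 10, 8))   // 8 -> longer than the intended 5
	fmt.Println(startWindowFixed(60, 10, 8)) // 5 -> half a sampling interval
	// With a zero point 12s before the first sample, both return 5: that is
	// the "exception" described above.
}
```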

The difference is subtle, and probably it never bothered anyone. However, if you calculate the rate of a classic histogram, the old behavior might create non-monotonic histograms as a result (because of the jumps you can see nicely in the old version of the testcounter_zero_cutoff test). With this fix, that doesn't happen anymore.

@beorn7 (Member, Author) commented Mar 7, 2024

@zenador here is the promised fix.

@beorn7 (Member, Author) commented Mar 7, 2024

This should fix #13671.

@juliusv (Member) commented Mar 7, 2024

I had to read that multiple times before I think I understood it. So here's an illustration of the issue, I hope it's correct:

(image: incorrect-rate-extrapolation)

@juliusv (Member) commented Mar 7, 2024

👍

@beorn7 (Member, Author) commented Mar 7, 2024

I had scribbled a graph on paper, but it was not legible enough to put a scan here. 😁

@juliusv your graph is almost correct. But the zero point is actually far enough from the first sample (more than `sampleInterval * 1.1`) that it would fall back to `sampleInterval / 2`. So the problem only happens if the zero point is closer than `sampleInterval * 1.1` but farther than `sampleInterval / 2`.
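To put hypothetical numbers on that window: with samples arriving every 10s on average (and the first sample far enough from the start of the range), both versions agree if the extrapolated zero point lies less than 5s (half a sampling interval) or more than 11s (110% of the sampling interval) before the first sample; only in the band between 5s and 11s did the old code extrapolate further back than intended.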

@beorn7 beorn7 merged commit b0c0961 into main Mar 7, 2024
40 checks passed
@beorn7 beorn7 deleted the beorn7/promql2 branch March 7, 2024 11:36
@juliusv (Member) commented Mar 7, 2024

Ah, thanks for clarifying! Indeed so complicated to think about!

krajorama added a commit to grafana/mimir that referenced this pull request Mar 12, 2024
Includes fix to rate calculation
prometheus/prometheus#13725
And some native histograms
prometheus/prometheus#13681

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
krajorama added a commit to grafana/mimir that referenced this pull request Mar 12, 2024
* Vendor update mimir-prometheus at 2b3d9c010123

Includes fix to rate calculation
prometheus/prometheus#13725
And some native histograms
prometheus/prometheus#13681

* Copy private const strings from Alertmanager.

Note: these are again public on main in alertmanager.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Itaykal pushed a commit to Itaykal/mimir that referenced this pull request Mar 13, 2024
* Vendor update mimir-prometheus at 2b3d9c010123

Includes fix to rate calculation
prometheus/prometheus#13725
And some native histograms
prometheus/prometheus#13681

* Copy private const strings from Alertmanager.

Note: these are again public on main in alertmanager.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
charleskorn added a commit to grafana/mimir that referenced this pull request Apr 2, 2024
charleskorn added a commit to grafana/mimir that referenced this pull request Apr 9, 2024
* Use forked Prometheus module

* Use promql.QueryEngine rather than *promql.Engine, add config option to select engine

* Add parallel query path with streaming engine.

* Add IntelliJ debugging configuration for streaming querier.

* Add Grafana datasource for streaming querier

* Use correct query-scheduler port and enable debugging

* Set up load test.

* Fix p99 latency panel

* Commit settings used in previous test.

* Enable chunks streaming.

* Disable load generator.

* Create separate load testing query paths to ensure meta-monitoring queries don't interfere with load test results.

* Add initial k6-based load test script

* Add script to run load tests against both engines.

* Add todo

* Extract friendly name to hostname logic to separate script

* Rename k6 output file

* Check both engines produce the same result.

* Summarise results after each test run.

* Enable streaming chunks from store-gateways to queriers.

* Refactor test script, print max and average CPU and memory after test

* Reorder variables to match output order

* Don't include query name in k6 output file name.

* Add initial script to generate data.

* Rename script

* Add script to generate data generation config.

* Make it easier to test different queries.

* Rename PromQL engine CLI flag.

* Add Thanos' engine.

* Add Thanos engine to benchmarking setup.

* Set default test duration to 5 minutes.

* Rework data generation setup to work around limitation in prometheus-toolbox.

* Load test data faster.

* Record test config for first test

* Record test config for second test

* Record test config for third test

* Disable gRPC compression for ingester client

* Record test config for re-run of second test

* Record test config for re-run of first test

* Record test config for re-run of third test

* Log URL used for test.

* Record test config for fourth test

* Record test config for fifth test

* Record test config for 10k series test

* Record test config for 1k series test

* Record test config for 50k series test

* Fix configuration issue that does not affect outcome of testing.

This does not affect the outcome of testing because all tests use
queriers directly (ie. do not use query-frontends or query-schedulers)

* Record test config for seventh test

* Move production code to main Mimir code tree

* Copy test files into repo

* Add license and provenance comments

* Add a readme.

* Update package path in scripts and remove unnecessary testing script

* Add more notes to code

* Explain benefit of streaming engine.

* Mention benchmarking tools in readme.

* Remove references to Thanos' engine.

* Undo unnecessary go.mod changes

* Remove benchmarking tool for now

We can add something more robust in the future if we need it.

* Remove docker-compose changes

* Reset whitespace changes

* Remove readme

* Vendor in branch of mimir-prometheus with changes from prometheus/prometheus#13713

* Fix issues introduced with recent Prometheus changes and fix linting issues

* Detect and reject configuration not supported by streaming engine

* Remove TODOs tracked elsewhere

* Bring in latest changes to `QueryEngine` interface

* Remove more TODOs tracked elsewhere

* Change signature of `Next()` to return an error when the stream is exhausted

* Ignore warnings when comparing results from Prometheus and streaming engines.

* Correct naming

* Rename `Operator` to clarify that it produces instant vectors.

* Rename test file

* Reorganise test file

* Bring in latest benchmark changes from upstream.

* Run tests and benchmarks with isolation disabled, to mirror what Mimir itself does

* Fix script permissions

* Simplify test setup and don't bother testing 2000 series case to speed up tests.

* Use assertion helpers in benchmark

* Improve engine benchmark comparison script: fix issue with package name containing "streaming", clearly label results

* Remove unnecessary files

* Add note to engine benchmark comparison script

* Make test names clearer

* Ensure both engines produce the same result before benchmarking

* Add tests for unsupported expressions and improve error messages returned when unsupported expression used.

* Fix linting warnings

* Add notes for invalid cases that should be caught by the PromQL parser.

* Implement `Query.Statement()`

* Give variable a better name

* Remove Pool interface

* Remove unnecessary types

* Update comments

* Remove unused lookback delta for range vector selector

* Extract selector logic to a shared type

* Add more TODOs

* Use Selector for range vector selectors.

* Compute values used on every call to Next() once in instant and range vector selectors.

This gives a 0-7% latency improvement for queries with many series at
small absolute cost to queries with few series.

Memory consumption is slightly higher (tens of bytes), but not
concerning.

* Remove unnecessary script and update comment

* Validate query time range

* Enforce that range query expression produces a vector or scalar result.

* Add CLI flag to experimental features list in docs

* Add changelog entry

* Fix linting warnings

* Move common PromQL engine configuration to its own package

* Don't bother with 10 series benchmarks.

They're not very interesting, and the benchmarks take too long to run.

* Sort imports

* Add missing license header.

* Use bucketed pool based on zeropool.

This greatly improves latency and allocations, with latency
improvements between 0 and 15% on our benchmarks.

* Fix import sorting

* Set upper limits for sizes of pooled slices.

* Set size for SeriesBatch pool

* Add integration test.

* Move RingBuffer out of util package and pool slices used.

This improves latency of rate-related benchmarks by between 0 and 11%.

* Add tests for ring buffer type.

* Return RingBuffer point slices to pool when no longer required.

* Fix linting

* Remove TODOs about pooling group slices and maps.

Pooling these slices did not have a substantial performance impact.

* Pool groups in aggregation operator.

This reduces latency of aggregation operations by 0-2% in our
benchmarks.

* Remove TODOs about pooling label builders.

Pooling these had a negative performance impact.

* Remove more TODOs, clarify interface expectations

* Release memory sooner after query execution

* Fix typo in comment

* Remove unnecessary context cancellation checks from operators.

Context cancellation is handled by the underlying queryable, at least
for the PromQL expressions we currently support.

* Fix linting warnings

* Address remaining TODOs in Selector

* Fix integration test flakiness

* Don't panic when encountering a native histogram

* Fix issue where bucketed pool could return a slice with capacity less than the requested size.

* Remove redundant checks

* Add support for @ in instant and range vector selectors.

This is used extensively by Prometheus' test scripts, so it's easier
to just implement support for this rather than try to work around it
or disable it.

* Run Prometheus' test cases against streaming engine.

There are still some failing test cases, but these appear to be
genuine issues that need to be investigated.

* Fix failing staleness test: don't emit empty series for instant query

* Upgrade mimir-prometheus

* Add test case for stale markers in a range query.

* Add test cases for @ modifier with range queries

* Check for errors while reading test script

* Fix whitespace

* Remove duplicate timestamp computation in Selector and operators

* Update `rate` calculation to match behaviour in prometheus/prometheus#13725

* Add an overview of the engine.

* Fix linting issues and clarify docs.

* Add test case for scenario where input to aggregation returns no results or no points, and remove stale TODOs

* Implement `Query.String()`, and remove stale TODOs.

* Clarify BucketedPool contract, and add test case for case where min size is greater than 1.

* Expand test coverage to include scenarios not covered by upstream tests or those uniquely likely to cause issues in the streaming engine.

* Elaborate on comment

* Expand example in docs and fix typo

* Remove unnecessary sorting and use non-experimental slices package in comparison tests.

* Add missing comments.

* Clean up temporary files in benchmark comparison script, and improve formatting of output

* Add note explaining purpose of script.

* Address PR feedback

* Replace use of `DurationMilliseconds` with `d.Milliseconds()`

* Rename `Series()` to `SeriesMetadata()`

* Include names of acceptable values in CLI flag description

* Remove TODOs

* Add explanation for RangeVectorSelectorWithTransformation

* Move to top-level `streamingpromql` package

* Move common query engine config back to original location (revert e3933a2)