Investigate if attribute filtering should be in the instrument or aggregator #3011

MrAlias · 2022-07-12T17:08:26Z

Investigate which approach performs better:

- Filter attributes at the instrument level when a "record" operation is called.
- Filter attributes at the aggregator level when an Aggregation collection is called.

Ensure we implement the better-performing one.

jmacd · 2022-07-18T21:28:02Z

IMO filtering is better done in the aggregator, since the cost will be once per export interval instead of once per observation.
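The cost difference between the two options can be sketched with a toy model. This is not the SDK's implementation: `Attrs`, `OnRecord`, `OnCollect`, `keep`, and `run` are hypothetical names, a plain map stands in for `attribute.Set`, and a sum aggregation is used because merging sums at collection is unambiguous. The point is only to count how often the filter runs under each strategy.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Attrs is a hypothetical, simplified attribute set. Its key method stands in
// for the comparable attribute.Set the SDK uses as a map key.
type Attrs map[string]string

func (a Attrs) key() string {
	parts := make([]string, 0, len(a))
	for k, v := range a {
		parts = append(parts, k+"="+v)
	}
	sort.Strings(parts)
	return strings.Join(parts, ",")
}

// keep returns a copy of a holding only the allowed keys.
func keep(a Attrs, allowed map[string]bool) Attrs {
	out := Attrs{}
	for k, v := range a {
		if allowed[k] {
			out[k] = v
		}
	}
	return out
}

// OnRecord filters every measurement as it is recorded.
type OnRecord struct {
	Allowed     map[string]bool
	Sums        map[string]int64
	FilterCalls int
}

func (s *OnRecord) Record(v int64, a Attrs) {
	s.FilterCalls++
	s.Sums[keep(a, s.Allowed).key()] += v
}

type entry struct {
	attrs Attrs
	sum   int64
}

// OnCollect stores unfiltered sets and filters once per distinct set at
// collection time, merging sets that become equal after filtering.
type OnCollect struct {
	Allowed     map[string]bool
	Raw         map[string]entry
	FilterCalls int
}

func (s *OnCollect) Record(v int64, a Attrs) {
	k := a.key()
	e := s.Raw[k]
	e.attrs, e.sum = a, e.sum+v
	s.Raw[k] = e
}

func (s *OnCollect) Collect() map[string]int64 {
	out := map[string]int64{}
	for _, e := range s.Raw {
		s.FilterCalls++
		out[keep(e.attrs, s.Allowed).key()] += e.sum
	}
	return out
}

// run records perSet measurements for each of sets distinct attribute sets
// with both strategies, then collects once. Only "user" passes the filter.
func run(sets, perSet int) (*OnRecord, *OnCollect, map[string]int64) {
	allowed := map[string]bool{"user": true}
	r := &OnRecord{Allowed: allowed, Sums: map[string]int64{}}
	c := &OnCollect{Allowed: allowed, Raw: map[string]entry{}}
	for m := 0; m < perSet; m++ {
		for i := 0; i < sets; i++ {
			a := Attrs{"user": "alice", "value": fmt.Sprint(i)}
			r.Record(1, a)
			c.Record(1, a)
		}
	}
	return r, c, c.Collect()
}

func main() {
	r, c, collected := run(3, 100)
	fmt.Println(r.FilterCalls, c.FilterCalls)                  // 300 3
	fmt.Println(r.Sums["user=alice"], collected["user=alice"]) // 300 300
}
```

Both strategies report the same sum here, but the filter runs once per measurement in the first and once per distinct unfiltered attribute set per collection in the second.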

MrAlias · 2023-12-18T19:07:40Z

Initial benchmarking for the last-value aggregation (the simplest one) shows the expected performance differences.

Benchmark:

func benchmarkFiltered[N int64 | float64](factory func(attribute.Filter) (Measure[N], ComputeAggregation)) func(*testing.B) {
	nAttr := []int{1, 10, 100}       // Number of distinct attribute sets.
	nMeas := []int{1, 10, 100, 1000} // Number of measurements made per attribute set.
	return func(b *testing.B) {
		for _, attributeCap := range nAttr {
			for _, measurements := range nMeas {
				name := fmt.Sprintf("Attributes/%d/Measurements/%d", attributeCap, measurements)
				b.Run(name, func(b *testing.B) {
					attrs := make([]attribute.Set, attributeCap)
					for i := range attrs {
						attrs[i] = attribute.NewSet(
							userAlice,
							attribute.Int("value", i),
						)
					}

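					// bmarkRes, attrFltr, and userAlice are not defined in this
					// function; they come from elsewhere in the test package.
					got := &bmarkRes
					ctx := context.Background()
					meas, comp := factory(attrFltr)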

					b.ReportAllocs()
					b.ResetTimer()
					for n := 0; n < b.N; n++ {
						for m := 0; m < measurements; m++ {
							for _, attr := range attrs {
								meas(ctx, 1, attr)
							}
						}

						assert.Equal(b, 1, comp(got), "attributes not filtered")
					}
				})
			}
		}
	}
}

$ go test -run='^$' -bench=LastValue/Filtered/Int64 -count=10 > old.txt  # run on main
...
$ go test -run='^$' -bench=LastValue/Filtered/Int64 -count=10 > new.txt  # run on test branch
...
$ benchstat old.txt new.txt
goos: linux
goarch: amd64
pkg: go.opentelemetry.io/otel/sdk/metric/internal/aggregate
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
                                                            │   old.txt   │                new.txt                │
                                                            │   sec/op    │    sec/op     vs base                 │
LastValue/Filtered/Int64/Attributes/1/Measurements/1-8        1.154µ ± 3%    1.779µ ± 2%   +54.23% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/1/Measurements/10-8       4.776µ ± 3%    3.772µ ± 1%   -21.03% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/1/Measurements/100-8      40.46µ ± 4%    23.15µ ± 2%   -42.79% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/1/Measurements/1000-8     398.4µ ± 3%    218.9µ ± 6%   -45.06% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/10/Measurements/1-8       4.836µ ± 2%   10.019µ ± 2%  +107.18% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/10/Measurements/10-8      40.57µ ± 3%    30.68µ ± 1%   -24.39% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/10/Measurements/100-8     406.0µ ± 3%    240.3µ ± 1%   -40.82% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/10/Measurements/1000-8    4.088m ± 3%    2.349m ± 1%   -42.55% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/100/Measurements/1-8      41.33µ ± 2%    87.16µ ± 1%  +110.90% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/100/Measurements/10-8     411.0µ ± 3%    291.8µ ± 1%   -29.00% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/100/Measurements/100-8    4.058m ± 3%    2.417m ± 1%   -40.42% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/100/Measurements/1000-8   40.43m ± 2%    22.82m ± 6%   -43.56% (p=0.000 n=10)
geomean                                                       144.2µ         119.3µ        -17.28%

                                                            │     old.txt     │               new.txt                │
                                                            │      B/op       │     B/op      vs base                │
LastValue/Filtered/Int64/Attributes/1/Measurements/1-8             216.0 ± 0%     328.0 ± 0%  +51.85% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/1/Measurements/10-8           1944.0 ± 0%     327.0 ± 0%  -83.18% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/1/Measurements/100-8         19224.0 ± 0%     328.0 ± 0%  -98.29% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/1/Measurements/1000-8       192024.0 ± 0%     327.5 ± 0%  -99.83% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/10/Measurements/1-8          1.898Ki ± 0%   2.057Ki ± 0%   +8.33% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/10/Measurements/10-8        18.773Ki ± 0%   2.056Ki ± 0%  -89.05% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/10/Measurements/100-8      187.523Ki ± 0%   2.055Ki ± 0%  -98.90% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/10/Measurements/1000-8    1875.026Ki ± 0%   2.058Ki ± 1%  -99.89% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/100/Measurements/1-8         18.77Ki ± 0%   18.93Ki ± 0%   +0.83% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/100/Measurements/10-8       187.52Ki ± 0%   18.93Ki ± 0%  -89.90% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/100/Measurements/100-8     1875.03Ki ± 0%   18.97Ki ± 0%  -98.99% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/100/Measurements/1000-8   18750.06Ki ± 0%   19.43Ki ± 0%  -99.90% (p=0.000 n=10)
geomean                                                          60.02Ki        2.323Ki       -96.13%

                                                            │    old.txt    │              new.txt               │
                                                            │   allocs/op   │ allocs/op   vs base                │
LastValue/Filtered/Int64/Attributes/1/Measurements/1-8           3.000 ± 0%   4.000 ± 0%  +33.33% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/1/Measurements/10-8         21.000 ± 0%   4.000 ± 0%  -80.95% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/1/Measurements/100-8       201.000 ± 0%   4.000 ± 0%  -98.01% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/1/Measurements/1000-8     2001.000 ± 0%   4.000 ± 0%  -99.80% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/10/Measurements/1-8          21.00 ± 0%   22.00 ± 0%   +4.76% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/10/Measurements/10-8        201.00 ± 0%   22.00 ± 0%  -89.05% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/10/Measurements/100-8      2001.00 ± 0%   22.00 ± 0%  -98.90% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/10/Measurements/1000-8    20001.00 ± 0%   22.00 ± 0%  -99.89% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/100/Measurements/1-8         201.0 ± 0%   203.0 ± 0%   +1.00% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/100/Measurements/10-8       2001.0 ± 0%   203.0 ± 0%  -89.86% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/100/Measurements/100-8     20001.0 ± 0%   203.0 ± 0%  -98.99% (p=0.000 n=10)
LastValue/Filtered/Int64/Attributes/100/Measurements/1000-8   200001.0 ± 0%   203.0 ± 0%  -99.90% (p=0.000 n=10)
geomean                                                          660.4        26.14       -96.04%

In the trivial case, where there is one measurement per distinct attribute set, there is a decrease in CPU performance and an increase in memory use. This makes sense, as the backing array for the aggregate will be the performance bottleneck, not the filtering computation.

However, for more realistic workloads, where there are many measurements per distinct attribute set, CPU performance increased and memory use decreased. More importantly, allocations scaled as O(N), for N being the number of distinct attribute sets, instead of O(N+M), for M being the number of measurements made in a collection. This allocation detail is not reflected in the CPU performance, but it will be a major factor in real-world scenarios because of the GC pressure that will be removed (in all but the trivial case).
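The scaling can be checked against the allocs/op table above. The sketch below copies the twelve reported rows and tests two observations: the old.txt column matches 2 × (attribute sets × measurements per set) + 1, and the new.txt column does not change with measurements per set. The formula is an empirical fit to the reported numbers, not something taken from the SDK code, and `row`, `oldFit`, `fitsOld`, and `newConstantPerSet` are names made up for this check.

```go
package main

import "fmt"

// row is one line of the allocs/op table above.
type row struct {
	attrSets, perSet     int // benchmark parameters
	oldAllocs, newAllocs int // reported allocs/op for old.txt and new.txt
}

var rows = []row{
	{1, 1, 3, 4}, {1, 10, 21, 4}, {1, 100, 201, 4}, {1, 1000, 2001, 4},
	{10, 1, 21, 22}, {10, 10, 201, 22}, {10, 100, 2001, 22}, {10, 1000, 20001, 22},
	{100, 1, 201, 203}, {100, 10, 2001, 203}, {100, 100, 20001, 203}, {100, 1000, 200001, 203},
}

// oldFit is an empirical fit to the old.txt column: two allocations per
// measurement plus one (an assumption drawn from the table only).
func oldFit(attrSets, perSet int) int { return 2*attrSets*perSet + 1 }

// fitsOld reports whether every old.txt row matches oldFit.
func fitsOld() bool {
	for _, r := range rows {
		if oldFit(r.attrSets, r.perSet) != r.oldAllocs {
			return false
		}
	}
	return true
}

// newConstantPerSet reports whether new.txt allocations depend only on the
// number of distinct attribute sets, not on measurements per set.
func newConstantPerSet() bool {
	seen := map[int]int{}
	for _, r := range rows {
		if v, ok := seen[r.attrSets]; ok && v != r.newAllocs {
			return false
		}
		seen[r.attrSets] = r.newAllocs
	}
	return true
}

func main() {
	fmt.Println(fitsOld(), newConstantPerSet()) // true true
}
```

In other words, on main the allocations grow with the total number of measurements per collection, while on the test branch they stay at 4, 22, and 203 for 1, 10, and 100 distinct attribute sets.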

This initial testing shows this change should continue to be pursued.

MrAlias · 2023-12-19T17:43:45Z

Cardinality limiting is going to complicate this. Limiting is currently done on the measurement of values. However, with this filtering being done on collection the current limiting will "over limit".
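A toy model of the "over limit" interaction, under stated assumptions: the limit counts streams including one overflow stream, a new attribute set is redirected to overflow once `limit-1` streams exist, and the filter keeps only `user`. The types `raw` and `kept` and the functions `filterOnRecord` and `filterOnCollect` are hypothetical and only illustrate the ordering problem.

```go
package main

import "fmt"

// raw is an unfiltered attribute set; kept is what remains after the filter
// drops "value". Both are hypothetical stand-ins for attribute.Set.
type raw struct {
	user     string
	value    int
	overflow bool
}

type kept struct {
	user     string
	overflow bool
}

func filter(r raw) kept { return kept{user: r.user, overflow: r.overflow} }

// filterOnRecord filters first, so the limit sees filtered attribute sets.
func filterOnRecord(ms []raw, limit int) map[kept]int64 {
	out := map[kept]int64{}
	for _, m := range ms {
		k := filter(m)
		if _, ok := out[k]; !ok && len(out) >= limit-1 {
			k = kept{overflow: true}
		}
		out[k]++
	}
	return out
}

// filterOnCollect limits on unfiltered sets at measurement time and only
// filters (and merges) when collecting.
func filterOnCollect(ms []raw, limit int) map[kept]int64 {
	stored := map[raw]int64{}
	for _, m := range ms {
		if _, ok := stored[m]; !ok && len(stored) >= limit-1 {
			m = raw{overflow: true}
		}
		stored[m]++
	}
	out := map[kept]int64{}
	for r, v := range stored {
		out[filter(r)] += v
	}
	return out
}

// measurements returns n measurements that differ only in the dropped key.
func measurements(n int) []raw {
	ms := make([]raw, n)
	for i := range ms {
		ms[i] = raw{user: "alice", value: i}
	}
	return ms
}

func main() {
	alice, over := kept{user: "alice"}, kept{overflow: true}

	a := filterOnRecord(measurements(10), 3)
	fmt.Println(a[alice], a[over]) // 10 0

	b := filterOnCollect(measurements(10), 3)
	fmt.Println(b[alice], b[over]) // 2 8
}
```

After filtering there is only one distinct attribute set, well under the limit of 3, yet the collect-time variant sends 8 of the 10 measurements to the overflow stream because the limit was applied to the unfiltered sets.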

MrAlias · 2023-12-19T19:36:04Z

Cardinality limiting is going to complicate this. Limiting is currently done on the measurement of values. However, with this filtering being done on collection the current limiting will "over limit".

This interaction needs to be brought to the specification prior to the cardinality limit being stabilized.

MrAlias · 2023-12-19T21:43:00Z

Cardinality limiting is going to complicate this. Limiting is currently done on the measurement of values. However, with this filtering being done on collection the current limiting will "over limit".

This interaction needs to be brought to the specification prior to the cardinality limit being stabilized.

open-telemetry/opentelemetry-specification#3798

MrAlias · 2024-01-29T20:00:56Z

Moving out of the post-GA project. There is no clear consensus on how to resolve open-telemetry/opentelemetry-specification#3803.

Given that the solution to this will require inconsistent attribute filter values in favor of performance, there needs to be a strong user desire to see this before it warrants the developer commitment.

MrAlias added the pkg:SDK and area:metrics labels Jul 12, 2022

MrAlias added this to the Metric SDK: Beta milestone Jul 12, 2022

MadVikingGod mentioned this issue Jul 25, 2022

Adds a filter Aggregator. #3040

Merged

MrAlias removed this from the Metric SDK: Beta milestone Oct 20, 2022

MrAlias self-assigned this Dec 15, 2023

MrAlias mentioned this issue Dec 18, 2023

Add exemplars to the metric SDK as an experimental feature #4455

Closed

MrAlias mentioned this issue Jan 8, 2024

Filter metric time series attributes instead of measurements #4816

Closed

MrAlias removed their assignment Jan 29, 2024

MrAlias added the blocked:design label Jan 29, 2024

pellared added the blocked:specification label Feb 7, 2024