Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http.server.duration metric includes high cardinality attr: http.client_ip #3765

Closed
vmihailenco opened this issue May 1, 2023 · 16 comments
Closed

Comments

@vmihailenco
Copy link
Contributor

vmihailenco commented May 1, 2023

Meaning that there are as many timeseries as you have visitors. Other http.server.* metrics have the same problem.

Those metrics also include such attrs as http.user_agent, net.sock.peer.addr, net.sock.peer.port, which should also be removed.

@dmathieu
Copy link
Member

dmathieu commented May 3, 2023

We should split the httpconv methods to handle traces and metrics differently. Metrics need low cardinality, while the entire point of traces is to have high cardinality.
If we just remove the attributes, that will cause trace issues such as #3743.

@rangzen
Copy link

rangzen commented May 3, 2023

The correct attributes already exist, cf. #3182, and the last comment by @mlaflamm.

Before this commit, attributes were:

attributes := append(labeler.Get(), semconv.HTTPServerMetricAttributesFromHTTPRequest(h.operation, r)...)		

@rangzen
Copy link

rangzen commented May 3, 2023

@MrAlias do you remember why these changes by 7c3b1f6?

@youngbupark
Copy link

youngbupark commented May 4, 2023

+1 This results in the high cardinality of request metrics and memory leak in our project. Currently, I am adding the workaround to set req.RemoteAddr to empty before calling otelhttp middleware. But, I hope it will be fixed in the next release.

@rangzen
Copy link

rangzen commented May 4, 2023

Another workaround is to get back to go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.37.0...

youngbupark pushed a commit to radius-project/radius that referenced this issue May 4, 2023
# Description

otelhttp has the high cardinality problem which record all remote
address from client. This issue is tracked by
open-telemetry/opentelemetry-go-contrib#3765 .

This PR is adding workaround to prevent otelhttp from recording
remoteaddr as an attribute.

## Issue reference

<!--
We strive to have all PR being opened based on an issue, where the
problem or feature have been discussed prior to implementation.
-->

#5299
#5237

## Checklist

Please make sure you've completed the relevant tasks for this PR, out of
the following list:

* [ ] Code compiles correctly
* [ ] Adds necessary unit tests for change
* [ ] Adds necessary E2E tests for change
* [ ] Unit tests passing
* [ ] Extended the documentation / Created issue for it

---------

Co-authored-by: Yetkin Timocin <ytimocin@microsoft.com>
@zsgilber
Copy link

Hi all, can we get an acknowledgement that this a real bug, and specifically that this current implementation of the metrics does not match the specification, which restricts those high cardinality attributes to just HTTP clients:

image

@hasheddan
Copy link

Ran into this issue, which I imagine most folks are going to hit when upgrading. It is also applicable to otelgrpc (#3071). The cleanest solution I found in the short-term was to utilize views to filter out the high-cardinality attributes. Interestingly the otel-collector folks hit this same issue with its internal metrics. They employed a similar strategy: open-telemetry/opentelemetry-collector#7543

@youngbupark
Copy link

youngbupark commented May 30, 2023

Another workaround is to get back to go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.37.0...

It doesn't work for my case. The older one doesn't emit status code I need to use. I am filtering out req.RemoteAddr now.

@davendu
Copy link

davendu commented Jun 12, 2023

This is causing issues in our production environment. Our service is not reporting metrics due to the gRPC message too large. I converted the oversized gRPC message into JSON and found otelhttp 0.41.1 is too heavy. The message is then reject by our collector, because it would cause issue for our database.

It seems http.server.request_content_length, http.server.response_content_length and http.server.duration metrics are recording user agent and ip & port info as attributes. It will cause pretty high cardinality, becuase each new attribute requires a whole new bucket, enriching the telemetry message.

Should we consider using a new set of attributes and limit cardinality for metrics?

o := metric.WithAttributes(attributes...)
h.counters[RequestContentLength].Add(ctx, bw.read, o)
h.counters[ResponseContentLength].Add(ctx, rww.written, o)
// Use floating point division here for higher precision (instead of Millisecond method).
elapsedTime := float64(time.Since(requestStartTime)) / float64(time.Millisecond)
h.valueRecorders[ServerLatency].Record(ctx, elapsedTime, o)

davendu added a commit to davendu/opentelemetry-go-contrib that referenced this issue Jun 13, 2023
davendu added a commit to davendu/opentelemetry-go-contrib that referenced this issue Jun 13, 2023
davendu added a commit to davendu/opentelemetry-go-contrib that referenced this issue Jun 13, 2023
@krak3n
Copy link

krak3n commented Jun 19, 2023

We've had to implement our own middleware to remove these labels from metrics. There needs to be a clear distinction between attributes that are not safe for metrics but are fine for tracing.

@davendu
Copy link

davendu commented Jun 19, 2023

We've had to implement our own middleware to remove these labels from metrics. There needs to be a clear distinction between attributes that are not safe for metrics but are fine for tracing.

dmathieu said in its comment about a feature called "view", being discussed in open-telemetry/opentelemetry-specification#3061, and seems reasonable.

@krak3n
Copy link

krak3n commented Jun 20, 2023

Thanks @dmathieu

We recently upgraded to the latest metric SDK versions (was a pain 😅) and have started using views. It seems the Go Metric SDK does support a attribute filter function you can apply to a view:

https://pkg.go.dev/go.opentelemetry.io/otel/sdk/metric#Stream
https://pkg.go.dev/go.opentelemetry.io/otel/attribute#Filter

So this should already be possible to achieve and does seem like a nice solution, thanks for the hint 🙌

@jlordiales
Copy link

is this still considered an issue with the instrumentation library? Or with the way the semconv package defines client/server attributes?

The spec makes a clear distinction between metric and traces attributes so it feels this instrumentation should as well?

@dprotaso
Copy link

I believe this is fixed by #4277 ?

@dmathieu
Copy link
Member

Yes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants