
Use (Async) ExecChainHandler to measure IOExceptions (#3800) #3801

Closed
wants to merge 20 commits into from

Conversation

cachescrubber
Contributor

@cachescrubber cachescrubber commented May 2, 2023

Use HttpRequestRetryStrategy to measure IOExceptions.
Fixes #3800, #3797, #3706

@cachescrubber
Contributor Author

The only solution I could find to measure errors in HttpAsyncClient is to use a delegating HttpRequestRetryStrategy which measures after the very last retry. WDYT? Are there other possibilities? The code would need some polishing; I just want to verify the suggested solution with you.
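The delegating idea above can be sketched with a simplified, hypothetical stand-in for hc5's HttpRequestRetryStrategy (the real interface also receives the request/response and the HttpContext); RetryStrategy and MeasuringRetryStrategy here are illustrations, not Micrometer API:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Simplified stand-in for hc5's HttpRequestRetryStrategy: true = retry again.
interface RetryStrategy {
    boolean retryRequest(Exception cause, int execCount);
}

// Delegating strategy that records the failure only when the delegate
// declines a further retry, i.e. after the very last attempt.
class MeasuringRetryStrategy implements RetryStrategy {
    private final RetryStrategy delegate;
    final AtomicInteger recordedErrors = new AtomicInteger();

    MeasuringRetryStrategy(RetryStrategy delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean retryRequest(Exception cause, int execCount) {
        boolean retry = delegate.retryRequest(cause, execCount);
        if (!retry) {
            recordedErrors.incrementAndGet(); // meter the IOException here
        }
        return retry;
    }
}

public class Demo {
    public static void main(String[] args) {
        // Delegate mimicking the default behavior: allow a single retry.
        RetryStrategy oneRetry = (cause, execCount) -> execCount < 2;
        MeasuringRetryStrategy strategy = new MeasuringRetryStrategy(oneRetry);

        boolean first = strategy.retryRequest(new java.io.IOException("boom"), 1);
        boolean second = strategy.retryRequest(new java.io.IOException("boom"), 2);
        System.out.println(first + " " + second + " " + strategy.recordedErrors.get());
        // prints: true false 1 -- the error is recorded exactly once, after the last retry
    }
}
```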

@cachescrubber cachescrubber changed the title Test case to reproduce gh-3800 Use HttpRequestRetryStrategy to measure IOExceptions (#3800) May 5, 2023
@cachescrubber
Contributor Author

Unfortunately, there are cases where the HttpContext has not been initialized with the Timer.Resource sample.

When an error happens before the actual HTTP request is sent, the HttpRequestInterceptor that initializes the Timer.Resource sample is never called. This happens, for example, with IO errors such as ConnectionRefused.

Possibly the cause of #3797?

@cachescrubber
Contributor Author

@cachescrubber cachescrubber changed the title Use HttpRequestRetryStrategy to measure IOExceptions (#3800) Use (Async) ExecChainHandler to measure IOExceptions (#3800) May 13, 2023
@cachescrubber
Contributor Author

This is becoming something bigger than expected.

The current implementation for both async (MicrometerHttpClientInterceptor) and classic (MicrometerHttpRequestExecutor) is written in a way that

  • metrics do not include the time needed to establish a connection;
  • metrics do not count failed connection attempts;
  • async metrics do not even include IO errors on established connections (the original topic here).

Some experimentation using a HttpRequestRetryStrategy to meter IO errors felt awkward, and I was not able to capture errors during connection establishment.

So I asked the HttpClient team for advice on how to instrument HttpClient (async and classic). They recommended using an (Async) ExecChainHandler to capture execution time metrics.

I updated this PR (currently hc5 only) and implemented the suggested instrumentation.
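The recommended approach can be sketched with simplified stand-ins for hc5's ExecChain/ExecChainHandler (the real classic signature is `execute(ClassicHttpRequest, ExecChain.Scope, ExecChain)`). Because the handler wraps the rest of the chain, the recorded time covers connection setup performed further down, and the sample is still taken when an IOException escapes; all names here are hypothetical:

```java
import java.io.IOException;

// Simplified stand-ins for hc5's ExecChain and ExecChainHandler.
interface Chain { String proceed(String request) throws IOException; }
interface Handler { String execute(String request, Chain chain) throws IOException; }

// Times the whole exchange, including connection setup performed further
// down the chain, and records IO errors as well as successes.
class TimingHandler implements Handler {
    long lastNanos = -1;
    String lastOutcome;

    @Override
    public String execute(String request, Chain chain) throws IOException {
        long start = System.nanoTime();
        try {
            String response = chain.proceed(request);
            lastOutcome = "SUCCESS";
            return response;
        } catch (IOException ex) {
            lastOutcome = ex.getClass().getSimpleName();
            throw ex;
        } finally {
            lastNanos = System.nanoTime() - start; // record the timer sample here
        }
    }
}

public class Demo {
    public static void main(String[] args) {
        TimingHandler handler = new TimingHandler();
        try {
            // Downstream chain fails before any response, e.g. connection refused.
            handler.execute("GET /", req -> { throw new IOException("connection refused"); });
        } catch (IOException expected) { }
        System.out.println(handler.lastOutcome + " " + (handler.lastNanos >= 0));
        // prints: IOException true -- the error outcome and a duration were both captured
    }
}
```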

@cachescrubber
Contributor Author

HttpClient has a feature to perform automatic retries in case of specific status codes (503) or retriable IO exceptions. The default setup (DefaultHttpRequestRetryStrategy) is to retry once after a 1-second delay.

It is possible to meter the HTTP transaction combined or individually.

combined means

  • measure the total time over all attempts, including the delays between retries;
  • record the outcome of the last attempt.

individual means

  • measure the time for each individual request, excluding the delays between retries;
  • record the outcome of each attempt.
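A hypothetical sketch of why handler placement produces these two modes: a metering step that wraps the retry step is invoked once per transaction ("combined"), while a metering step below the retry step is invoked once per attempt ("individual"). Chain, metering, and retrying are stand-ins, not hc5 or Micrometer API:

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

// Simplified stand-in for an exec chain step.
interface Chain { String proceed(String req) throws IOException; }

public class Demo {
    // Counts invocations, standing in for one timer sample per call.
    static Chain metering(AtomicInteger samples, Chain next) {
        return req -> { samples.incrementAndGet(); return next.proceed(req); };
    }

    // Retries once on IOException, standing in for the retry exec step.
    static Chain retrying(Chain next) {
        return req -> {
            try { return next.proceed(req); }
            catch (IOException firstFailure) { return next.proceed(req); }
        };
    }

    public static void main(String[] args) throws IOException {
        AtomicInteger calls = new AtomicInteger();
        // Transport that fails on the first call and succeeds on the second.
        Chain transport = req -> {
            if (calls.incrementAndGet() == 1) throw new IOException("reset");
            return "200 OK";
        };

        // combined: metering wraps the retry step -> one sample for all attempts
        AtomicInteger combined = new AtomicInteger();
        calls.set(0);
        metering(combined, retrying(transport)).proceed("GET /");

        // individual: metering sits below the retry step -> one sample per attempt
        AtomicInteger individual = new AtomicInteger();
        calls.set(0);
        retrying(metering(individual, transport)).proceed("GET /");

        System.out.println(combined.get() + " " + individual.get());
        // prints: 1 2
    }
}
```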

    - Support ObservationRegistry in AsyncExecChainHandler
    - Configurable meterName
    - Introduce ApacheHttpClientMetricsBinder
@sonatype-lift
Contributor

sonatype-lift bot commented May 16, 2023

🛠 Lift Auto-fix

Some of the Lift findings in this PR can be automatically fixed. You can download and apply these changes in your local project directory of your branch to review the suggestions before committing. [1]

# Download the patch
curl https://lift.sonatype.com/api/patch/github.com/micrometer-metrics/micrometer/3801.diff -o lift-autofixes.diff

# Apply the patch with git
git apply lift-autofixes.diff

# Review the changes
git diff

Want it all in a single command? Open a terminal in your project's directory and copy and paste the following command:

curl https://lift.sonatype.com/api/patch/github.com/micrometer-metrics/micrometer/3801.diff | git apply

Once you're satisfied, commit and push your changes in your project.

Footnotes

  [1] You can preview the patch by opening the patch URL in the browser.

# Conflicts:
#	micrometer-core/src/test/java/io/micrometer/core/instrument/binder/httpcomponents/hc5/MicrometerHttpClientInterceptorTest.java
@bclozel
Contributor

bclozel commented Jun 26, 2023

Superseded by #3800 and merged with cb86bc2. Thanks so much for your contribution @cachescrubber !

@bclozel bclozel closed this Jun 26, 2023
@bclozel bclozel added the superseded An issue that has been superseded by another label Jun 26, 2023
@cachescrubber
Contributor Author

Hi @bclozel I just took the time to review your changes. Nice to see the ObservationRegistry in action without the intermediate ObservationOrTimerCompatibleInstrumentation. Looks much cleaner!

I think you missed one important aspect of my original PR. I introduced support for meterRetries (see #3801 (comment)). Basically, the (Async)HttpClient has a built-in retry capability which needs special treatment in the instrumentation; otherwise the observations are not correct.

@bclozel
Contributor

bclozel commented Jun 27, 2023

Hey @cachescrubber - I left this out on purpose, because the behavior is inconsistent between the "classic" and "async" flavours and I'm afraid there's nothing we can do about it.

The AsyncExecChain.Scope#execCount (the number of executions; > 1 means it's a retry) is only available to the async variant. There is no such thing in ExecChain.Scope, and the execCount information in HttpRequestRetryExec is not made available to the rest of the chain. I initially thought about adding "retry.count"="1" as a key value, but this won't work for the "classic" version.

Also, setting up the micrometer ChainHandler before the retry one does not yield the same behavior in "classic" and "async":

  • in "classic", we only see a single execution from the ObservationExecChainHandler and record only one observation
  • in "async", the ObservationExecChainHandler is called for each retry, creating a new observation each time

Because of this, I don't think we can consistently produce this information as a KeyValue or directly in Observation.Context. If you have suggestions, let me know.

@cachescrubber
Contributor Author

cachescrubber commented Jun 27, 2023

Hi @bclozel - you are right - instrumentation is a bit different between classic and async. This is why I introduced

  • ApacheHttpClientMetricsBinder#instrument(org.apache.hc.client5.http.impl.async.HttpAsyncClientBuilder)
  • ApacheHttpClientMetricsBinder#instrument(org.apache.hc.client5.http.impl.classic.HttpClientBuilder)

This way, the actual placement of the handler inside the handler chain becomes an implementation detail and no longer the responsibility of the user. Also, the (Async)ExecChainHandler implementations were package-private, and not extensible.

I'm pretty sure I covered the retry scenarios with integration tests showing consistent behavior for both async and classic. Please see (in my last version)

  • hc5.ApacheHttpClientMetricsBinderTest#retriesAreMetered_overall
  • hc5.ApacheHttpClientMetricsBinderTest#retriesAreMetered_individually
  • hc5.ApacheHttpClientMetricsBinderTest#testPositiveOutcomeAfterRetry_overall
  • hc5.ApacheHttpClientMetricsBinderTest#testPositiveOutcomeAfterRetry_individual

I actually think the tests would show that especially the async meters are currently not correct.

Besides the placement within the handler chain, the (scope.execCount.get() == 1) check is the crucial part of the MeteringAsyncExecChainHandler in order to get correct results.
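The role of that check can be illustrated with a stand-in for AsyncExecChain.Scope, whose public execCount field is an AtomicInteger in hc5: since the async chain re-enters the handler on every retry, gating on execCount == 1 limits the metering to one sample per transaction. Apart from that field, all names here are hypothetical:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Stand-in mimicking AsyncExecChain.Scope#execCount from hc5.
class ScopeStandIn {
    final AtomicInteger execCount = new AtomicInteger(1);
}

public class Demo {
    static int observationsStarted = 0;

    // In the async chain, this handler is re-entered for every retry.
    static void execute(ScopeStandIn scope) {
        if (scope.execCount.get() == 1) {
            observationsStarted++; // start an observation only for the first attempt
        }
        // ... hand off to the next async handler
    }

    public static void main(String[] args) {
        ScopeStandIn scope = new ScopeStandIn();
        execute(scope);                    // first attempt
        scope.execCount.incrementAndGet();
        execute(scope);                    // retry re-enters the handler
        System.out.println(observationsStarted);
        // prints: 1 -- a single observation despite two executions
    }
}
```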

I think I could submit a new PR starting with a new integration test demonstrating the issue. Did you remove the Wiremock-based tests on purpose? I'm not sure how else to mock the required behaviour.

@bclozel
Contributor

bclozel commented Jun 27, 2023

instrumentation is a bit different between classic and async. This is why I introduced

ApacheHttpClientMetricsBinder#instrument(org.apache.hc.client5.http.impl.async.HttpAsyncClientBuilder)
ApacheHttpClientMetricsBinder#instrument(org.apache.hc.client5.http.impl.classic.HttpClientBuilder)

I'd rather not tie our instrumentation to implementation details of another ExecChain implementation. Given the inconsistencies and the lack of support for cancellation overall, I don't think this would be wise.

This way, the actual placement of the handler inside the handler chain becomes an implementation detail and no longer the responsibility of the user. Also, the (Async)ExecChainHandler implementations were package-private, and not extensible.

I don't think your initial PR supported cancellations, and the implementation was still leaking observations. With this in place, I think the micrometer ExecChain must be configured before the retry one because of internal lifecycle issues with the connection resources. Maybe that's something we should document on the ObservationExecChainHandler (although we're already documenting that it should be configured first).

I actually think the tests would show that especially the async meters are currently not correct.

I think they would be consistent with the behavior of the library: in the async case retries trigger the entire chain, whereas in the classic case this happens internally.

Besides the placement within the handler chain, the (scope.execCount.get() == 1) check is the crucial part of the MeteringAsyncExecChainHandler in order to get correct results.

I see where you're coming from, but this merely hides the fact that several separate requests were made. If I understand correctly, if retries are not metered, there is a single observation that does not cover all retries in the async mode. Depending on how the setup is made, this could be the opposite for the classic case.

With all that in mind, I don't think we should try to compensate for the inconsistencies in the library itself.

@cachescrubber
Contributor Author

I don't think your initial PR supported cancellations, and the implementation was still leaking observations. With this in place, I think the micrometer ExecChain must be configured before the retry one because of internal lifecycle issues with the connection resources.

Yes, my initial PR did not support cancellations. I created another PR reinstating my meterRetries support on top of your much improved HandlerChain implementation. It would be great if you could have a look at #3941. Is it still leaking observations, or does your async cancellation support also kick in here?

Labels
superseded An issue that has been superseded by another
Development

Successfully merging this pull request may close these issues.

Instrumentations for Apache HttpComponents do not meter errors and leak memory