Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stackdriver exception trying to publish metrics during shutdown #4353

Closed
dimovelev opened this issue Nov 13, 2023 · 14 comments · Fixed by #4358
Closed

Stackdriver exception trying to publish metrics during shutdown #4353

dimovelev opened this issue Nov 13, 2023 · 14 comments · Fixed by #4358
Labels
bug A general bug registry: stackdriver A StackDriver Registry related issue
Milestone

Comments

@dimovelev
Copy link
Contributor

dimovelev commented Nov 13, 2023

Describe the bug
Whenever an application is shut down, the stack driver registry tries to publish its metrics and throws an exception with the error UnavailableException: io.grpc.StatusRuntimeException: UNAVAILABLE: Channel shutdown invoked

Environment
Google Cloud Run
Spring Boot 3.1.5
micrometer-registry-stackdriver 1.11.5
spring-cloud-gcp 4.8.2

To Reproduce

  1. Create an empty spring boot application e.g. from https://start.spring.io/
  2. Add the following dependencies
<dependency>
	<groupId>com.google.cloud</groupId>
	<artifactId>spring-cloud-gcp-starter</artifactId>
</dependency>
<dependency>
	<groupId>org.springframework.boot</groupId>
	<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
	<groupId>io.micrometer</groupId>
	<artifactId>micrometer-registry-stackdriver</artifactId>
</dependency>
  1. Enable the micrometer metrics (e.g. application.properties)
spring.cloud.gcp.metrics.enabled=true
  1. Login to GCP so that there are default credentials etc for GCP available
  2. Run the application. It will start and shutdown immediately as we have not added web server
  3. During shutdown you will get this exception stack trace
2023-11-13T10:21:08.451+01:00  INFO 25564 --- [           main] com.example.demo.DemoApplication         : Starting DemoApplication using Java 21 with PID 25564 (D:\src\demo\target\classes started by XXX in D:\src\demo)
2023-11-13T10:21:08.453+01:00  INFO 25564 --- [           main] com.example.demo.DemoApplication         : No active profile set, falling back to 1 default profile: "default"
2023-11-13T10:21:09.076+01:00  INFO 25564 --- [           main] c.g.c.s.core.DefaultCredentialsProvider  : Default credentials provider for user XXX.apps.googleusercontent.com
2023-11-13T10:21:09.076+01:00  INFO 25564 --- [           main] c.g.c.s.core.DefaultCredentialsProvider  : Scopes in use by default credentials: [https://www.googleapis.com/auth/pubsub, https://www.googleapis.com/auth/spanner.admin, https://www.googleapis.com/auth/spanner.data, https://www.googleapis.com/auth/datastore, https://www.googleapis.com/auth/sqlservice.admin, https://www.googleapis.com/auth/devstorage.read_only, https://www.googleapis.com/auth/devstorage.read_write, https://www.googleapis.com/auth/cloudruntimeconfig, https://www.googleapis.com/auth/trace.append, https://www.googleapis.com/auth/cloud-platform, https://www.googleapis.com/auth/cloud-vision, https://www.googleapis.com/auth/bigquery, https://www.googleapis.com/auth/monitoring.write]
2023-11-13T10:21:09.413+01:00  INFO 25564 --- [           main] c.g.c.s.a.c.GcpContextAutoConfiguration  : The default project ID is XXX
2023-11-13T10:21:10.052+01:00  INFO 25564 --- [           main] i.m.c.instrument.push.PushMeterRegistry  : publishing metrics for StackdriverMeterRegistry every 1m
2023-11-13T10:21:10.415+01:00  INFO 25564 --- [           main] com.example.demo.DemoApplication         : Started DemoApplication in 2.263 seconds (process running for 2.877)
2023-11-13T10:21:10.473+01:00  WARN 25564 --- [ionShutdownHook] i.m.s.StackdriverMeterRegistry           : Failed to pre populate verified descriptors for XXX

java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@2f35496c[Not completed, task = java.util.concurrent.Executors$RunnableAdapter@451b64f4[Wrapped task = TrustedListenableFutureTask@285303a0[status=PENDING, info=[task=[running=[NOT STARTED YET], com.google.api.gax.rpc.AttemptCallable@7fa60050]]]]] rejected from java.util.concurrent.ScheduledThreadPoolExecutor@7d04529c[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]
	at java.base/java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2081) ~[na:na]
	at java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:841) ~[na:na]
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:340) ~[na:na]
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:562) ~[na:na]
	at com.google.common.util.concurrent.MoreExecutors$ScheduledListeningDecorator.schedule(MoreExecutors.java:673) ~[guava-32.1.2-jre.jar:na]
	at com.google.api.gax.retrying.ScheduledRetryingExecutor.submit(ScheduledRetryingExecutor.java:116) ~[gax-2.37.0.jar:2.37.0]
	at com.google.api.gax.retrying.CallbackChainRetryingFuture$AttemptCompletionListener.handle(CallbackChainRetryingFuture.java:137) ~[gax-2.37.0.jar:2.37.0]
	at com.google.api.gax.retrying.CallbackChainRetryingFuture$AttemptCompletionListener.run(CallbackChainRetryingFuture.java:117) ~[gax-2.37.0.jar:2.37.0]
	at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31) ~[guava-32.1.2-jre.jar:na]
	at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1286) ~[guava-32.1.2-jre.jar:na]
	at com.google.common.util.concurrent.AbstractFuture.addListener(AbstractFuture.java:760) ~[guava-32.1.2-jre.jar:na]
	at com.google.api.core.AbstractApiFuture.addListener(AbstractApiFuture.java:45) ~[api-common-2.20.0.jar:2.20.0]
	at com.google.api.gax.retrying.CallbackChainRetryingFuture.setAttemptFuture(CallbackChainRetryingFuture.java:93) ~[gax-2.37.0.jar:2.37.0]
	at com.google.api.gax.rpc.AttemptCallable.call(AttemptCallable.java:88) ~[gax-2.37.0.jar:2.37.0]
	at com.google.api.gax.rpc.RetryingCallable.futureCall(RetryingCallable.java:63) ~[gax-2.37.0.jar:2.37.0]
	at com.google.api.gax.rpc.RetryingCallable.futureCall(RetryingCallable.java:41) ~[gax-2.37.0.jar:2.37.0]
	at com.google.api.gax.rpc.PagedCallable.futureCall(PagedCallable.java:62) ~[gax-2.37.0.jar:2.37.0]
	at com.google.api.gax.rpc.UnaryCallable$1.futureCall(UnaryCallable.java:126) ~[gax-2.37.0.jar:2.37.0]
	at com.google.api.gax.rpc.UnaryCallable.futureCall(UnaryCallable.java:87) ~[gax-2.37.0.jar:2.37.0]
	at com.google.api.gax.rpc.UnaryCallable.call(UnaryCallable.java:112) ~[gax-2.37.0.jar:2.37.0]
	at com.google.cloud.monitoring.v3.MetricServiceClient.listMetricDescriptors(MetricServiceClient.java:741) ~[google-cloud-monitoring-3.31.0.jar:3.31.0]
	at io.micrometer.stackdriver.StackdriverMeterRegistry$Batch.prePopulateVerifiedDescriptors(StackdriverMeterRegistry.java:429) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5]
	at io.micrometer.stackdriver.StackdriverMeterRegistry$Batch.createMetricDescriptorIfNecessary(StackdriverMeterRegistry.java:383) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5]
	at io.micrometer.stackdriver.StackdriverMeterRegistry$Batch.createTimeSeries(StackdriverMeterRegistry.java:358) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5]
	at io.micrometer.stackdriver.StackdriverMeterRegistry$Batch.createTimeSeries(StackdriverMeterRegistry.java:324) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5]
	at io.micrometer.stackdriver.StackdriverMeterRegistry.createGauge(StackdriverMeterRegistry.java:227) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5]
	at io.micrometer.stackdriver.StackdriverMeterRegistry.lambda$publish$1(StackdriverMeterRegistry.java:158) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5]
	at io.micrometer.core.instrument.Meter.match(Meter.java:100) ~[micrometer-core-1.11.5.jar:1.11.5]
	at io.micrometer.stackdriver.StackdriverMeterRegistry.lambda$publish$10(StackdriverMeterRegistry.java:158) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5]
	at java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:273) ~[na:na]
	at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1708) ~[na:na]
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509) ~[na:na]
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499) ~[na:na]
	at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921) ~[na:na]
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[na:na]
	at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682) ~[na:na]
	at io.micrometer.stackdriver.StackdriverMeterRegistry.publish(StackdriverMeterRegistry.java:163) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5]
	at io.micrometer.core.instrument.step.StepMeterRegistry.close(StepMeterRegistry.java:146) ~[micrometer-core-1.11.5.jar:1.11.5]
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103) ~[na:na]
	at java.base/java.lang.reflect.Method.invoke(Method.java:580) ~[na:na]
	at org.springframework.beans.factory.support.DisposableBeanAdapter.invokeCustomDestroyMethod(DisposableBeanAdapter.java:303) ~[spring-beans-6.0.13.jar:6.0.13]
	at org.springframework.beans.factory.support.DisposableBeanAdapter.destroy(DisposableBeanAdapter.java:238) ~[spring-beans-6.0.13.jar:6.0.13]
	at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroyBean(DefaultSingletonBeanRegistry.java:587) ~[spring-beans-6.0.13.jar:6.0.13]
	at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingleton(DefaultSingletonBeanRegistry.java:559) ~[spring-beans-6.0.13.jar:6.0.13]
	at org.springframework.beans.factory.support.DefaultListableBeanFactory.destroySingleton(DefaultListableBeanFactory.java:1189) ~[spring-beans-6.0.13.jar:6.0.13]
	at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingletons(DefaultSingletonBeanRegistry.java:520) ~[spring-beans-6.0.13.jar:6.0.13]
	at org.springframework.beans.factory.support.DefaultListableBeanFactory.destroySingletons(DefaultListableBeanFactory.java:1182) ~[spring-beans-6.0.13.jar:6.0.13]
	at org.springframework.context.support.AbstractApplicationContext.destroyBeans(AbstractApplicationContext.java:1118) ~[spring-context-6.0.13.jar:6.0.13]
	at org.springframework.context.support.AbstractApplicationContext.doClose(AbstractApplicationContext.java:1084) ~[spring-context-6.0.13.jar:6.0.13]
	at org.springframework.context.support.AbstractApplicationContext.close(AbstractApplicationContext.java:1035) ~[spring-context-6.0.13.jar:6.0.13]
	at org.springframework.boot.SpringApplicationShutdownHook.closeAndWait(SpringApplicationShutdownHook.java:145) ~[spring-boot-3.1.5.jar:3.1.5]
	at java.base/java.lang.Iterable.forEach(Iterable.java:75) ~[na:na]
	at org.springframework.boot.SpringApplicationShutdownHook.run(SpringApplicationShutdownHook.java:114) ~[spring-boot-3.1.5.jar:3.1.5]
	at java.base/java.lang.Thread.run(Thread.java:1583) ~[na:na]
	Suppressed: com.google.api.gax.rpc.AsyncTaskException: Asynchronous task failed
		at com.google.api.gax.rpc.ApiExceptions.callAndTranslateApiException(ApiExceptions.java:57) ~[gax-2.37.0.jar:2.37.0]
		... 35 common frames omitted

2023-11-13T10:21:10.486+01:00  WARN 25564 --- [ionShutdownHook] i.m.s.StackdriverMeterRegistry           : failed to create metric descriptor in Stackdriver for meter MeterId{name='jvm.buffer.memory.used', tags=[tag(id=direct)]}

com.google.api.gax.rpc.UnavailableException: io.grpc.StatusRuntimeException: UNAVAILABLE: Channel shutdown invoked
	at com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:112) ~[gax-2.37.0.jar:2.37.0]
	at com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:41) ~[gax-2.37.0.jar:2.37.0]
	at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:86) ~[gax-grpc-2.37.0.jar:2.37.0]
	at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:66) ~[gax-grpc-2.37.0.jar:2.37.0]
	at com.google.api.gax.grpc.GrpcExceptionCallable$ExceptionTransformingFuture.onFailure(GrpcExceptionCallable.java:97) ~[gax-grpc-2.37.0.jar:2.37.0]
	at com.google.api.core.ApiFutures$1.onFailure(ApiFutures.java:84) ~[api-common-2.20.0.jar:2.20.0]
	at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1127) ~[guava-32.1.2-jre.jar:na]
	at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31) ~[guava-32.1.2-jre.jar:na]
	at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1286) ~[guava-32.1.2-jre.jar:na]
	at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1055) ~[guava-32.1.2-jre.jar:na]
	at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:807) ~[guava-32.1.2-jre.jar:na]
	at io.grpc.stub.ClientCalls$GrpcFuture.setException(ClientCalls.java:568) ~[grpc-stub-1.59.0.jar:1.59.0]
	at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:538) ~[grpc-stub-1.59.0.jar:1.59.0]
	at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39) ~[grpc-api-1.59.0.jar:1.59.0]
	at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23) ~[grpc-api-1.59.0.jar:1.59.0]
	at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40) ~[grpc-api-1.59.0.jar:1.59.0]
	at com.google.api.gax.grpc.ChannelPool$ReleasingClientCall$1.onClose(ChannelPool.java:570) ~[gax-grpc-2.37.0.jar:2.37.0]
	at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:574) ~[grpc-core-1.59.0.jar:1.59.0]
	at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:72) ~[grpc-core-1.59.0.jar:1.59.0]
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:742) ~[grpc-core-1.59.0.jar:1.59.0]
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:723) ~[grpc-core-1.59.0.jar:1.59.0]
	at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) ~[grpc-core-1.59.0.jar:1.59.0]
	at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133) ~[grpc-core-1.59.0.jar:1.59.0]
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[na:na]
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[na:na]
	at java.base/java.lang.Thread.run(Thread.java:1583) ~[na:na]
	Suppressed: com.google.api.gax.rpc.AsyncTaskException: Asynchronous task failed
		at com.google.api.gax.rpc.ApiExceptions.callAndTranslateApiException(ApiExceptions.java:57) ~[gax-2.37.0.jar:2.37.0]
		at com.google.api.gax.rpc.UnaryCallable.call(UnaryCallable.java:112) ~[gax-2.37.0.jar:2.37.0]
		at com.google.cloud.monitoring.v3.MetricServiceClient.createMetricDescriptor(MetricServiceClient.java:1127) ~[google-cloud-monitoring-3.31.0.jar:3.31.0]
		at io.micrometer.stackdriver.StackdriverMeterRegistry$Batch.createMetricDescriptorIfNecessary(StackdriverMeterRegistry.java:405) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5]
		at io.micrometer.stackdriver.StackdriverMeterRegistry$Batch.createTimeSeries(StackdriverMeterRegistry.java:358) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5]
		at io.micrometer.stackdriver.StackdriverMeterRegistry$Batch.createTimeSeries(StackdriverMeterRegistry.java:324) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5]
		at io.micrometer.stackdriver.StackdriverMeterRegistry.createGauge(StackdriverMeterRegistry.java:227) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5]
		at io.micrometer.stackdriver.StackdriverMeterRegistry.lambda$publish$1(StackdriverMeterRegistry.java:158) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5]
		at io.micrometer.core.instrument.Meter.match(Meter.java:100) ~[micrometer-core-1.11.5.jar:1.11.5]
		at io.micrometer.stackdriver.StackdriverMeterRegistry.lambda$publish$10(StackdriverMeterRegistry.java:158) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5]
		at java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:273) ~[na:na]
		at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1708) ~[na:na]
		at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509) ~[na:na]
		at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499) ~[na:na]
		at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921) ~[na:na]
		at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[na:na]
		at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682) ~[na:na]
		at io.micrometer.stackdriver.StackdriverMeterRegistry.publish(StackdriverMeterRegistry.java:163) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5]
		at io.micrometer.core.instrument.step.StepMeterRegistry.close(StepMeterRegistry.java:146) ~[micrometer-core-1.11.5.jar:1.11.5]
		at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103) ~[na:na]
		at java.base/java.lang.reflect.Method.invoke(Method.java:580) ~[na:na]
		at org.springframework.beans.factory.support.DisposableBeanAdapter.invokeCustomDestroyMethod(DisposableBeanAdapter.java:303) ~[spring-beans-6.0.13.jar:6.0.13]
		at org.springframework.beans.factory.support.DisposableBeanAdapter.destroy(DisposableBeanAdapter.java:238) ~[spring-beans-6.0.13.jar:6.0.13]
		at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroyBean(DefaultSingletonBeanRegistry.java:587) ~[spring-beans-6.0.13.jar:6.0.13]
		at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingleton(DefaultSingletonBeanRegistry.java:559) ~[spring-beans-6.0.13.jar:6.0.13]
		at org.springframework.beans.factory.support.DefaultListableBeanFactory.destroySingleton(DefaultListableBeanFactory.java:1189) ~[spring-beans-6.0.13.jar:6.0.13]
		at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingletons(DefaultSingletonBeanRegistry.java:520) ~[spring-beans-6.0.13.jar:6.0.13]
		at org.springframework.beans.factory.support.DefaultListableBeanFactory.destroySingletons(DefaultListableBeanFactory.java:1182) ~[spring-beans-6.0.13.jar:6.0.13]
		at org.springframework.context.support.AbstractApplicationContext.destroyBeans(AbstractApplicationContext.java:1118) ~[spring-context-6.0.13.jar:6.0.13]
		at org.springframework.context.support.AbstractApplicationContext.doClose(AbstractApplicationContext.java:1084) ~[spring-context-6.0.13.jar:6.0.13]
		at org.springframework.context.support.AbstractApplicationContext.close(AbstractApplicationContext.java:1035) ~[spring-context-6.0.13.jar:6.0.13]
		at org.springframework.boot.SpringApplicationShutdownHook.closeAndWait(SpringApplicationShutdownHook.java:145) ~[spring-boot-3.1.5.jar:3.1.5]
		at java.base/java.lang.Iterable.forEach(Iterable.java:75) ~[na:na]
		at org.springframework.boot.SpringApplicationShutdownHook.run(SpringApplicationShutdownHook.java:114) ~[spring-boot-3.1.5.jar:3.1.5]
		... 1 common frames omitted
Caused by: io.grpc.StatusRuntimeException: UNAVAILABLE: Channel shutdown invoked
	at io.grpc.Status.asRuntimeException(Status.java:537) ~[grpc-api-1.59.0.jar:1.59.0]
	... 14 common frames omitted

Expected behavior

  • No exception during shutdown
  • The metrics are successfully pushed during shutdown

Additional context
I have not debugged through the code. However, the io.micrometer.core.instrument.push.PushMeterRegistry#close seems to call stop(), publish(), close() when the context is closed. The io.micrometer.stackdriver.StackdriverMeterRegistry#stop calls client.shutdownNow() which I guess is not correct. Maybe this should be rather done in close() and not in stop(). Just a wild guess.

@dimovelev
Copy link
Contributor Author

FYI - locally I changed the two methods and do not get this exception anymore (I get a different one - something with header size too large but it seems to be a different issue):

    @Override
    public void stop() { // or just remove me
        super.stop();
    }

    @Override
    public void close() {
        try {
            super.close();
        } finally {
            if (client != null) {
                client.close();
            }
        }
    }

@jonatan-ivanov
Copy link
Member

Is this happening all the time or occasionally? If the latter, how frequently?
We changed a few things in 1.11.6 (and 1.12.0), would you mind checking if it resolves the issue?

@jonatan-ivanov jonatan-ivanov added waiting for feedback We need additional information before we can continue and removed waiting-for-triage labels Nov 17, 2023
@dimovelev
Copy link
Contributor Author

@jonatan-ivanov this happens only when the application is being shut down - e.g. a new revision is being deployed and all old revision instances are being shut down. The bigger impact is actually that it also happens in cloudrun jobs when they finish and as a result the metrics are not pushed when they finish.

@dimovelev
Copy link
Contributor Author

@jonatan-ivanov i changed the micrometer.version maven property in the reproducer to use a different micrometer version (and verified that it is being used) and it still fails both with 1.11.6 and with 1.12.0

@mieseprem
Copy link

Hey there, it seems we are facing the same issue. To get more ideas of what changes lead to the observed behaviour I will write this comment.

We started to face this issue when we migrated from Springboot 2.7 to 3.1.
Our workaround was (still is) to pin Micrometer to version 1.10.6 (all afterwards - including 1.10.7) shows all the error messages you described above.
However, as of Springboot 3.2 the (ugly) Micrometer version pinning isn't working anyway. Thats why I was googleing around and stranded here.

def micrometer_version_pin = '1.10.6'
dependencies {
    // [...]

    implementation "io.micrometer:micrometer-commons:${micrometer_version_pin}!!"
    implementation "io.micrometer:micrometer-core:${micrometer_version_pin}!!"
    implementation "io.micrometer:micrometer-observation:${micrometer_version_pin}!!"
    implementation "io.micrometer:micrometer-registry-stackdriver:${micrometer_version_pin}!!"

   // [...]
}

@shakuzen shakuzen added bug A general bug registry: stackdriver A StackDriver Registry related issue and removed waiting for feedback We need additional information before we can continue labels Dec 11, 2023
@shakuzen shakuzen added this to the 1.10.x milestone Dec 11, 2023
@shakuzen
Copy link
Member

Our workaround was (still is) to pin Micrometer to version 1.10.6 (all afterwards - including 1.10.7) shows all the error messages you described above.

Thank you for the details. That sounds like it was #3759, included in 1.10.7, that caused this to start happening, which makes sense. The proposed fix sounds reasonable. I'd like to see if there is any way we can effectively test the behavior locally with a stub of the stackdriver metrics service. That could help us avoid reintroducing this issue in the future and make sure it stays fixed.

@dimovelev
Copy link
Contributor Author

@shakuzen I could not find an emulator or anything remotely helpful to integration test it :(

@marcingrzejszczak
Copy link
Contributor

So for StackdriverMeterRegistry, wouldn't a change from

@Override
    public void stop() {
        if (client != null)
            client.shutdownNow();
        super.stop();
    }

to

@Override
    public void stop() {
        if (client != null && !client.isShutdown())
            client.shutdownNow();
        super.stop();
    }

fix the problem? I understand that we're closing an already closed client? We would need to check if the metrics are still being sent on stop. Still no luck with providing some stub for Stackdriver metrics server?

@dimovelev
Copy link
Contributor Author

dimovelev commented Jan 8, 2024

@marcingrzejszczak would that not mean that no metrics will be pushed upon shutdown? We do not want to get the exception, but we also want to push the metrics one last time upon shutdown.

@marcingrzejszczak
Copy link
Contributor

From my understanding the suggested change would shut down the client only if it hasn't been already shut down but maybe I'm missing sth.

@marcingrzejszczak marcingrzejszczak added the waiting for feedback We need additional information before we can continue label Jan 10, 2024
@mieseprem
Copy link

For me it looks like the most fundamental change is to move shutdown of the client from stop() to close() method. I don't know in which phase of application lifecycle the phases take place. But maybe this already does the trick.

@shakuzen shakuzen modified the milestones: 1.10.x, 1.10.14 Jan 31, 2024
@shakuzen shakuzen removed the waiting for feedback We need additional information before we can continue label Jan 31, 2024
@shakuzen
Copy link
Member

shakuzen commented Feb 2, 2024

Could folks affected by this issue please try with the latest snapshots to make sure the fix is working and not causing any other issues? Specifically, it's in 1.10.14-SNAPSHOT, 1.11.9-SNAPSHOT, and 1.12.3-SNAPSHOT available from the snapshot repo.

@sebastiankonieczek
Copy link

1.12.3-SNAPSHOT -> Fix seems to work properly, metrics are being published and client is shutdown gracefully afterwards:

2024-02-06T10:55:07.340+01:00 DEBUG 19503 --- [tp1973538135-41] i.m.s.StackdriverMeterRegistry           : successfully sent 9 TimeSeries to Stackdriver
2024-02-06T10:55:07.341+01:00 DEBUG 19503 --- [tp1973538135-41] com.google.api.gax.grpc.ChannelPool      : Initiating immediate shutdown due to explicit request
2024-02-06T10:55:07.347+01:00 DEBUG 19503 --- [tp1973538135-41] com.google.api.gax.grpc.ChannelPool      : Initiating graceful shutdown due to explicit request

No other issue recognized so far.

@dimovelev
Copy link
Contributor Author

I was a bit slow to respond - we have this fix already rolled out to multiple applications and it works. Thank you for merging it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug A general bug registry: stackdriver A StackDriver Registry related issue
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants