Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: TracerProvider can be dropped unknowingly #1625

Open
SZenglein opened this issue Mar 15, 2024 · 5 comments
Open

[Bug]: TracerProvider can be dropped unknowingly #1625

SZenglein opened this issue Mar 15, 2024 · 5 comments
Assignees
Labels
enhancement New feature or request triage:accepted Has been triaged and accepted.

Comments

@SZenglein
Copy link

SZenglein commented Mar 15, 2024

What happened?

Consider this example code:
(I am using tracing_opentelemetry, but it should be reproducible without it)

let telemetry =  {
    // Create a new OpenTelemetry trace pipeline that prints to stdout
    let provider = TracerProvider::builder()
        .with_simple_exporter(
            opentelemetry_otlp::new_exporter()
                .tonic()
                .with_endpoint("http:localhost:4317")
                .with_timeout(Duration::from_secs(3))
                .build_span_exporter()?,
        )
        .with_config(opentelemetry_sdk::trace::Config::default().with_resource(
            Resource::new(vec![KeyValue::new(
                opentelemetry_semantic_conventions::resource::SERVICE_NAME,
                "service-api",
            )]),
        ))
        .build();
    
    let tracer = provider.tracer("service-api");
    
    // Create and return a tracing layer with the configured tracer
    tracing_opentelemetry::layer().with_tracer(tracer)
};

Since the provider is not used afterwards, it is very easy to accidentally drop it when executing inside a function or conditional block.
There is no compiler warning or any documentation note regarding this behaviour or that it must be kept alive manually. This results in very hard to debug problem, because code that compiles fine without a warning can stop working just by moving a value into a scope.

I suspect this is because TracerProvider creates the Tracer by downgrading the Arc to a Weak.

IMHO this should at the very least be documented on the TracerProvider struct, maybe a warning can be added to the Drop Implementation, too.

API Version

0.15

SDK Version

0.22.1

What Exporters are you seeing the problem on?

No response

Relevant log output

No response

@SZenglein SZenglein added bug Something isn't working triage:todo Needs to be traiged. labels Mar 15, 2024
@TommyCpp TommyCpp added documentation/examples Improvements or additions to documentation or examples triage:accepted Has been triaged and accepted. and removed bug Something isn't working triage:todo Needs to be traiged. labels Mar 15, 2024
@TommyCpp
Copy link
Contributor

Yeah it's a known behavior. Users need to manually hold reference to provider or pass the provider to global tracer provider.

Agree on we should update documentation and possible looking into compiler warnings

@lalitb
Copy link
Member

lalitb commented Mar 15, 2024

Will it make sense for TraceProvider to have similar behavior as LoggerProvider (#1455) - Let Tracer own TracerProvider (with shared referenece to TracerProviderInner), which would ensure that it won't be dropped till all instances of Tracers are dropped.

@TommyCpp TommyCpp added enhancement New feature or request and removed documentation/examples Improvements or additions to documentation or examples labels Mar 15, 2024
@TommyCpp
Copy link
Contributor

Ah you are right. Let me give it a try

TommyCpp added a commit to TommyCpp/opentelemetry-rust that referenced this issue Mar 18, 2024
@TommyCpp
Copy link
Contributor

TommyCpp commented Mar 18, 2024

Like pointed out in #1455 after I did some POC I realized the drawback of curent implementation is we will no longer shutdown tracer/logger processor when dropping TracerProvider.

Considering the following unit test that validates we will not send out spans after tracer provider shuts down

    #[test]
    fn test_span_exported_data() {
        let provider = crate::trace::TracerProvider::builder()
            .with_simple_exporter(NoopSpanExporter::new())
            .build();
        let tracer = provider.tracer("test");

        let mut span = tracer.start("test_span");
        span.add_event("test_event", vec![]);
        span.set_status(Status::error(""));

        let exported_data = span.exported_data();
        assert!(exported_data.is_some());

        drop(provider);
        let dropped_span = tracer.start("span_with_dropped_provider");
        // return none if the provider has already been dropped
        // but this won't happen if the tracer still holds a reference to tracer inner
        assert!(dropped_span.exported_data().is_none());
    }

I think there are two questions we need to answer:

  1. Do we need to shutdown SpanProcssors when TracerProvider drops?
  2. How do we make sure the span exporting stops once the SpanProcessor shuts down?
    • LoggerProvider has try_shutdown which requries users to manualy track down all cloned LoggerProvider, given every Logger has a reference to LoggerProvider this means users is responsible for clean up Logger before LoggerProvider
    • Is it possible for us to keep a Weak reference to every Tracer created from a givenTracerProviderInner. Upon shutting down, the TracerProviderInner notify all Tracer and ask them to remove their reference to TracerProviderInner.

@cijothomas
Copy link
Member

https://github.com/open-telemetry/opentelemetry-rust/blob/main/opentelemetry/src/trace/mod.rs#L20-L31 - The recently added doc updates should cover this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request triage:accepted Has been triaged and accepted.
Projects
None yet
Development

No branches or pull requests

4 participants