Cannot start a span with specified traceId #4671

Open
dan-cooke opened this issue May 2, 2024 · 1 comment
@dan-cooke

What happened?

Steps to Reproduce

I have a Python service and a Node service. They communicate via Redis pub/sub.

  1. The Node service starts a trace and sends it over Redis like so:
  @Span('DocumentGenerationService.generate')
  async generate(id: number, orgId: string) {
    const span = this.traceService.getSpan()
    const traceId = span.spanContext().traceId;
    console.log("1. Sending TraceID:", traceId);

    this.redis
      .emit('example', { traceId })
      .subscribe({
        error: (err) => console.error(err),
      });
  }

Note: the decorator is from nestjs-otel but it seemingly just starts a span using the default tracer

  2. The Python service listens on this channel and sends it back:
    def handle(self, message: str):
        data = json.loads(message)["data"]
        trace_id_string = data["traceId"]

        print(f"2. Received  traceId: {trace_id_string}")

        span_id = id.generate_span_id()

        span_context = SpanContext(
            int(trace_id_string[:16], 16), span_id=span_id, is_remote=True
        )
        ctx = trace.set_span_in_context(trace.NonRecordingSpan(span_context))

        with tracer.start_as_current_span("python_example", context=ctx):
            # --- leaving the rest out as it's not important
  3. Now my Node service listens for this response and logs the trace ID:
  async example(payload: DocgenResponsePayload) {
    console.log('3. node received traceId: ', payload.traceId);
    const traceIdKey = createContextKey('traceId');
    const doc = await this.documentService.findOne(documentId, orgId);

    const ctx = trace.setSpan(
      context.active().setValue(traceIdKey, payload.traceId),
      this.traceService.getSpan()
    );
    const tracer = trace.getTracer('default');
    const span = tracer.startSpan('onAnswerRulePromptsComplete', {}, ctx);
    console.log('Span context traceId ', span.spanContext().traceId);
}

It's worth noting that I have tried maybe 10 different variations here, using all sorts of boilerplate like context.with, startActiveSpan, etc., and nothing works.

Expected Result

All the logs should print the same traceId

Actual Result

Logs 1, 2, and 3 print the same traceId, so the spans are working on the Python side and I have correctly received the same traceId back.

But the fourth log ('Span context traceId') prints a new traceId, so when I start a new span on the Node service, it does not use the traceId passed in the context.

Additional Details

I have been working on this on and off for over two weeks and cannot get my Node service to work. The documentation for this is extremely confusing, and I'm having to dig into ancient GitHub issues to find any examples.

I realise most people use a Propagator, but I don't think that will work for Redis pub/sub.

OpenTelemetry Setup Code

No response

package.json

No response

Relevant log output

No response

@dan-cooke dan-cooke added bug Something isn't working triage labels May 2, 2024
@dyladan
Member

dyladan commented May 15, 2024

This is the incorrect line: context.active().setValue(traceIdKey, payload.traceId)

The trace id alone is insufficient to create a parent context, which requires a trace id and a span id. Interacting with context like this directly using keys is not recommended and is likely to cause issues like what you are experiencing here. Yes, you should be using a propagator for this.
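To make the "trace id plus span id" point concrete, here is a minimal, dependency-free sketch of the W3C Trace Context `traceparent` value, which is the kind of string a propagator would place in the message payload. The names `RemoteParent`, `formatTraceparent`, and `parseTraceparent` are hypothetical helpers made up for illustration and are not part of any OpenTelemetry package; only the `00-<trace id>-<span id>-<flags>` layout comes from the W3C specification.

```typescript
// Hypothetical illustration only: these helpers are NOT OpenTelemetry APIs.
// They show what a parent reference must carry across a message bus.
interface RemoteParent {
  traceId: string;    // 32 lowercase hex characters (128 bits)
  spanId: string;     // 16 lowercase hex characters (64 bits)
  traceFlags: number; // 1 means "sampled"
}

// Version "00" layout from the W3C Trace Context specification.
const TRACEPARENT_RE = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Serialize the parent so it can travel inside a Redis message payload.
function formatTraceparent(parent: RemoteParent): string {
  const flags = ('0' + parent.traceFlags.toString(16)).slice(-2);
  return '00-' + parent.traceId + '-' + parent.spanId + '-' + flags;
}

// Parse it back on the receiving side; returns undefined if malformed.
function parseTraceparent(header: string): RemoteParent | undefined {
  const match = TRACEPARENT_RE.exec(header);
  if (!match) {
    return undefined;
  }
  const traceId = match[1];
  const spanId = match[2];
  // All-zero ids are invalid according to the specification.
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) {
    return undefined;
  }
  return { traceId, spanId, traceFlags: parseInt(match[3], 16) };
}

// Example ids taken from the W3C specification's sample header.
const wire = formatTraceparent({
  traceId: '4bf92f3577b34da6a3ce929d0e0e4736',
  spanId: '00f067aa0ba902b7',
  traceFlags: 1,
});
const parsed = parseTraceparent(wire);

// A bare trace id (what the original payload carried) is not a valid parent.
const traceIdOnly = parseTraceparent('4bf92f3577b34da6a3ce929d0e0e4736');
```

In practice you would probably not hand-roll this. Assuming a W3C propagator is registered (it typically is with the default Node SDK setup), `propagation.inject(context.active(), carrier)` from `@opentelemetry/api` fills a plain object that can be sent as part of the Redis message, and `propagation.extract(context.active(), carrier)` on the receiving side returns a context that can be passed as the third argument to `tracer.startSpan`. The Python package offers the equivalent `inject` and `extract` functions in `opentelemetry.propagate`.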

@dyladan dyladan added question User is asking a question not related to a new feature or bug and removed bug Something isn't working triage labels May 15, 2024
@dyladan dyladan self-assigned this May 15, 2024