Add ability for SpanProcessor to mutate spans on end #4024
Co-authored-by: jack-berg <34418638+jack-berg@users.noreply.github.com>
What is the benefit of To my understanding both changes are breaking for |
One benefit is that while order of span processor registration matters for
We're working out the details of that in #4030, but I disagree that adding a new method to an SDK plugin interface is a breaking change. Languages vary in how they expose these plugin interfaces and in the language features available for evolving interfaces, but it should be possible for all SDKs to accommodate this type of change, albeit with some creativity in some cases. For example, it's simple in Java to add a new default method to an interface which existing implementers don't need to implement. This doesn't exist in some languages like Go, but there you can offer a new |
@JonasKunz I am curious to know whether you think it is essential for the on-end handler being discussed in this PR to make changes that are visible to other processors. In my opinion, it would be better if the changes made on-end applied only to the exporter associated with the processor making the change. In order to satisfy the discussion in #4010, I feel that we should not call the object passed to the on-end handler a "read write span". The span object which is newly live when passed to the OnStart handler is a real span ("writable") and it is a span data object ("readable"). I do not think that processors should receive a real span on end (with which they could use real span APIs, e.g., place the span's context back into a context and invoke a new child span). I think it makes sense for span processors to receive span data objects on-end (i.e., not real spans, but data objects). If at that point a processor wants to modify the span, they can modify the span data object which they pass to their own exporter, but their changes will not impact other processors/exporters.
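To make the "changes are private to the processor's own exporter" idea concrete, here is a minimal Python sketch. All names (`SpanData`, `ListExporter`, `RedactingProcessor`, `PassThroughProcessor`) are hypothetical and are not part of any OpenTelemetry SDK; the sketch only assumes that each processor receives an immutable span data object on end and hands a modified copy to its own exporter.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class SpanData:
    """Hypothetical read-only snapshot of a finished span."""
    name: str
    attributes: dict = field(default_factory=dict)


class ListExporter:
    """Hypothetical exporter that just collects what it is given."""
    def __init__(self):
        self.exported = []

    def export(self, span_data):
        self.exported.append(span_data)


class RedactingProcessor:
    """Drops one attribute, but only in the copy sent to its own exporter."""
    def __init__(self, exporter):
        self._exporter = exporter

    def on_end(self, span_data):
        kept = {k: v for k, v in span_data.attributes.items() if k != "user.email"}
        # replace() builds a new SpanData; the shared original is untouched.
        self._exporter.export(replace(span_data, attributes=kept))


class PassThroughProcessor:
    """Forwards the span data to its exporter unchanged."""
    def __init__(self, exporter):
        self._exporter = exporter

    def on_end(self, span_data):
        self._exporter.export(span_data)


redacted_sink, full_sink = ListExporter(), ListExporter()
processors = [RedactingProcessor(redacted_sink), PassThroughProcessor(full_sink)]
original = SpanData("GET /users", {"user.email": "a@example.com", "http.status": 200})
for processor in processors:
    processor.on_end(original)
```

With this shape, the second processor still sees `user.email` even though the first one removed it for its own exporter, which is the isolation property argued for above.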
I agree that, in the spirit of keeping consistency around the expected behavior of the SDK, it does make sense to follow a similar approach for whether to make processors affect other processors vs only their associated exporter. And I think #4010 has good points on it, to be honest, the existing behavior is quite strange so I think it's great that it's being revisited there. However, I'm wondering if this PR is the right place to have this discussion, as it seems to affect SpanProcessors in a broader way, given that, should we decide in #4010 to switch to processors with independent pipelines, wouldn't it mean that (for SpanProcessors) we should address My understanding is that this PR is focusing on adding some extra functionality while keeping consistency with the existing patterns used in the rest of |
I think there are use cases for both variants
I guess something like a processing pipeline would be more flexible to allow putting one processor after the other. There are already SpanProcessors which mutate in |
@jmacd In my opinion, this would remove the benefits this PR is intending to add: Making it easier for users to enrich spans in span processors before they are exported. There is already a workaround for doing this with the existing So the problem I see with making changes in My take here is that there are two viable classes of solutions for allowing easy enrichment of spans via the SDK:
This PR specs out solution category 1. We can definitely revisit this decision! However, if we do go for a pipelining approach, I'd propose to rather enhance the However, if we go for the pipeline approach, I'd propose to separate the construction of processors from the assembly of the pipeline by either:
The reason I'm suggesting this is that if we leave the pipeline assembly to the user via decoration/wrapping, the pipeline structure becomes a black box for the SDK. If instead the pipeline assembly is managed by the SDK, it has more control over it: e.g. it could insert logging or telemetry between the pipeline stages if necessary. It also plays much better with autoconfiguration.
I'm not sure whether |
Co-authored-by: jack-berg <34418638+jack-berg@users.noreply.github.com>
Thank you @JonasKunz. I understand the motivation and agree with this change. Also, you've helped me understand how the change I'm looking for can be made orthogonally.
I added suggestions because I think `OnEnd()` becomes less-well specified without adding more text, given the new and similarly-named `OnEnding()`. Presently, nothing is said about the execution model for `OnEnd()` --for example whether or not the first call to `OnEnd()` is required to return before the second processor's `OnEnd()` is called.
Reading into the `OnEnd()` text below your changes here, I find that lines 613/614 (in your change, lines 592/593 prior) are problematic, given the question posed in #4010. The `OnEnd()` callback (which I'd like to be named `OnExport()`) should not receive a mutable Span, so "modifying is not allowed" is redundant. For us to add pipeline capabilities, the `OnEnd()` callback should be allowed to change the data, which is distinct from modifying it -- only its own exporter would see the changes.
Co-authored-by: Joshua MacDonald <jmacd@users.noreply.github.com>
@jmacd so if my understanding is correct, you are planning to add some kind of pipelining / chaining capabilities to I wonder if the addition of such pipelines would render the The main differences in terms of capabilities between
So what do you think? Should we move ahead with this PR and get it merged or should we wait and check the overlap with the pipeline approach you are coming up with? |
@JonasKunz I assumed that you mean for
The requirements, as written, I think ensure that callers are not permitted to use the span reference after
I'd probably tack on "SHOULD be reflected in it until End() is called on the Span". For the last two bullets, I don't see a problem. I expect the OnEnd() callbacks all to execute before the export begins (a.k.a. OnEnd()). Nothing is stated about when the OnEnd happens, but after your change it should be clear that pipelining effects (whatever they are) begin with the OnEnd call. I think OnEnding() makes sense the way you have it -- and there are real use cases so we don't need to block for future design work. |
Discussed this in the context of #4062 |
@pellared Would you say that if #4030 is accepted, this PR should be approached differently? (To me, it seems like the answer is yes.) I've speculated about what the solution might look like in #4062 (review), which is briefly for SDKs to support a "FanoutProcessor" that gives users control over whether mutations are private to their export pipeline and/or visible to the next processor. |
@@ -580,6 +582,25 @@ exceptions.

**Returns:** `Void`

#### OnEnding
I think we should mark this method as optional, as the spec is already stable and existing SDKs may not want to immediately support this (even when this extension point becomes stable).
Fixes #1089.
In addition to the comments on the issue, this was discussed in the spec SIG Meeting on 2024/23/04:
- … `SpanProcessor`s due to better conceptual fit
- … `SpanProcessor`s in a chaining fashion during the initial SDK spec design and it was actively decided against it. However, no one could recall the reason why.
Based on this input, I decided to not move the chaining-based solution forward and stay with the original proposal of adding a new callback to be invoked just before a span is ended.
I also decided to name the new callback `OnEnding` instead of `BeforeEnd` as suggested in this comment. The name `OnEnding` is more precise about when the callback is invoked.
A big discussion point in the SIG Meeting on 2024/23/04 also was the problem of evolving SDK plugin interfaces without breaking backwards compatibility. This PR contains a proposal to clarify how this should be handled: If the language allows it, interfaces should be directly extended. If that is not possible, implementations will need to introduce new interfaces and accept them in addition to the existing ones for backwards compatibility. I feel like this allows every language to implement changes to interfaces in the best way possible. Of course, changes to interfaces should still be kept to a necessary minimum.
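The two ways of evolving a plugin interface described above can be sketched in Python as follows. This is only an illustration under assumptions: `Span`, `SpanProcessor`, `OldProcessor`, `LegacyProcessor`, `SupportsOnEnding`, `EnrichingProcessor`, and `notify_ending` are hypothetical names, not the API of any OpenTelemetry SDK.

```python
from typing import Protocol, runtime_checkable


class Span:
    """Hypothetical stand-in for an SDK span."""
    def __init__(self, name):
        self.name = name
        self.attributes = {}


# Option 1: extend the interface directly. The new callback has a no-op
# default body, so processors written before the change keep working.
class SpanProcessor:
    def on_start(self, span): pass
    def on_ending(self, span): pass  # newly added, default no-op
    def on_end(self, span): pass


class OldProcessor(SpanProcessor):
    """Written before on_ending existed; overrides only on_end."""
    def on_end(self, span):
        print("ended", span.name)


# Option 2: the existing interface cannot be extended, so a separate
# interface is introduced and accepted in addition to the existing one.
class LegacyProcessor:
    """Implements only the original callbacks."""
    def on_start(self, span): pass
    def on_end(self, span): pass


@runtime_checkable
class SupportsOnEnding(Protocol):
    def on_ending(self, span): ...


class EnrichingProcessor(LegacyProcessor):
    """Opts into the new interface by providing on_ending."""
    def on_ending(self, span):
        span.attributes["enriched"] = "true"


def notify_ending(processors, span):
    """Invoke on_ending only on processors that implement it."""
    for processor in processors:
        if isinstance(processor, SupportsOnEnding):
            processor.on_ending(span)
```

In both variants an existing processor needs no code changes; the difference is whether the SDK relies on an inherited default or on a capability check at call time.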
I also wasn't sure whether this change warrants an addition to the spec-compliance-matrix, so I've left it out for now.
Please leave a comment if you think this is required, then I'll add it.
Changes
- Adds a new `OnEnding` callback to `SpanProcessor`
- Adds a paragraph clarifying how languages should deal with interface extensions
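As a rough illustration of where the new callback sits in a span's lifecycle, the following Python sketch runs every processor's `on_ending` while the span is still writable, then freezes the span, then runs `on_end`. The classes are hypothetical and not taken from any SDK; raising on a write after end is an assumption made for the sketch (a real SDK might ignore such writes instead).

```python
class Span:
    """Hypothetical span that becomes read-only once ended."""
    def __init__(self, name, processors):
        self.name = name
        self.attributes = {}
        self._processors = processors
        self._ended = False

    def set_attribute(self, key, value):
        if self._ended:
            # Assumption for this sketch: late writes are rejected loudly.
            raise RuntimeError("span already ended")
        self.attributes[key] = value

    def end(self):
        if self._ended:
            return
        # 1. on_ending: invoked just before the span ends, still writable.
        for processor in self._processors:
            processor.on_ending(self)
        # 2. The span is now ended and no longer accepts changes.
        self._ended = True
        # 3. on_end: processors see the final, read-only state.
        for processor in self._processors:
            processor.on_end(self)


class Enricher:
    """Adds an attribute at the last moment the span is writable."""
    def on_ending(self, span):
        span.set_attribute("enriched", True)

    def on_end(self, span):
        pass


class Recorder:
    """Records the attributes it observes in on_end."""
    def __init__(self):
        self.seen = []

    def on_ending(self, span):
        pass

    def on_end(self, span):
        self.seen.append(dict(span.attributes))


recorder = Recorder()
span = Span("checkout", [recorder, Enricher()])
span.end()
```

Because all `on_ending` callbacks complete before any `on_end` callback starts, the recorder observes the enrichment even though it was registered before the enricher.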
- Related issues: Add BeforeEnd to have a callback where the span is still writeable #1089
- Related OTEP(s) #
- Links to the prototypes (when adding or changing features)
  - … `beforeEnd` in that PoC)
- `CHANGELOG.md` file updated for non-trivial changes
- `spec-compliance-matrix.md` updated if necessary