Add context to metrics reporting of buffer-full events #1566
Small suggestion. Otherwise LGTM.
@@ -82,7 +82,7 @@ def on_finish(span)
   n = spans.size + 1 - max_queue_size
   if n.positive?
     spans.shift(n)
-    report_dropped_spans(n, reason: 'buffer-full')
+    report_dropped_spans(n, reason: 'buffer-full', context: 'on_finish')
Have you considered using semconv `code.*` for this?
Stupendous idea! 03fb66c
Co-authored-by: Francis Bogsanyi <francis.bogsanyi@shopify.com>
@@ -82,7 +82,7 @@ def on_finish(span)
   n = spans.size + 1 - max_queue_size
   if n.positive?
     spans.shift(n)
-    report_dropped_spans(n, reason: 'buffer-full')
+    report_dropped_spans(n, 'reason' => 'buffer-full', OpenTelemetry::SemanticConventions::Trace::CODE_FUNCTION => 'on_finish')
Need to use hash rocket syntax to have constant as key (afaik)
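To illustrate the point about hash rockets: with the `key: value` shorthand, Ruby always produces a Symbol key, so a constant written on the left is not evaluated. A minimal sketch, where `CODE_FUNCTION` is a local stand-in for the semantic-conventions constant (its value `'code.function'` is assumed from the label used later in this thread):

```ruby
# Local stand-in for OpenTelemetry::SemanticConventions::Trace::CODE_FUNCTION
# (assumed value; not loaded from the real gem here).
CODE_FUNCTION = 'code.function'

# Shorthand syntax: the key is the literal Symbol :CODE_FUNCTION,
# not the value held by the constant.
shorthand = { CODE_FUNCTION: 'on_finish' }

# Hash rocket: the left-hand side is evaluated as an expression,
# so the key is the String 'code.function'.
rocket = { CODE_FUNCTION => 'on_finish' }
```

The same applies at call sites: `foo(CODE_FUNCTION => 'x')` passes the constant's value as the key, while `foo(CODE_FUNCTION: 'x')` passes a Symbol.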
@@ -204,8 +204,8 @@ def report_result(result_code, batch)
   end
 end

-def report_dropped_spans(count, reason:)
-  @metrics_reporter.add_to_counter('otel.bsp.dropped_spans', increment: count, labels: { 'reason' => reason })
+def report_dropped_spans(count, labels = {})
Although I changed the method signature, we do not need to change other call sites for `report_dropped_spans`, since they were using the kwarg `reason: 'foo'`, which we can also interpret as a hash.
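A small sketch of why the existing call sites keep working. The class below is a hypothetical stand-in that simply returns the labels it received; only the method signature comes from the diff:

```ruby
# Hypothetical stand-in for the processor; not the real implementation.
class SketchProcessor
  # Signature from the diff: an optional positional Hash instead of a keyword.
  def report_dropped_spans(count, labels = {})
    labels
  end
end

processor = SketchProcessor.new

# A keyword-style call site still works: because the method declares no
# keyword parameters, Ruby collects the trailing `key: value` pairs into a
# Hash and binds it to the optional positional parameter.
labels = processor.report_dropped_spans(3, reason: 'buffer-full')
```

One caveat worth keeping in mind: the key arrives as the Symbol `:reason`, not the String `'reason'` that the previous implementation used for the label.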
🤔 I'd rather we used named arguments consistently, and Strings for keys and values in the labels. I.e. I'd prefer:
def report_dropped_spans(count, labels: nil) ... end
report_dropped_spans(n, labels: { 'reason' => 'buffer-full', 'code.function' => 'force_flush' })
or:
def report_dropped_spans(count, reason:, function: nil)
@metrics_reporter.add_to_counter('otel.bsp.dropped_spans', increment: count, labels: { 'reason' => reason, 'code.function' => function }.compact)
end
report_dropped_spans(n, reason: 'buffer-full', function: 'force_flush')
I prefer the latter, since it is cleaner for callers.
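For reference, the second option can be exercised in isolation roughly as follows. `FakeMetricsReporter` and `DropReporter` are hypothetical names used only for this sketch; the method body is the one proposed above:

```ruby
# Hypothetical in-memory reporter standing in for the real metrics reporter.
class FakeMetricsReporter
  attr_reader :calls

  def initialize
    @calls = []
  end

  # Records each call so the resulting labels can be inspected.
  def add_to_counter(metric, increment: 1, labels: {})
    @calls << { metric: metric, increment: increment, labels: labels }
  end
end

# Hypothetical wrapper holding the proposed method.
class DropReporter
  def initialize(metrics_reporter)
    @metrics_reporter = metrics_reporter
  end

  def report_dropped_spans(count, reason:, function: nil)
    # Hash#compact removes the 'code.function' entry when function is nil.
    labels = { 'reason' => reason, 'code.function' => function }.compact
    @metrics_reporter.add_to_counter('otel.bsp.dropped_spans', increment: count, labels: labels)
  end
end

reporter = FakeMetricsReporter.new
drops = DropReporter.new(reporter)
drops.report_dropped_spans(2, reason: 'buffer-full', function: 'force_flush')
drops.report_dropped_spans(1, reason: 'buffer-full')
```

Callers that do not care about the function pass only `reason:`, and the `.compact` call keeps a `nil` label from reaching the reporter.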
Yes, let's do the latter
Seems completely arbitrary and unrelated to the specification. Also lgtm
Woo hoo, I just need an adult to merge it 😄
Unrelated, yes. Not completely arbitrary. For transparency, the goal is to also enable logging of the
Also, to be clear, we are dropping spans in production due to
@@ -222,7 +222,7 @@ def to_span_data
   _(test_exporter.failed_batches.size).must_equal(0)
   _(test_exporter.batches.size).must_equal(0)

-  _(bsp.instance_variable_get(:@spans).size).must_equal(1)
+  _(bsp.instance_variable_get(:@spans).size).must_equal(0)
Since we are now properly dropping spans during shutdown, we change the test expectation.
We report `buffer-full` dropped spans in two contexts: `on_finish` and `force_flush`. Since `force_flush` is used in specific contexts, I thought it would be useful to supply a label so that users can have visibility into that.

We could alternatively use the `reason` to disambiguate the two scenarios, rather than introducing a new tag.
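
As a rough illustration of what a `buffer-full` drop in `on_finish` amounts to, the arithmetic from the hunk can be traced in a standalone sketch. The `enqueue` helper and its return value are hypothetical; only the `n = spans.size + 1 - max_queue_size` and `spans.shift(n)` logic comes from the diff:

```ruby
# Hypothetical helper mirroring the on_finish hunk: make room for one
# incoming span by dropping the oldest ones, then return how many were dropped.
def enqueue(spans, span, max_queue_size)
  # Number of spans that must go so the queue holds at most max_queue_size
  # entries once the new span is appended.
  n = spans.size + 1 - max_queue_size
  dropped = 0
  if n.positive?
    spans.shift(n) # discard the n oldest spans
    dropped = n    # the real code reports this count with reason 'buffer-full'
  end
  spans << span
  dropped
end

spans = %w[a b c]
dropped = enqueue(spans, 'd', 3)
```

With a full queue of three and one incoming span, exactly one span (the oldest) is dropped; with spare capacity, `n` is not positive and nothing is dropped.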