-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: OpenTelemetry trace/spanID integration for Python handlers #889
Conversation
…pis/python-logging into otel-span-support-python
cloud_logger.warning(LOG_MESSAGE) | ||
|
||
entries = _list_entries(logger) | ||
self.assertEqual(len(entries), 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to consider instrumentation source entry here? http://go/cdpe-ops-logentry-changes-source
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whether or not the instrumentation source entry gets added depends on whether or not the global variable google.cloud.logging_v2._instrumentation_emitted
is true (see
_instrumentation_emitted = False |
) = get_request_data() | ||
|
||
# otel_trace_id existing means the other return values are non-null | ||
if otel_trace_id: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there is no http request data, do we reuse the http_request from last request for otel?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there is no http request data I have it set to null value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't look too close at tests but LGTM from an OTel perspective
setup.py
Outdated
@@ -44,6 +44,7 @@ | |||
"google-cloud-audit-log >= 0.1.0, < 1.0.0dev", | |||
"google-cloud-core >= 2.0.0, <3.0.0dev", | |||
"grpc-google-iam-v1 >=0.12.4, <1.0.0dev", | |||
"opentelemetry-api >= 1.22.0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would consider using an older version here. The context APIs you are I believe were available in even the first 1.x release
def get_request_and_trace_data(): | ||
"""Helper to get http_request and trace data from supported web | ||
frameworks (currently supported: Flask and Django), as well as OpenTelemetry. Attempts | ||
to parse trace/spanID from OpenTelemetry first, before going to Traceparent then XCTC. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit this is a bit misleading
to parse trace/spanID from OpenTelemetry first, before going to Traceparent then XCTC. | |
to retrieve trace/spanID from OpenTelemetry first, before going to Traceparent then XCTC. |
@@ -191,9 +193,31 @@ def _parse_xcloud_trace(header): | |||
return trace_id, span_id, trace_sampled | |||
|
|||
|
|||
def _parse_current_open_telemetry_span(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit i wouldn't use "parse" here
tests/system/test_system.py
Outdated
processor = BatchSpanProcessor(ConsoleSpanExporter()) | ||
provider.add_span_processor(processor) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can omit these lines if you don't actually want any console output
tests/system/test_system.py
Outdated
processor = BatchSpanProcessor(ConsoleSpanExporter()) | ||
provider.add_span_processor(processor) | ||
|
||
tracer = trace.get_tracer("test_system", tracer_provider=provider) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit this is a bit simpler
tracer = trace.get_tracer("test_system", tracer_provider=provider) | |
tracer = provider.get_tracer("test_system") |
tests/unit/handlers/__init__.py
Outdated
with mock.patch("opentelemetry.trace.get_current_span", return_value=span) as m: | ||
yield m |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't actually need to do any mocking by just setting the span in the real context implementation, see this example https://opentelemetry.io/docs/languages/python/cookbook/#manually-setting-span-context:~:text=%23%20Or%20you%20can,(token)
with mock.patch("opentelemetry.trace.get_current_span", return_value=span) as m: | |
yield m | |
ctx = trace.set_span_in_context(span) | |
token = context.attach(ctx) | |
try: | |
yield | |
finally: | |
context.detach(token) |
I am guessing that's why you are using import opentelemetry.trace
instead of from opentelemetry import trace
@@ -211,3 +235,37 @@ def get_request_data(): | |||
return http_request, trace_id, span_id, trace_sampled | |||
|
|||
return None, None, None, False | |||
|
|||
|
|||
def get_request_and_trace_data(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be a new function? The docstrings of get_request_data
says "Helper to get http_request and trace data from supported web frameworks". It seems to me like this logic should just be merged into the existing one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was concerned that by changing the function name that it could potentially be a breaking change. It should be OK to change the function name because it's in a private module, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the module is private, so it should be safe
Does the function need to be renamed though? Why not add new functionality to the existing function?
@@ -662,6 +665,38 @@ def test_log_root_handler(self): | |||
self.assertEqual(len(entries), 1) | |||
self.assertEqual(entries[0].payload, expected_payload) | |||
|
|||
def test_log_handler_otel_integration(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to also add a system test without the otel sdk imported?
I'm not exactly sure what that should look like, but I want to make sure we have coverage of the situation where otel isn't used at all
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to create another file for the newly added system tests, so that the existing TCs aren't importing otel?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, you can add an extra file. Or just import within the test functions. Or use a test class. Whatever seems cleanest
Does Otel behave any differently based on having the module installed? Or is import state the important part?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually managed to resolve this by creating a decorator that deletes the otel SDK imports after the test case gets run. I don't think it's perfect but it feels good enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok great! Do you have any test cases that test _retrieve_current_open_telemetry_span
without otel included?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider moving the logic in get_request_and_trace_data into get_request_data. I dont think a new function is required
Other than that, LGTM
Changes made:
opentelemetry-api
as a dependency for the library.get_request_data
toget_request_and_trace_data
get_request_and_trace_data
extracts and returns OpenTelemetry span context information, if a valid span exists.get_request_and_trace_data
, as well as for bothCloudLoggingHandler
andStructuredLogHandler
opentelemetry-sdk
as a system test external dependency.