ref: Improve scrub_dict typing #2768

Merged
merged 7 commits into from
Mar 11, 2024
24 changes: 11 additions & 13 deletions sentry_sdk/scrubber.py
@@ -1,3 +1,5 @@
+from typing import cast
+
 from sentry_sdk.utils import (
     capture_internal_exceptions,
     AnnotatedValue,
@@ -8,8 +10,6 @@

 if TYPE_CHECKING:
     from sentry_sdk._types import Event
-    from typing import Any
-    from typing import Dict
     from typing import List
     from typing import Optional

@@ -66,7 +66,7 @@
         self.recursive = recursive

     def scrub_list(self, lst):
-        # type: (List[Any]) -> None
+        # type: (object) -> None
Contributor

I get where you're coming from (the method theoretically does support any type), but as a user I'd find it confusing that a method called scrub_list accepts an argument lst of any type.

Since types are not enforced at runtime, users are free to ignore the type hint that says a List[Any] should be given to this method, which is why the isinstance(lst, list) check is there now -- just to make sure we don't explode if someone misuses the function. I feel like by changing the type hint to say this function supports any object, we're making it harder for users to understand how this method is meant to be used.
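The point about hints not being enforced at runtime can be illustrated with a minimal sketch. The helper `count_dicts` below is hypothetical and not part of the SDK; it only mirrors the pattern of a `List[Any]` hint backed by a defensive `isinstance` guard.

```python
from typing import Any, List


def count_dicts(lst: List[Any]) -> int:
    """Hypothetical helper: count the dictionaries directly inside a list."""
    if not isinstance(lst, list):
        # Only reached when a caller ignores the annotation; Python itself
        # never checks the hint, so the guard keeps misuse from exploding.
        return 0
    return sum(1 for item in lst if isinstance(item, dict))


well_typed = count_dicts([{"a": 1}, 2, {}])
misused = count_dicts("not a list")  # type: ignore[arg-type]
```

The second call is what a type checker would flag, yet it runs without error because of the guard.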

Member Author

I need this change for the Event TypedDict changes (#2753). Originally, I was going to include these changes there, but I split these into a separate PR, since they are not strictly related.

We should use the object type here because, without this change, the sentry_sdk.scrubber.EventScrubber.scrub_request method raises type errors on the #2753 code. The reason is that event["request"] has type dict[str, object], since the request object (at least from my understanding) can contain an arbitrary mapping from strings to any object. We then call scrub_dict on several values within event["request"] (such as event["request"]["headers"]), which the type checker sees as having type object. Without this PR, however, scrub_dict is declared as accepting only dict[str, Any], so type checking fails: judging by the type hint alone, the code appears not to be type safe.

However, because scrub_dict performs the isinstance check itself, the code is in fact type safe. We can pass any object to scrub_dict and the code will behave correctly.
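A minimal sketch of the typing problem described above, written with annotations rather than the SDK's type comments. The names `scrub_narrow`, `scrub_wide`, `DENYLIST`, and the `"[Filtered]"` placeholder are assumptions made for illustration; the real method substitutes an `AnnotatedValue`.

```python
from typing import Any, Dict

DENYLIST = {"authorization", "cookie"}  # hypothetical denylist


def scrub_narrow(d: Dict[str, Any]) -> None:
    """Declared to accept only dictionaries, although it guards at runtime."""
    if not isinstance(d, dict):
        return
    for key in d:
        if isinstance(key, str) and key.lower() in DENYLIST:
            d[key] = "[Filtered]"  # assigning to an existing key is safe while iterating


def scrub_wide(d: object) -> None:
    """Declared to accept any object; the isinstance check narrows d to a dict."""
    if not isinstance(d, dict):
        return
    for key in d:
        if isinstance(key, str) and key.lower() in DENYLIST:
            d[key] = "[Filtered]"


request: Dict[str, object] = {
    "headers": {"Authorization": "secret", "Accept": "*/*"},
    "method": "GET",
}

# request["headers"] has the static type `object`, so a type checker such as
# mypy would be expected to reject the commented-out call below.
# scrub_narrow(request["headers"])
scrub_wide(request["headers"])  # accepted: every value is an `object`
scrub_wide(request["method"])   # a string is silently ignored
```

Both functions behave identically at runtime; only the declared parameter type decides whether the call site needs its own `isinstance` check to satisfy the type checker.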

> I feel like by changing the type hint to say this function supports any object, we're making it harder for users to understand how this method is meant to be used.

I understand your point, but there is already a doc comment here to explain that the method only does anything if the argument passed is a list, and I will also add a similar comment to the scrub_dict method to clarify.

I prefer object here, since in my opinion, the purpose of type hints is to communicate to users what type they need to pass in order for type safety to be ensured. For these methods, users can pass objects even with an unknown type without violating the method contract, and type safety is always guaranteed. Going back to the scrub_request example that prompted me to make this change, having type object makes it clear that I can safely pass event["request"]["headers"] without first adding an isinstance check to make sure that event["request"]["headers"] is a dictionary.

Contributor

Gotcha -- then let's go with your change.

         """
         If a list is passed to this method, the method recursively searches the list and any
         nested lists for any dictionaries. The method calls scrub_dict on all dictionaries
@@ -77,24 +77,22 @@
             return

         for v in lst:
-            if isinstance(v, dict):
-                self.scrub_dict(v)
-            elif isinstance(v, list):
-                self.scrub_list(v)
+            self.scrub_dict(v)
+            self.scrub_list(v)

Codecov / codecov/patch: check warning on line 81 in sentry_sdk/scrubber.py. Added lines #L80 - L81 were not covered by tests.

     def scrub_dict(self, d):
-        # type: (Dict[str, Any]) -> None
+        # type: (object) -> None
         if not isinstance(d, dict):
             return

         for k, v in d.items():
-            if isinstance(k, string_types) and k.lower() in self.denylist:
+            # The cast is needed because mypy is not smart enough to figure out that k must be a
+            # string after the isinstance check.
+            if isinstance(k, string_types) and cast(str, k).lower() in self.denylist:
                 d[k] = AnnotatedValue.substituted_because_contains_sensitive_data()
             elif self.recursive:
-                if isinstance(v, dict):
-                    self.scrub_dict(v)
-                elif isinstance(v, list):
-                    self.scrub_list(v)
+                self.scrub_dict(v)
+                self.scrub_list(v)

Codecov / codecov/patch: check warning on line 95 in sentry_sdk/scrubber.py. Added lines #L94 - L95 were not covered by tests.

Contributor

Same comment as above regarding removing the isinstance checks here.
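For reference, a standalone sketch of why dropping the call-site `isinstance` checks should not change behavior: each callee already returns early on a non-matching type. The module-level functions, `DENYLIST`, and `PLACEHOLDER` are assumptions for illustration; unlike the real methods, this sketch always recurses instead of consulting `self.recursive`.

```python
import copy

DENYLIST = {"password", "token"}  # hypothetical denylist
PLACEHOLDER = "[Filtered]"        # stand-in for the SDK's AnnotatedValue


def scrub_list(lst: object) -> None:
    if not isinstance(lst, list):
        return
    for v in lst:
        scrub_dict(v)  # returns immediately unless v is a dict
        scrub_list(v)  # returns immediately unless v is a list


def scrub_dict(d: object) -> None:
    if not isinstance(d, dict):
        return
    for k, v in d.items():
        if isinstance(k, str) and k.lower() in DENYLIST:
            d[k] = PLACEHOLDER
        else:
            scrub_dict(v)
            scrub_list(v)


def scrub_list_guarded(lst: object) -> None:
    """Variant with the call-site checks, as in the lines the diff removes."""
    if not isinstance(lst, list):
        return
    for v in lst:
        if isinstance(v, dict):
            scrub_dict_guarded(v)
        elif isinstance(v, list):
            scrub_list_guarded(v)


def scrub_dict_guarded(d: object) -> None:
    if not isinstance(d, dict):
        return
    for k, v in d.items():
        if isinstance(k, str) and k.lower() in DENYLIST:
            d[k] = PLACEHOLDER
        elif isinstance(v, dict):
            scrub_dict_guarded(v)
        elif isinstance(v, list):
            scrub_list_guarded(v)


event = {
    "user": {"password": "x", "items": [{"token": "t"}, [{"Password": "p"}], 3]},
    "n": 1,
}
unguarded, guarded = copy.deepcopy(event), copy.deepcopy(event)
scrub_dict(unguarded)
scrub_dict_guarded(guarded)
```

On this input both variants produce the same scrubbed structure; the unguarded form simply makes two cheap calls per value, at most one of which does any work.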


     def scrub_request(self, event):
         # type: (Event) -> None