Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix psycopg3 tests #1773

Merged
merged 4 commits into from May 12, 2023
Merged

Fix psycopg3 tests #1773

merged 4 commits into from May 12, 2023

Conversation

living180
Copy link
Contributor

@living180 living180 commented May 9, 2023

Description

Several tests (such as SQLPanelTestCase.test_cursor_wrapper_singleton) are written to ensure that only a single cursor wrapper is instantiated during a test. However, this fails when using psycopg3, since the .last_executed_query() call in NormalCursorWrapper._record() ends up creating an additional cursor (via the mogrify() function). To avoid this, use a ._djdt_in_record attribute on the database wrapper. Make the NormalCursorWrapper._record() method set ._djdt_in_record to True on entry and reset it to False on exit. Then in the overridden database wrapper .cursor() and .chunked_cursor() methods, check the ._djdt_in_record attribute and return the original cursor without wrapping if the attribute is True.

Checklist:

  • I have added the relevant tests for this change.
  • I have added an item to the Pending section of docs/changes.rst.

@tim-schilling
Copy link
Contributor

Thank you for diving into this. I kept trying to make time over the weekend, but got sidetracked each time.

Copy link
Member

@matthiask matthiask left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Btw, https://github.com/jazzband/django-debug-toolbar/pull/1773/files?w=1 is much easier to review.)

Thanks for diving into this! Tests are finally green again, so that's definitely a good sign.

I have to admit: It's not obvious that the code does the right thing. I'm really wondering if there isn't a scenario where it's possible to pass some object to cursor.execute which fetches its own cursor during the preparation or execution stage of the SQL query in psycopg. Are we sure that there are no legitimate uses of instantiating a new cursor inside execute, executemany and friends? Because if that's the case we should probably change the assertion depending on whether we're using psycopg2 or psycopg3.

(I did some research but I didn't find an answer to this question. I'd certainly be unsure again when revisiting this code in a few months or years.)

@tim-schilling
Copy link
Contributor

@living180 and @matthiask I took a look at this. Originally I was just going to swap out the decorated function with another ContextVar. Then I started digging into the issue and felt like we could reasonably re-use allow_sql here if we allow ExceptionCursorWrapper to access connection and call self.close(). This was achievable (for 4.2 at least) by utilizing Django's CursorWrapper class. And rather than writing an exception case for those two members, I swapped it for an explicit deny list of fields.

I didn't test for other pythons, and djangos. Only 4.2 with psycopg3 and sqlite, we'll see what CI says.

@tim-schilling
Copy link
Contributor

Hot dog! Let me know what you think. It's pretty late for me right now.

@living180
Copy link
Contributor Author

Hey @tim-schilling I definitely like the approach of subclassing Django's CursorWrapper instead of proxying. A couple of suggestions though: I would rename our BaseCursorWrapper to something like DjDTCursorWrapper since it is no longer the actual base class. Second, I think your original idea of using an allow list in ExceptionCursorWrapper to permit .connection and .close() are a better approach than a deny list. With the allow list approach, if an additional attribute is accessed that needs to be allowed, the result will be a very visible exception that will immediately show up in testing and can be addressed. However, with the deny list approach, if there happens to be a CursorWrapper method that does access the DB that isn't denied, it could silently allow the TemplatesPanel._store_template_info() method to access the DB when it shouldn't, and likely wouldn't be noticed for much longer.

That said, I still think it might be better to simply not wrap the cursor created as a result of calling .last_executed_query() instead of trying to force ExceptionCursorWrapper to fit that case, using a similar approach to my original commit.

@living180
Copy link
Contributor Author

I have to admit: It's not obvious that the code does the right thing. I'm really wondering if there isn't a scenario where it's possible to pass some object to cursor.execute which fetches its own cursor during the preparation or execution stage of the SQL query in psycopg. Are we sure that there are no legitimate uses of instantiating a new cursor inside execute, executemany and friends? Because if that's the case we should probably change the assertion depending on whether we're using psycopg2 or psycopg3.

Good thoughts. I updated my commit to only set the ._djdt_in_record attribute after making the original call, and also added some explanatory comments on what the code is doing. I did think about updating the test to allow two NormalCursorWrapper instantiations for psycopg3, but I felt that it would tie the tests too closely to the Django internals: since .last_executed_query() already takes a cursor, it is conceivable that Django could at some point refactor things internally to pass that cursor down to the .mogrify() call instead of creating a second cursor inside of .mogrify().

I updated the PR with my tweaks (https://github.com/jazzband/django-debug-toolbar/compare/d935165..ab42456?w=1#diff-02d0d5eed3aeea22383f91c339593db2c9710ebd2ae09c3fc7623828dcf584cc) but left @tim-schilling's changes on top, despite having some reservations about utilizing ExceptionCursorWrapper to handle this.

@living180
Copy link
Contributor Author

So after looking at the code a bit more, I tried a slightly different approach that builds on no longer undoing the SQL panel monkey patching, similar to the change that I just made to the cache panel: https://github.com/jazzband/django-debug-toolbar/compare/main...living180:django-debug-toolbar:psycopg3_tests_v2?w=1.

@tim-schilling
Copy link
Contributor

I don't agree with the earlier decision with the cache panel and now this to not undo the monkey patching on the functionality. If we want to go that route, we should work through the justification to removing the "disable_instrumentation" method.

To be clear, I'm fine changing how the toolbar works internally, but would like to see that decision to change the toolbar's future made via a discussion.

@tim-schilling
Copy link
Contributor

@living180 Can you make this work via a ContextVar. My justification is that it's easier to reason about in that context than manipulating a private attribute across various functions. self.db._djdt_in_record isn't obvious that self.db our wrapped cursor. If you disagree, please let me know why you feel an attribute on the function is the better approach.

I'd also be curious about your thoughts on moving the setting of that context var as close as possible to last_executed_query() since we know that's what triggers it.

@matthiask
Copy link
Member

I don't agree with the earlier decision with the cache panel and now this to not undo the monkey patching on the functionality. If we want to go that route, we should work through the justification to removing the "disable_instrumentation" method.

To be clear, I'm fine changing how the toolbar works internally, but would like to see that decision to change the toolbar's future made via a discussion.

I'm sorry for merging the change too quickly then.

For the record: I don't think removing the disable_instrumentation method would be a good idea. Let's not get sidetracked here (too much) and continue in another issue if there's much discuss -- I'm happy to participate.

I want to say this though: Reversing some types of monkey patching is brittle, especially when some other code also applies its own monkey patches. I'm sure that the performance hit of an additional method call (added by the instrumentation) is negligible and only really harms those who are already putting themselves in harms way by installing the toolbar in production environments. It's better in the long run to keep the amount of patching back and forth low.

@tim-schilling
Copy link
Contributor

You're right. I was extrapolating too much, going from undoing the monkey patching to removing disable_instrumentation.

It would be nice to find a re-usable pattern/abstraction for this type of logic in these monkey-patched panels.

@living180
Copy link
Contributor Author

I don't agree with the earlier decision with the cache panel and now this to not undo the monkey patching on the functionality. If we want to go that route, we should work through the justification to removing the "disable_instrumentation" method.

I'd love to hear more about your concerns. The reason I went with the approach I used with the cache panel was based on @matthiask's comment: #1769 (comment). And indeed, after looking at how that code worked previously, I realized that in the general case it is not possible to safely undo monkey patching in general: consider if another piece of code monkey patches the cache after our monkey patch is applied. Then if we try to undo our monkey patch, it will also inadvertently undo the other code's monkey patch as well, which is probably not what is intended. Now, I don't know of any case of that happening in practice, so it might be the case that your concerns override this change.

@tim-schilling
Copy link
Contributor

The concern I have is that we would be running code in someone's application when we and the user don't expect the toolbar to be running code. I reviewed both your changes and it looks like this is prevented so my concern is already being addressed.

@living180
Copy link
Contributor Author

@living180 Can you make this work via a ContextVar. My justification is that it's easier to reason about in that context than manipulating a private attribute across various functions. self.db._djdt_in_record isn't obvious that self.db our wrapped cursor. If you disagree, please let me know why you feel an attribute on the function is the better approach.

I suppose it is mainly a matter of preference. A ContextVar is a semi-global variable ("semi-" because it is scoped to the current thread / task), while the attribute is clearly associated with the object it is affecting. I agree that it isn't immediately obvious that the the cursor's .db attribute is the same as the connection variable passed to wrap_cursor(). If we go with the attribute approach, I'm happy to expand the comment to help clarify that. (As an aside, part of the lack of clarity comes from Django's own inconsistent naming scheme, which we (rightfully, IMO) follow in the SQL tracking code. The object with the .cursor() and .chunked_cursor() methods is an instance of Django's DatabaseWrapper, which the Django codebase usually calls connection, but also sometimes db. To further muddy the waters a DatabaseWrapper instance has a connection attribute itself. It was fun sorting through all of that when fixing the Debug Toolbar PostgreSQL transaction tracking code.)

I'd also be curious about your thoughts on moving the setting of that context var as close as possible to last_executed_query() since we know that's what triggers it.

That sounds fine.

@tim-schilling
Copy link
Contributor

I suppose it is mainly a matter of preference. A ContextVar is a semi-global variable ("semi-" because it is scoped to the current thread / task), while the attribute is clearly associated with the object it is affecting.

You've convinced me that my approach may not be the better one. I'm happy with whatever you decide to go with.

Prior to this commit, the SQLPanel would monkey patch the .cursor() and
.chunked_cursor() methods of each DatabaseWrapper connection in
.enable_instrumentation(), and then undo the monkey patch in
.disable_instrumentation().  However, for the same reasons as
a6b65a7, stop trying to undo the monkey
patching in .disable_instrumentation() and simply use a .djdt_logger
attribute on the DatabaseWrapper to determine if the cursors should be
wrapped.
Several tests (such as SQLPanelTestCase.test_cursor_wrapper_singleton)
are written to ensure that only a single cursor wrapper is instantiated
during the test.  However, this fails when using Django's psycopg3
backend, since the .last_executed_query() call in
NormalCursorWrapper._record() ends up creating an additional cursor (via
[1]).  To avoid this wrapping this additional cursor, set the
DatabaseWrapper's ._djdt_logger attribute to None before calling
.last_executed_query() and restore it when finished.  This will cause
the monkey-patched DatabaseWrapper .cursor() and .chunked_cursor()
methods to return the original cursor without wrapping during that call.

[1] https://github.com/django/django/blob/4.2.1/django/db/backends/postgresql/psycopg_any.py#L21
@living180
Copy link
Contributor Author

So I've updated this PR to be based on approach I mentioned above: #1773 (comment). @tim-schilling I borrowed your approach of adding a _last_executed_query() method to encapsulate the process of disabling and then re-enabling the cursor wrapping. I also kept the portion of your commit that changed the Debug Toolbar cursor wrappers to inherit from Django's CursorWrapper.

If there are any further concerns or feedback please do let me know. I will be AFK for the next couple of days but can respond to any comments on Sunday.

This switches the Debug Toolbar cursor wrappers to inherit from the
Django class django.db.backends.utils.CursorWrapper. This reduces some
of the code we need.
@living180
Copy link
Contributor Author

Made one more minor change to avoid wrapping a CursorWrapper in a CursorWrapper: https://github.com/jazzband/django-debug-toolbar/compare/ee46d62e8d6ac34e4884d00a8a768091172b6449..e7575e87dc9e2d2560b87d6fd5a123b9398cbd34

Copy link
Contributor

@tim-schilling tim-schilling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, thank you @living180! I'm going to make the comment change because I think there's value in it. Edit: on another review I decided the your version is better.

debug_toolbar/panels/sql/tracking.py Show resolved Hide resolved
@matthiask matthiask merged commit e7575e8 into jazzband:main May 12, 2023
21 checks passed
@matthiask
Copy link
Member

Thank you both! Looks good. Let's continue with it and see if we can roll a new release soon-ish.

@living180 living180 deleted the psycopg3_tests branch May 15, 2023 11:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants