Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL query formatting improvements #1752

Merged
merged 10 commits into from Apr 9, 2023
Merged

Conversation

living180
Copy link
Contributor

Description

Various improvements to the code which formats SQL statements for the SQL panel. The majority are performance improvements, with an overall 35% reduction in formatting time on my system. However, there is also an improvement in indentation for queries which use the CASE keyword, and better simplification of queries generated by .count() querysets.

Checklist:

  • I have added the relevant tests for this change.
  • I have added an item to the Pending section of docs/changes.rst.

Because the token values escaped by BoldKeywordFilter are simply
intermediate values and are not directly included in HTML templates,
use Python's html.escape() instead of django.utils.html.escape() to
eliminate the overhead of converting the token values to SafeString.

Also pass quote=False when calling escape() since the token values will
not be used in quoted attributes.
sqlparse's SerializerUnicode filter does a bunch of fancy whitespace
processing which isn't needed because the resulting string will just be
inserted into HTML.  Replace with a simple EscapedStringSerializer that
does nothing but convert the Statement to a properly-escaped string.

In the process stop the escaping within BoldKeywordFilter to have a
cleaner separation of concerns:  BoldKeywordFilter now only handles
marking up keywords as bold, while escaping is explicitly handled by the
EscapedStringSerializer.
Instead of using a regex to elide the select list in the simplified
representation of an SQL query, use an sqlparse filter to elide the
select list as a preprocessing step.  The result ends up being about 10%
faster.
Instead of only eliding select lists longer than 12 characters, now only
elide select lists that contain a dot (from a column expression like
`table_name`.`column_name`).  The motivation for this is that as of
Django 1.10, using .count() on a queryset generates
    SELECT COUNT(*) AS `__count` FROM ...
instead of
    SELECT COUNT(*) FROM ...
queries.  This change prevents the new form from being elided.
@living180 living180 marked this pull request as draft April 2, 2023 12:06
@living180 living180 marked this pull request as ready for review April 2, 2023 12:31
If a query has subselects in its WHERE clause, do not elide the select
lists in those subselects.
The "<strong>" tokens inserted by the BoldKeywordFilter were causing the
AlignedIndentFilter to apply excessive indentation to queries which used
CASE statements.  Fix by rewriting BoldIndentFilter as a statement
filter rather than a preprocess filter, and applying after
AlignedIndentFilter.
When formatting SQL statements using sqparse, grouping only affects the
output when AlignedIndentFilter is applied.
By using a settings_changed signal receiver to clear the query caching,
the parse_sql() and _parse_sql() functions can be merged and the check
for the "PRETTIFY_SQL" setting can be moved back inside the
get_filter_stack() function.
Copy link
Member

@matthiask matthiask left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for refactoring the way settings are modified! That's an excellent change.

The code amount increase makes me a bit sad, but I think NOT hiding count(*) queries etc. is definitely a net positive.

Until now we haven't used sqlparse's insert_(before|after) methods. They are documented https://sqlparse.readthedocs.io/en/latest/analyzing/?highlight=insert_before#sqlparse.sql.TokenList.insert_before but I find the idx manipulation and those methods a bit disquieting. The test suite seems to cover the relevant cases, we are depending on sqlparse anyway and it has been quite stable over the years and that's sufficient for me, but I want to wait a bit if anyone else has some reservations about this change before I'll merge it.

Thanks!

@matthiask matthiask merged commit 7b8a6cc into jazzband:main Apr 9, 2023
20 checks passed
@matthiask
Copy link
Member

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants