Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WHERE clause eats custom Keyword but not others #750

Open
devinnasar opened this issue Nov 14, 2023 · 0 comments
Open

WHERE clause eats custom Keyword but not others #750

devinnasar opened this issue Nov 14, 2023 · 0 comments

Comments

@devinnasar
Copy link

Describe the bug
I'm attempting to define some custom keywords for the dialect I'm using, specifically the Keyword FACET. I'm noticing that when FACET appears after WHERE in my querystring, sqlparse groups FACET into the WHERE's TokenList. However, if another Keyword such as GROUP BY appears between WHERE and FACET, then WHERE, GROUP BY, and FACET all show up in the top level of the parsed tree as expected. I'm trying to understand why this is the case and if I'm doing something wrong.

To Reproduce

import re

import sqlparse
from sqlparse import keywords
from sqlparse.lexer import Lexer
from pprint import pprint

# get the lexer singleton object to configure it
lex = Lexer.get_default_instance()

# Clear the default configurations.
# After this call, reg-exps and keyword dictionaries need to be loaded
# to make the lexer functional again.
lex.clear()

my_regex = (r"FACET\b", sqlparse.tokens.Keyword)

# slice the default SQL_REGEX to inject the custom object


sql_regexes = keywords.SQL_REGEX[:38] + [my_regex] + keywords.SQL_REGEX[38:]

lex.set_SQL_REGEX(sql_regexes)

# pprint([(str(sql_regex[0]), sql_regex[1]) for sql_regex in sql_regexes])

pprint(keywords.KEYWORDS)
# add the default keyword dictionaries
lex.add_keywords(keywords.KEYWORDS_COMMON)
lex.add_keywords(keywords.KEYWORDS_ORACLE)
lex.add_keywords(keywords.KEYWORDS_PLPGSQL)
lex.add_keywords(keywords.KEYWORDS_HQL)
lex.add_keywords(keywords.KEYWORDS_MSACCESS)
lex.add_keywords(keywords.KEYWORDS)

# add a custom keyword dictionary

lex.add_keywords({'FACET': sqlparse.tokens.Keyword})

querystring = "SELECT x, latest(y) FROM (SELECT x, y, z FROM table) WHERE x = 5 FACET x.name"

# no configuration is passed here. The lexer is used as a singleton.
parsed = sqlparse.parse(querystring)

pprint(parsed[0].tokens)

# produces:
# 
# [<DML 'SELECT' at 0x104A934C0>,
#  <Whitespace ' ' at 0x104A93820>,
#  <IdentifierList 'x, lat...' at 0x104D1A450>,
#  <Whitespace ' ' at 0x104CEEF80>,
#  <Keyword 'FROM' at 0x104CEEFE0>,
#  <Whitespace ' ' at 0x104CEF040>,
#  <Parenthesis '(SELEC...' at 0x104D19DD0>,
#  <Whitespace ' ' at 0x104CEF640>,
#  <Where 'WHERE ...' at 0x104D19ED0>]

# However, when querystring is
# querystring = "SELECT x, latest(y) FROM (SELECT x, y, z FROM table) WHERE x = 5 GROUP BY y FACET x.name"
# 
# produces:
# 
# [<DML 'SELECT' at 0x1002674C0>,
#  <Whitespace ' ' at 0x100267820>,
#  <IdentifierList 'x, lat...' at 0x1004EE3D0>,
#  <Whitespace ' ' at 0x1004C2F80>,
#  <Keyword 'FROM' at 0x1004C2FE0>,
#  <Whitespace ' ' at 0x1004C3040>,
#  <Parenthesis '(SELEC...' at 0x1004EDD50>,
#  <Whitespace ' ' at 0x1004C3640>,
#  <Where 'WHERE ...' at 0x1004EDE50>,
#  <Keyword 'GROUP ...' at 0x1004C39A0>,
#  <Whitespace ' ' at 0x1004C3A00>,
#  <Identifier 'y' at 0x1004EE2D0>,
#  <Whitespace ' ' at 0x1004C3AC0>,
#  <Keyword 'FACET' at 0x1004C3B20>,
#  <Whitespace ' ' at 0x1004C3B80>,
#  <Identifier 'x.name' at 0x1004EDED0>]

Expected behavior
I would expect that the latter output would appear regardless of where FACET appears in my query.

Versions (please complete the following information):

  • Python: 3.11
  • sqlparse: 0.4.4

Additional context
Is it possible that I'm simply not understanding something about sqlparse? Is there a reason why WHERE would consume my custom Keyword and not others?

@devinnasar devinnasar changed the title WHERE clause consumes custom Keyword WHERE clause eats custom Keyword but not others Nov 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant