Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for DuckDB struct literal syntax #1194

Merged
merged 4 commits into from
Mar 29, 2024

Conversation

gstvg
Copy link
Contributor

@gstvg gstvg commented Mar 26, 2024

Fixes #1129

Existing Struct expression supports the BigQuery semantics, which allow for optionally typed fields, that if typed, also have an optional name.
DuckDB has incompatible semantics: only named fields without any type specification.

Should I create a separate expression, perharps NamedFieldsStruct, or change the Struct to have an inner enum Struct(enum { MaybeTyped(...), Named(...) }) ?

Thanks!

src/ast/mod.rs Outdated Show resolved Hide resolved
@jayzhan211
Copy link
Contributor

jayzhan211 commented Mar 27, 2024

You can ask the question in issue. The PR draft doesn't send the notification.

Let me reference the PR apache/datafusion#9743

@coveralls
Copy link

coveralls commented Mar 27, 2024

Pull Request Test Coverage Report for Build 8480512054

Details

  • 74 of 74 (100.0%) changed or added relevant lines in 3 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.04%) to 87.92%

Totals Coverage Status
Change from base Build 8411261926: 0.04%
Covered Lines: 20560
Relevant Lines: 23385

💛 - Coveralls

@alamb
Copy link
Collaborator

alamb commented Mar 27, 2024

Should I create a separate expression, perharps NamedFieldsStruct, or change the Struct to have an inner enum Struct(enum { MaybeTyped(...), Named(...) }) ?

I think given the difference in syntax, having a new struct like NamedFieldsStruct would be better / less confusing / easier to handle

Thank you again for working on this @gstvg and @jayzhan211

@gstvg gstvg marked this pull request as ready for review March 29, 2024 00:01
@gstvg
Copy link
Contributor Author

gstvg commented Mar 29, 2024

thanks @jayzhan211 and @alamb
I created a new Expr named Dictionary, which currently only supports struct literals.

I tried adding supporting for DuckDB MAP too, but since map keys can be any expression, it conflicted with Snowflake-only JsonAccess with colon json_field:key, which is currently parsed for all dialects. Restricting it to Snowflake solves the problem. Should i push it here PR too?

@jayzhan211
Copy link
Contributor

jayzhan211 commented Mar 29, 2024

thanks @jayzhan211 and @alamb I created a new Expr named Dictionary, which currently only supports struct literals.

I tried adding supporting for DuckDB MAP too, but since map keys can be any expression, it conflicted with Snowflake-only JsonAccess with colon json_field:key, which is currently parsed for all dialects. Restricting it to Snowflake solves the problem. Should i push it here PR too?

I don't understand what you mean by Snowflake-only JsonAccess with colon json_field:key, can you elaborate it more.
It seems the difference between struct and map in Duckdb is that the key_type can be different for map in each row. They are both json-like objects. Snowflake map's key type should be string or number, nested types for key is not allowed

which is currently parsed for all dialects

I think we have not yet supported any.

@gstvg
Copy link
Contributor Author

gstvg commented Mar 29, 2024

Sure. I would like to solve all this at once, to be honest.

Snowflake JSON access uses : to query data

image
https://docs.snowflake.com/en/user-guide/json-basics-tutorial-query

DuckDB maps key can be any type, not only strings, but may have different set of keys for each row

image
https://duckdb.org/docs/sql/data_types/map.html

To support any type of key, I used Parser::parse_expr instead of parse_identifier

Currently, because of the following arms in parse_infix and get_next_precedence, when parse_expr is called to get the key expr, it parses the whole key: value as a JsonAccess

If those two arms are moved to the Snowflake dialect, or conditioned to a dialect_of!(Snowflake), parse_expr will return the key only, not the whole key value, but the json_column:json_key would be restricted to the Snowflake dialect

I didn't find any other way to do that

I've pushed it to another branch if you have interest

which is currently parsed for all dialects

I think we have not yet supported any.

Sorry, I mean the json_column:json_key is parsed for all dialects, not restricted to the Generic and/or Snowflake

@gstvg
Copy link
Contributor Author

gstvg commented Mar 29, 2024

Others dialects that uses this syntax:

Snowflake
image
Syntax limited to Object constants, so parse_value or parse_identifier can be called instead of parse_expr

ClickHouse
image
Has the same conflicts as DuckDB map, which also means it has the same solution 😀

@jayzhan211
Copy link
Contributor

jayzhan211 commented Mar 29, 2024

How about considering Expr::JsonAcess as your Expr::DictionaryField, it seems to me they are equivalent but the json operator is colon. Similar to parsing [e1, e2...] but {e1, e2...} instead.

Expr::Dictionary(Vec<Expr>), where we can get Expr::JsonAccess (left colon right)

@jayzhan211
Copy link
Contributor

How about considering Expr::JsonAcess as your Expr::DictionaryField, it seems to me they are equivalent but the json operator is colon. Similar to parsing [e1, e2...] but {e1, e2...} instead.

Expr::Dictionary(Vec<Expr>), where we can get Expr::JsonAccess (left colon right)

I met the issue while trying out my idea #1197

Copy link
Collaborator

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good to me: the use of parse_expr and restricting it to DuckDB makes sense to me

If/When people need support for this type in Snowflake and the other dialects, I think we can handle them separately

@alamb
Copy link
Collaborator

alamb commented Mar 29, 2024

I also pushed d9d16ce to fix the doc test failues

@alamb
Copy link
Collaborator

alamb commented Mar 29, 2024

@jayzhan211 does this change look good for you?

@jayzhan211
Copy link
Contributor

I think either struct and map and other dictionaries like syntax {"key": "value"} is possible to share the same parsing rule.
It is fine to do it specifically for now, but I think we can share the similarity for other dialects or syntax later on.

@alamb alamb merged commit e747c9c into sqlparser-rs:main Mar 29, 2024
10 checks passed
@alamb
Copy link
Collaborator

alamb commented Mar 29, 2024

Thanks @jayzhan211 and @gstvg !

@gstvg
Copy link
Contributor Author

gstvg commented Mar 29, 2024

How about considering Expr::JsonAcess as your Expr::DictionaryField, it seems to me they are equivalent but the json operator is colon. Similar to parsing [e1, e2...] but {e1, e2...} instead.

Expr::Dictionary(Vec<Expr>), where we can get Expr::JsonAccess (left colon right)

Good idea. But making Dictionary a Vec<Expr> wouldn't be unergonomic for users? Extending it, what about something like this:

    fn parse_duckdb_dictionary_field(&mut self) -> Result<DictionaryField, ParserError> {
        if let Expr::JsonAccess { left, operator: JsonOperator::Colon, right } = self.parse_expr()? {
            Ok(DictionaryField {
                key: left,
                value: right,
            })
        } else {
            self.expected(":", self.peek_token())
        }
    }

@jayzhan211
Copy link
Contributor

How about considering Expr::JsonAcess as your Expr::DictionaryField, it seems to me they are equivalent but the json operator is colon. Similar to parsing [e1, e2...] but {e1, e2...} instead.
Expr::Dictionary(Vec<Expr>), where we can get Expr::JsonAccess (left colon right)

Good idea. But making Dictionary a Vec<Expr> wouldn't be unergonomic for users? Extending it, what about something like this:

    fn parse_duckdb_dictionary_field(&mut self) -> Result<DictionaryField, ParserError> {
        if let Expr::JsonAccess { left, operator: JsonOperator::Colon, right } = self.parse_expr()? {
            Ok(DictionaryField {
                key: left,
                value: right,
            })
        } else {
            self.expected(":", self.peek_token())
        }
    }

It looks good to me.

JichaoS pushed a commit to luabase/sqlparser-rs that referenced this pull request May 7, 2024
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

support more dialects of struct creation
4 participants