feat(rust,python,cli): support REGEXP and RLIKE pattern matching in SQL engine #13359

Merged
ritchie46 merged 1 commit into pola-rs:main from sql-regexp-and-rlike-ops on Jan 2, 2024

Conversation

alexander-beedie
Collaborator

@alexander-beedie alexander-beedie commented Jan 1, 2024

  • Adds polars SQL engine support for REGEXP and RLIKE pattern matching (functionally identical to each other).
  • Minor tidy-up; inconsistent use of "Sql" vs "SQL" in the crate's object/func names - standardised on "SQL".
  • Bonus unit test coverage for CASE … WHEN … ELSE … END and COALESCE.

Examples

Literal pattern match:

import polars as pl

lf = pl.LazyFrame({
    "idx": [0, 1, 2, 3, 4],
    "val": ["ABC", "abc", "000", "A0C", "a0c"],
})
with pl.SQLContext(test_data=lf, eager_execution=True) as ctx:
    print( ctx.execute("SELECT * FROM test_data WHERE val NOT REGEXP '.*c$'") )

# shape: (3, 2)
# ┌─────┬─────┐
# │ idx ┆ val │
# │ --- ┆ --- │
# │ i64 ┆ str │
# ╞═════╪═════╡
# │ 0   ┆ ABC │
# │ 2   ┆ 000 │
# │ 3   ┆ A0C │
# └─────┴─────┘

Expression pattern match (new: not available through the existing regex operators such as ~ and ~*):

lf = pl.LazyFrame({
    "idx": [0, 1, 2, 3, 4],
    "val": ["ABC", "abc", "000", "A0C", "a0c"],
    "pat": ["^A", "^A", "^A", r"[AB]\d.*$", ".*xxx$"],
})
with pl.SQLContext(test_data=lf, eager_execution=True) as ctx:
    print( ctx.execute("SELECT idx,val FROM test_data WHERE val REGEXP pat") )

# shape: (2, 2)
# ┌─────┬─────┐
# │ idx ┆ val │
# │ --- ┆ --- │
# │ i64 ┆ str │
# ╞═════╪═════╡
# │ 0   ┆ ABC │
# │ 3   ┆ A0C │
# └─────┴─────┘
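
The row-wise behaviour above can be modelled in plain Python: each row's `val` is searched with that row's own `pat`. This sketch uses the standard-library `re` module as a stand-in for the engine's regex implementation (an assumption; the two syntaxes differ in places, though not for these patterns).

```python
import re

# (idx, val, pat) rows copied from the example above.
rows = [
    (0, "ABC", "^A"),
    (1, "abc", "^A"),
    (2, "000", "^A"),
    (3, "A0C", r"[AB]\d.*$"),
    (4, "a0c", ".*xxx$"),
]

# Keep a row when its own pattern is found anywhere in its value
# (unanchored search, which is why '^A' matches 'ABC').
matched = [(idx, val) for idx, val, pat in rows if re.search(pat, val)]
print(matched)
```

Only rows 0 and 3 survive, which agrees with the SQL output shown above.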

@github-actions github-actions bot added the enhancement (New feature or an improvement of an existing feature), python (Related to Python Polars), and rust (Related to Rust Polars) labels Jan 1, 2024
@alexander-beedie alexander-beedie added the A-sql (Area: Polars SQL functionality) label Jan 1, 2024
@alexander-beedie alexander-beedie changed the title feat(rust,python,cli): support REGEXP and RLIKE operators in SQL engine feat(rust,python,cli): support REGEXP and RLIKE operators in SQL engine Jan 1, 2024
@alexander-beedie alexander-beedie changed the title from "feat(rust,python,cli): support REGEXP and RLIKE operators in SQL engine" to "feat(rust,python,cli): support REGEXP and RLIKE pattern matching in SQL engine" Jan 1, 2024
@ritchie46 ritchie46 merged commit acb0afc into pola-rs:main Jan 2, 2024
26 checks passed
@alexander-beedie alexander-beedie deleted the sql-regexp-and-rlike-ops branch January 2, 2024 08:28
Labels: A-sql, enhancement, python, rust

2 participants