Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-33666: [R] Remove extraneous argument to semi_join #33693

Merged
merged 4 commits into from Jan 17, 2023

Conversation

thisisnic
Copy link
Member

@thisisnic thisisnic commented Jan 16, 2023

This PR removes the keep argument from the test for semi_join(), which are causing the unit tests to fail. It also removes the argument suffix argument (which is not part of the dplyr function signature) from the function signature here.

Closes: #33666

@github-actions
Copy link

@github-actions
Copy link

⚠️ GitHub issue #33666 has been automatically assigned in GitHub to PR creator.

@thisisnic
Copy link
Member Author

I'm struggling to work out why the linter is failing this; it works fine for me locally(!)

@thisisnic thisisnic marked this pull request as ready for review January 16, 2023 12:26
@raulcd
Copy link
Member

raulcd commented Jan 16, 2023

I'm struggling to work out why the linter is failing this; it works fine for me locally(!)

There was a PR merged (#33620 (comment)) that broke it on master. I have merged the fix on master if you rebase should be fixed.

Copy link
Member

@paleolimbot paleolimbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tiny things! Thanks!

@@ -82,7 +82,7 @@ test_that("left_join with join_by", {
left_join(
to_join %>%
rename(the_grouping = some_grouping),
join_by(some_grouping == the_grouping)
join_by(some_grouping == the_grouping)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change intentional?

Copy link
Member Author

@thisisnic thisisnic Jan 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the consequence of running the styler.

r/R/dplyr-join.R Outdated
...) {
do_join(x, y, by, copy, suffix, ..., join_type = "LEFT_SEMI")
do_join(x, y, by, copy, ..., join_type = "LEFT_SEMI")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are changes in this file necessary to make the tests pass? (Or do they have another purpose, like perhaps making this function more consistent with other arrow join implementations?)

Copy link
Member Author

@thisisnic thisisnic Jan 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes the API consistent with dplyr, but not necessary to make the tests pass.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with that change; however, I think it should be consistent for all of our join wrappers (anti, left, right, etc.). Currently they all define the suffix argument explicitly (and I think the default value of c(".x", ".y") is necessary and may result in an error if you try to semi_join overlapping columns?).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed this change as the anti_join() signature is also different to dplyr's. I don't quite understand what you mean; please can we discuss this in #33709?

@paleolimbot paleolimbot merged commit fa59051 into apache:master Jan 17, 2023
@ursabot
Copy link

ursabot commented Jan 17, 2023

Benchmark runs are scheduled for baseline = bff9e5e and contender = fa59051. fa59051 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Finished ⬇️0.18% ⬆️0.0%] test-mac-arm
[Finished ⬇️0.0% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.16% ⬆️0.0%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] fa590511 ec2-t3-xlarge-us-east-2
[Finished] fa590511 test-mac-arm
[Finished] fa590511 ursa-i9-9960x
[Finished] fa590511 ursa-thinkcentre-m75q
[Finished] bff9e5e3 ec2-t3-xlarge-us-east-2
[Finished] bff9e5e3 test-mac-arm
[Finished] bff9e5e3 ursa-i9-9960x
[Finished] bff9e5e3 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

raulcd pushed a commit that referenced this pull request Jan 18, 2023
This PR removes the `keep` argument from the test for `semi_join()`, which are causing the unit tests to fail.  It also removes the argument `suffix` argument (which is not part of the dplyr function signature) from the function signature here.

Closes: #33666

Authored-by: Nic Crane <thisisnic@gmail.com>
Signed-off-by: Dewey Dunnington <dewey@fishandwhistle.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[R] Two tests are failing with to-be-released dplyr dependency
4 participants