BUG/PERF: merge_asof raising TypeError for various "by" column dtypes #55678

Merged
mroeschke merged 6 commits into pandas-dev:main from merge-asof-by-dtypes on Oct 25, 2023


lukemanley
Member

Follow-up to #55670, which was a targeted regression fix for datelike dtypes.

`merge_asof` currently raises a `TypeError` if the dtype of `by` is anything other than int64, uint64, or object. This PR removes that limitation.

| Change   | Before [e48df1cf] <main>   | After [a53a5533] <merge-asof-by-dtypes>   |   Ratio | Benchmark (Parameter)                                |
|----------|----------------------------|-------------------------------------------|---------|------------------------------------------------------|
| -        | 299±30ms                   | 199±20ms                                  |    0.67 | join_merge.MergeAsof.time_multiby('backward', 5)     |
| -        | 311±30ms                   | 202±20ms                                  |    0.65 | join_merge.MergeAsof.time_multiby('backward', None)  |
| -        | 292±20ms                   | 158±20ms                                  |    0.54 | join_merge.MergeAsof.time_by_object('forward', None) |
| -        | 302±30ms                   | 157±10ms                                  |    0.52 | join_merge.MergeAsof.time_by_object('forward', 5)    |
| -        | 411±10ms                   | 200±5ms                                   |    0.49 | join_merge.MergeAsof.time_multiby('forward', 5)      |
| -        | 420±30ms                   | 202±20ms                                  |    0.48 | join_merge.MergeAsof.time_by_object('nearest', None) |
| -        | 457±20ms                   | 215±8ms                                   |    0.47 | join_merge.MergeAsof.time_multiby('forward', None)   |
| -        | 515±10ms                   | 241±6ms                                   |    0.47 | join_merge.MergeAsof.time_multiby('nearest', 5)      |
| -        | 519±10ms                   | 242±4ms                                   |    0.47 | join_merge.MergeAsof.time_multiby('nearest', None)   |
| -        | 419±40ms                   | 185±20ms                                  |    0.44 | join_merge.MergeAsof.time_by_object('nearest', 5)    |
| -        | 159±20ms                   | 64.5±8ms                                  |    0.41 | join_merge.MergeAsof.time_by_int('backward', 5)      |
| -        | 154±10ms                   | 63.5±6ms                                  |    0.41 | join_merge.MergeAsof.time_by_int('backward', None)   |
| -        | 437±20ms                   | 123±20ms                                  |    0.28 | join_merge.MergeAsof.time_by_int('nearest', 5)       |
| -        | 328±30ms                   | 83.9±3ms                                  |    0.26 | join_merge.MergeAsof.time_by_int('forward', None)    |
| -        | 470±30ms                   | 124±10ms                                  |    0.26 | join_merge.MergeAsof.time_by_int('nearest', None)    |
| -        | 324±30ms                   | 81.7±3ms                                  |    0.25 | join_merge.MergeAsof.time_by_int('forward', 5)       |

@lukemanley added the Bug, Performance (memory or execution speed performance), and Reshaping (concat, merge/join, stack/unstack, explode) labels on Oct 25, 2023
@lukemanley added this to the 2.2 milestone on Oct 25, 2023
@mroeschke merged commit 2f4c93e into pandas-dev:main on Oct 25, 2023
36 of 39 checks passed
@mroeschke
Member

Thanks @lukemanley

@lukemanley deleted the merge-asof-by-dtypes branch on November 16, 2023 at 12:56

Successfully merging this pull request may close these issues.

merge_asof can't handle floats in by column?