Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sort: Rework merge batching logic #6957

Merged
merged 1 commit into from
Dec 21, 2024

Conversation

karlmcdowall
Copy link
Contributor

@karlmcdowall karlmcdowall commented Dec 13, 2024

Fixes #6944
Rework the way batching is done with sort such that it doesn't open more input files than necessary. Previously, the code would always open one extra input file which causes problems in ulimit scenarios. Add additional test case.

@karlmcdowall karlmcdowall force-pushed the sort_merge_chunking_rework branch from 872bd2a to b42d0e4 Compare December 14, 2024 01:58
@sylvestre
Copy link
Contributor

Cool
seems that some jobs are failing

@karlmcdowall karlmcdowall force-pushed the sort_merge_chunking_rework branch from b42d0e4 to a97ca59 Compare December 14, 2024 21:28
@karlmcdowall
Copy link
Contributor Author

I fixed the windows compilation issue hopefully (I don't have a Windows setup to test on though), but the cargo deny advisory failure is a mystery to me.
I see the cargo deny issue when I checkout the mainline too so I don't think that's an issue with my changes - if anyone has some insight into that then please let me know!

@karlmcdowall karlmcdowall force-pushed the sort_merge_chunking_rework branch 2 times, most recently from b9e1196 to e888500 Compare December 15, 2024 01:36
Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)

@sylvestre
Copy link
Contributor

sylvestre commented Dec 15, 2024

Any idea why it doesn't fix the GNU tests about this ?
thanks


2024-12-15T08:38:54.0704735Z FAIL: tests/sort/sort-merge-fdlimit
2024-12-15T08:38:54.0704803Z ===================================
2024-12-15T08:38:54.0704807Z 
2024-12-15T08:38:54.0704869Z 11
2024-12-15T08:38:54.0704931Z 1
2024-12-15T08:38:54.0704995Z 6
2024-12-15T08:38:54.0705053Z 3
2024-12-15T08:38:54.0705160Z 12
2024-12-15T08:38:54.0705269Z 13
2024-12-15T08:38:54.0705384Z 17
2024-12-15T08:38:54.0705504Z 4
2024-12-15T08:38:54.0705626Z 5
2024-12-15T08:38:54.0705781Z 14
2024-12-15T08:38:54.0705892Z 15
2024-12-15T08:38:54.0706004Z 8
2024-12-15T08:38:54.0706121Z 16
2024-12-15T08:38:54.0706280Z 10
2024-12-15T08:38:54.0706397Z 7
2024-12-15T08:38:54.0706522Z 9
2024-12-15T08:38:54.0706631Z 2
2024-12-15T08:38:54.0706893Z sort: failed to open temporary file: Too many open files
2024-12-15T08:38:54.0707133Z FAIL tests/sort/sort-merge-fdlimit.sh (exit status: 1)

@karlmcdowall
Copy link
Contributor Author

Unfortunately there are several issues that need to be fixed before sort-merge-fdlimit will pass - there are several places in the code that aren't optimal in their use of file-descriptors.

I've written a new test case as part of this PR to demonstrate the issue that this PR fixes - the test case would previously fail but now passes.

What's the best way for me to proceed with fixing these issues? I can understand that on one hand it would be nice to have a single PR that can fix the sort-merge-fdlimit test (i.e. this would confirm that my changes are legit and not a waste of time). On the other hand, if there are several distinct issues then the fixes will be easier to code-review as separate PRs.

Let me know how you want me to queue this stuff up 👍

Fix bug uutils#6944
Rework the way batching is done with sort such that it doesn't open
more input files than necessary. Previously, the code would always
open one extra input file which causes problems in ulimit scenarios.
Add additional test case.
@karlmcdowall karlmcdowall force-pushed the sort_merge_chunking_rework branch from e888500 to 6bdcad3 Compare December 15, 2024 19:40
Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/tail/inotify-dir-recreate (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)

@sylvestre
Copy link
Contributor

Let me know how you want me to queue this stuff up 👍

nope, this is fine :)
thanks

@sylvestre sylvestre merged commit 913d5d4 into uutils:main Dec 21, 2024
61 of 62 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Sort merge chunking opens an extra file
2 participants