reduce the lock contention in task spawn. #6001
This maximum seems really large to me. Thoughts?
I just used the settings of the previous `spawn_concurrency_level` function. The previous function supported user configuration, so this value was set to a large value. If you want a smaller value, I'm OK with that.

As far as I know, some current server CPUs come close to 200 CPU cores (or hyperthreads); for example, the AMD EPYC 9654 has 192 threads. Maybe we can set the value to 256 (4 times that is `1 << 10`)?
I'd like to retract my previous point. I believe this is a memory-bound scenario, so we only need to ensure a certain number of mutexes. In fact, the number of mutexes should not depend on the number of worker threads; it depends on the likely level of concurrency.

Below, I describe the concurrency level that may actually occur in real applications, rather than what occurs in the benchmark.

In real applications that use Tokio, multiple threads may call `tokio::spawn` concurrently. However, having too many threads call `tokio::spawn` at the same time can be considered a poor design choice in the higher-level application architecture. It is more likely that multiple threads will remove tasks concurrently. Nonetheless, removing a task from the list takes up only a minimal amount of runtime over the entire task life cycle, so the concurrency level of task removal is usually not a significant concern.

Therefore, I agree with your idea, and it is reasonable to set the upper limit to a small value. Setting it to 64 or 128 might make sense.
Alright. What do you think about multiplying by 4?
The total number of random memory operations is held constant while multiple locks are provided. Each thread obtains one of the locks via round robin, and may only perform its random memory accesses after acquiring that lock. I got the following test results:

This is mainly because, when the number of worker threads is relatively small, setting the number of locks to 4 times the worker count gives a significant performance improvement compared to 2 times.
For the complete test, please refer to https://gist.github.com/wathenjiang/30b689a7ef20b4ea667a2e8f358c321d

I think this performance test can serve as a reference for how many mutexes OwnedTasks needs.