Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fork] simplify Fork::SetResetChildPollingEngineFunc to fix nested forking #33495

Merged
merged 3 commits into from
Jun 21, 2023

Conversation

apolcyn
Copy link
Contributor

@apolcyn apolcyn commented Jun 20, 2023

Noticed this in a nested forking test in #33430 (by nested fork - forked child process forks again).

Before this change, the EventEngine and non-EventEngine pollers were competing with each other in their calls to Fork::SetResetChildPollingEngineFunc. In the first child's after-fork handler, the EE engine would clear out the post-fork handlers of the non-EE poller, leaving the grandchild without the right post-fork cleanup method.

@apolcyn apolcyn added lang/core release notes: yes Indicates if PR needs to be in release notes labels Jun 20, 2023
@apolcyn apolcyn requested review from gnossen and drfloob June 20, 2023 18:28
@apolcyn apolcyn marked this pull request as ready for review June 20, 2023 18:32
@drfloob drfloob requested a review from Vignesh2208 June 20, 2023 21:49
grpc_core::Fork::SetResetChildPollingEngineFunc(ResetEventManagerOnFork);
if (grpc_core::Fork::RegisterResetChildPollingEngineFunc(
ResetEventManagerOnFork)) {
gpr_mu_init(&fork_fd_list_mu);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible for the child fork handler to run before the mutex is initialized?

Copy link
Contributor Author

@apolcyn apolcyn Jun 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not: the first time we call this, RegisterResetChildPollingEngineFunc will return true, we will initialize the mutex, and we'll leave it initialized thereafter.

So the only way we could possibly race here is if we forked while this thread was running, which would put us in a broken/undefined state anyways.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the only way we could possibly race here is if we forked while this thread was running

That's what I'm thinking. It is possible, probably rare, but having the mutex initialization before the fork-handler registration would protect it a bit. If the handler registration is the last thing in this init method, I'm not quite sure how that leaves the poller in a broken/undefned state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we fork while this thread is running, that means we've just chopped this thread. That leaves us in undefined behavior land (e.g. locks could easily be held elsewhere in this thread that still need to be released), so I don't think it's worth making any changes to support that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not going to block this PR on a rare edge case, it's fine, especially if our fork tests are passing. And if we see new fork problems in the wild where a null mutex is accessed, we might have a suspect change. Threads aren't exactly "chopped" when a fork occurs and fork support is engaged, the pre-fork handlers do some work in letting threads come to a resting spot (in some cases shutting them down) before allowing the fork to proceed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I guess my comment is an over-statement in the case that we've registered pthread_atfork handlers.

Still, in that case, we must be doing some kind of graceful shutdown and joining threads. Because there's no path to exit the function between RegisterResetChildPollingEngineFunc and gpr_mu_init, running one without the other doesn't appear to be possible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right, we are registering a post-fork handler here, not a pre-fork handler. The pre-fork handlers will not allow a fork to occur between these statements. That should be fine then.

}

const std::vector<Fork::child_postfork_func>&
const std::set<Fork::child_postfork_func>&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is mostly a good thing, but we're adding non-determinism to callback execution ordering. This may unearth bugs based on implicit ordering requirements. I hope there aren't such dependencies between polling engines, but we should be aware of the possibility.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

@apolcyn apolcyn merged commit e9c4483 into grpc:master Jun 21, 2023
64 of 65 checks passed
@copybara-service copybara-service bot added the imported Specifies if the PR has been imported to the internal repository label Jun 22, 2023
apolcyn added a commit that referenced this pull request Jul 10, 2023
Adds experimental fork support to gRPC/Ruby

Works towards #8798 (see caveats for why this wasn't marked fixed yet)
Works towards #33578 (see caveats for why this wasn't marked fixed yet)

This leverages existing `pthread_atfork` based C-core support for
forking that python/php use, but there's a bit extra involved mainly
because gRPC/Ruby has additional background threads.

New tests under `src/ruby/end2end` show example usage.

Based on #33495

Caveats:
- Bidi streams are not yet supported (bidi streams spawn background
threads which are not yet fork safe)
- Servers not supported
- Only linux supported
mario-vimal pushed a commit to mario-vimal/grpc that referenced this pull request Jul 13, 2023
Adds experimental fork support to gRPC/Ruby

Works towards grpc#8798 (see caveats for why this wasn't marked fixed yet)
Works towards grpc#33578 (see caveats for why this wasn't marked fixed yet)

This leverages existing `pthread_atfork` based C-core support for
forking that python/php use, but there's a bit extra involved mainly
because gRPC/Ruby has additional background threads.

New tests under `src/ruby/end2end` show example usage.

Based on grpc#33495

Caveats:
- Bidi streams are not yet supported (bidi streams spawn background
threads which are not yet fork safe)
- Servers not supported
- Only linux supported
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bloat/none imported Specifies if the PR has been imported to the internal repository lang/core per-call-memory/neutral per-channel-memory/neutral release notes: yes Indicates if PR needs to be in release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants