Fix deadlock during NRI plugin registration #79
Merged
During NRI external plugin registration:
- the adaptation lock in `nri/pkg/adaptation` is acquired
- `syncFn` is invoked
- `syncFn` acquires the NRI lock in `pkg/nri/nri.go`

During container lifecycle events such as `ContainerStart`:
- the NRI lock is acquired
- `StateChange()` in `nri/pkg/adaptation` acquires the adaptation lock

As a result, the locking order during NRI plugin registration is: adaptation lock -> NRI lock

While the locking order during container starts is: NRI lock -> adaptation lock
Because the locking order is inverted between these two paths, it is possible to encounter a deadlock.
To fix the issue, during NRI plugin registration, first acquire the NRI lock (done via the `syncFn` call) and only afterwards acquire the adaptation lock. This ensures that during NRI plugin registration the locking order is NRI lock -> adaptation lock, which is consistent with the locking order during container lifecycle events.

Fixes containerd/containerd#10085
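
For illustration only, the consistent ordering can be sketched with two plain mutexes. This is a minimal model, not the code changed in this PR: the `model` type, its fields, and the `registerPlugin`, `startContainer`, and `runConcurrently` functions are hypothetical names standing in for the registration and lifecycle paths. Because both paths take the NRI lock before the adaptation lock, running them concurrently cannot produce the circular wait described above.

```go
package main

import (
	"fmt"
	"sync"
)

// model is a hypothetical stand-in for the two locks discussed above.
// It does not mirror the real types in pkg/nri/nri.go or nri/pkg/adaptation.
type model struct {
	nriLock        sync.Mutex // models the NRI lock
	adaptationLock sync.Mutex // models the adaptation lock
	plugins        int        // number of registrations performed
	events         int        // number of lifecycle events handled
}

// registerPlugin models plugin registration with the fixed ordering:
// NRI lock first, adaptation lock second.
func (m *model) registerPlugin() {
	m.nriLock.Lock()
	defer m.nriLock.Unlock()
	m.adaptationLock.Lock()
	defer m.adaptationLock.Unlock()
	m.plugins++
}

// startContainer models a container lifecycle event, which takes the
// locks in the same order: NRI lock first, adaptation lock second.
func (m *model) startContainer() {
	m.nriLock.Lock()
	defer m.nriLock.Unlock()
	m.adaptationLock.Lock()
	defer m.adaptationLock.Unlock()
	m.events++
}

// runConcurrently runs n registrations and n lifecycle events in
// parallel and returns how many of each completed.
func runConcurrently(n int) (plugins, events int) {
	m := &model{}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(2)
		go func() { defer wg.Done(); m.registerPlugin() }()
		go func() { defer wg.Done(); m.startContainer() }()
	}
	wg.Wait()
	return m.plugins, m.events
}

func main() {
	plugins, events := runConcurrently(100)
	fmt.Println(plugins, events)
}
```

If `registerPlugin` instead took `adaptationLock` before `nriLock`, one goroutine could hold each lock while waiting for the other, which is the inverted ordering this change removes.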