daemon: handleContainerExit: ignore networking errors #49507
LGTM
daemon/monitor.go
// c.ErrorMsg is set by [daemon.containerStart], and doesn't preserve the
// error type (because this field is persisted on disk). So, use string
// matching instead of usual error comparison methods.
if strings.Contains(c.ErrorMsg, "failed to set up container networking") {
For posterity: we could also consider doing that in general for any c.ErrorMsg != "" error, but for now let's go with a minimal change to get the pre-v28 behavior.
LGTM
daemon/start_linux.go
@@ -34,5 +34,8 @@ func (daemon *Daemon) initializeCreatedTask(
 			return errdefs.System(err)
 		}
 	}
-	return daemon.allocateNetwork(ctx, cfg, ctr)
+	if err := daemon.allocateNetwork(ctx, cfg, ctr); err != nil {
+		return fmt.Errorf("failed to set up container networking: %w", err)
The test failure should be enough to avoid regressions. But, as these are both in package daemon, could we put the error string in a const to make it clear where the message is used?
Done.
Prior to commit fe856b9, containers' network sandbox and interfaces were created before the containerd task. Now, it's created after.

If this step fails, the containerd task is forcefully deleted, and an event is sent to the c8d event monitor, which triggers `handleContainerExit`. Then this method tries to restart the faulty container.

This leads to containers with a published port already in use to be stuck in a tight restart loop (if they're started with `--restart=always`) until the port is available. This is needlessly spamming the daemon logs.

Prior to that commit, a published port already in use wouldn't trigger the restart process.

This commit adds a check to `handleContainerExit` to ignore exit events if the latest container error is related to networking setup.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
LGTM
Related to:
- What I did
Prior to commit fe856b9, containers' network sandbox and interfaces were created before the containerd task. Now, they're created after.

If this step fails, the containerd task is forcefully deleted, and an event is sent to the c8d event monitor, which triggers `handleContainerExit`. This method then tries to restart the faulty container.

This causes containers with a published port already in use to get stuck in a tight restart loop (if they're started with `--restart=always`) until the port becomes available, needlessly spamming the daemon logs.

Prior to that commit, a published port already in use wouldn't trigger the restart process.
- How I did it
This commit adds a check to `handleContainerExit` to ignore exit events if the latest container error is related to networking setup.
- How to verify it
- Human readable description for the release notes
- A picture of a cute animal (not mandatory but encouraged)