
daemon: handleContainerExit: ignore networking errors #49507

Merged

merged 1 commit into moby:master from fix-restart-port-already-in-use on Feb 20, 2025

Conversation

@akerouanton (Member) commented on Feb 20, 2025

Related to:

- What I did

Prior to commit fe856b9, containers' network sandbox and interfaces were created before the containerd task. Now, they're created after.

If this step fails, the containerd task is forcefully deleted, and an event is sent to the c8d event monitor, which triggers `handleContainerExit`. This method then tries to restart the faulty container.

This causes containers with a published port already in use to get stuck in a tight restart loop (if they're started with `--restart=always`) until the port becomes available, needlessly spamming the daemon logs.

Prior to that commit, a published port already in use wouldn't trigger the restart process.

- How I did it

This commit adds a check to `handleContainerExit` to ignore exit events if the latest container error is related to networking setup.

- How to verify it
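One way to reproduce the bug and check the fix (a hypothetical sketch using the Docker Go client SDK, not taken from the PR; the image name, port numbers, container name, and the exact `container.StartOptions` type are assumptions and vary across SDK versions): start something that already holds a host port, then start a container publishing the same port with `--restart=always` and watch its restart count.

```go
package main

import (
	"context"
	"fmt"

	"github.com/docker/docker/api/types/container"
	"github.com/docker/docker/client"
	"github.com/docker/go-connections/nat"
)

func main() {
	ctx := context.Background()
	cli, err := client.NewClientWithOpts(client.FromEnv, client.WithAPIVersionNegotiation())
	if err != nil {
		panic(err)
	}

	// Assumes host port 8080 is already bound by another process or
	// container, so networking setup for this container must fail.
	hostCfg := &container.HostConfig{
		PortBindings: nat.PortMap{
			"80/tcp": {{HostPort: "8080"}},
		},
		RestartPolicy: container.RestartPolicy{Name: "always"},
	}
	created, err := cli.ContainerCreate(ctx, &container.Config{Image: "nginx"}, hostCfg, nil, nil, "port-clash")
	if err != nil {
		panic(err)
	}

	// Start is expected to fail with a "port is already allocated" error.
	if err := cli.ContainerStart(ctx, created.ID, container.StartOptions{}); err != nil {
		fmt.Println("start error:", err)
	}

	// Before the fix, RestartCount climbs rapidly as the daemon loops;
	// with the fix, the container stays stopped with the error recorded.
	inspect, err := cli.ContainerInspect(ctx, created.ID)
	if err != nil {
		panic(err)
	}
	fmt.Println("restart count:", inspect.RestartCount)
}
```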

- Human readable description for the release notes

Fix a bug that caused containers with `--restart=always` and a published port already in use to restart in a tight loop.

- A picture of a cute animal (not mandatory but encouraged)

@akerouanton force-pushed the fix-restart-port-already-in-use branch from 874796b to a34be7e on February 20, 2025 16:09
@akerouanton changed the title from "daemon: handleContainerExit: ignore unstarted containers" to "daemon: handleContainerExit: ignore networking errors" on Feb 20, 2025
@akerouanton self-assigned this on Feb 20, 2025
@vvoland added this to the 29.0.0 milestone on Feb 20, 2025
@akerouanton force-pushed the fix-restart-port-already-in-use branch from a34be7e to aa45c80 on February 20, 2025 16:42
@akerouanton modified the milestones: 29.0.0, 28.0.1 on Feb 20, 2025
@akerouanton marked this pull request as ready for review on February 20, 2025 16:43
@vvoland (Contributor) left a comment:

LGTM

```go
// c.ErrorMsg is set by [daemon.containerStart], and doesn't preserve the
// error type (because this field is persisted on disk). So, use string
// matching instead of usual error comparison methods.
if strings.Contains(c.ErrorMsg, "failed to set up container networking") {
```
A reviewer (Contributor) commented on this line:

For posterity: We could also consider doing that in general for any `c.ErrorMsg != ""` error, but for now let's go with a minimal change to get the pre-v28 behavior.
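For contrast, a minimal sketch of that more general guard (hypothetical, not the merged change, which keeps the narrower string match; the enclosing function is assumed to return an error):

```go
// Generalized variant: skip the restart logic whenever the container
// recorded any start error, not only networking failures.
if c.ErrorMsg != "" {
	// The container never started successfully; don't restart it.
	return nil
}
```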

@robmry (Contributor) left a comment:

LGTM

```diff
@@ -34,5 +34,8 @@ func (daemon *Daemon) initializeCreatedTask(
 		return errdefs.System(err)
 	}
 }
-return daemon.allocateNetwork(ctx, cfg, ctr)
+if err := daemon.allocateNetwork(ctx, cfg, ctr); err != nil {
+	return fmt.Errorf("failed to set up container networking: %w", err)
```
A reviewer (Contributor) commented on this change:

The test failure should be enough to avoid regressions. But, as these are both in package daemon, could we put the error string in a const to make it clear where the message is used?

@akerouanton (Member, Author) replied:

Done.
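A minimal sketch of what that refactor might look like, assuming a hypothetical const name `errSetupNetworking` (the identifier in the merged commit may differ); these are fragments of package daemon, not a standalone program:

```go
// A single const ties the error wrapped by initializeCreatedTask to the
// string match in handleContainerExit, both in package daemon.
const errSetupNetworking = "failed to set up container networking"

// In initializeCreatedTask:
if err := daemon.allocateNetwork(ctx, cfg, ctr); err != nil {
	return fmt.Errorf("%s: %w", errSetupNetworking, err)
}

// In handleContainerExit:
if strings.Contains(c.ErrorMsg, errSetupNetworking) {
	// Networking setup failed before the container ever ran, so skip
	// the restart logic instead of entering a tight loop.
}
```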

Commit message:

Prior to commit fe856b9, containers' network sandbox and interfaces were
created before the containerd task. Now, they're created after.

If this step fails, the containerd task is forcefully deleted, and an
event is sent to the c8d event monitor, which triggers `handleContainerExit`.
This method then tries to restart the faulty container.

This causes containers with a published port already in use to get
stuck in a tight restart loop (if they're started with
`--restart=always`) until the port becomes available, needlessly
spamming the daemon logs.

Prior to that commit, a published port already in use wouldn't trigger
the restart process.

This commit adds a check to `handleContainerExit` to ignore exit events
if the latest container error is related to networking setup.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
@akerouanton force-pushed the fix-restart-port-already-in-use branch from aa45c80 to ac8b4e3 on February 20, 2025 17:03
@akerouanton requested a review from robmry on February 20, 2025 17:03
@neersighted (Member) left a comment:

LGTM

@thompson-shaun modified the milestones: 29.0.0, 28.0.1 on Feb 20, 2025
@thaJeztah merged commit f0f008b into moby:master on Feb 20, 2025
153 checks passed
@akerouanton deleted the fix-restart-port-already-in-use branch on February 21, 2025 09:09
Development

Successfully merging this pull request may close these issues:

Containers with a port already in use restart in a tight loop

6 participants