simple run command with --sysctl for network interface fails after upgrade #47619

Closed
jwfang opened this issue Mar 23, 2024 · 4 comments · Fixed by #47621 or #47635
Labels: area/networking, kind/bug, status/confirmed, version/26.0

jwfang commented Mar 23, 2024

Description

After upgrading Debian from bullseye to bookworm today, my container stopped working.

I traced it down to this simple command:

docker run --rm --sysctl net.ipv4.conf.eth0.forwarding=1 alpine
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: open /proc/sys/net/ipv4/conf/eth0/forwarding: no such file or directory: unknown.

But once a container is running, the eth0 configuration is right there. Maybe the timing of the network interface renaming changed?
If I change the interface name to lo or all, the command above works fine.
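
The path in the error message follows from how a sysctl key maps onto a file under /proc/sys: the dots become path separators. As a rough sketch (the helper names `sysctl_key_to_path` and `sysctl_key_exists` are made up for illustration, and the mapping assumes the interface name itself contains no dot), this checks whether the file behind a key exists in the current network namespace:

```shell
#!/bin/sh
# Hypothetical helper: map a sysctl key to the /proc/sys file that backs it.
# Assumption: no key component (e.g. the interface name) contains a dot.
sysctl_key_to_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Hypothetical helper: succeed if the file for a key exists in the
# network namespace this script runs in.
sysctl_key_exists() {
    [ -e "$(sysctl_key_to_path "$1")" ]
}

# Report which of the keys from this report are currently settable.
for key in net.ipv4.conf.eth0.forwarding net.ipv4.conf.all.forwarding; do
    if sysctl_key_exists "$key"; then
        echo "$key: present"
    else
        echo "$key: missing ($(sysctl_key_to_path "$key"))"
    fi
done
```

A per-interface key is only "present" once an interface with that name exists in the namespace, which is why a check like this can give different answers at container creation and after the container has started.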

Unfortunately, I have to rely on the interface name: if I use all to set both forwarding=1 and accept_ra=2, the container does not seem to respect the accept_ra setting. In other words, with the following commands the containers don't get an IPv6 address from RA (XXX is my custom IPv6-enabled MacVLAN network):

docker run -it --rm --network XXX --sysctl net.ipv6.conf.all.forwarding=1 --sysctl net.ipv6.conf.all.accept_ra=2 ubuntu bash
docker run -it --rm --network XXX --sysctl net.ipv6.conf.all.forwarding=1 --sysctl net.ipv6.conf.all.accept_ra=2 alpine sh

Reproduce

For the --sysctl failure:

  1. docker run --rm --sysctl net.ipv4.conf.eth0.forwarding=1 alpine sh fails with the error shown above.

For no IPv6 address from RA (this is probably not related to Docker; it is just the reason I can't use all as the interface name):

  1. Create an IPv6-enabled network XXX.
  2. docker run -it --rm --network XXX --sysctl net.ipv6.conf.all.forwarding=1 --sysctl net.ipv6.conf.all.accept_ra=2 alpine sh does not get an IPv6 address from RA.

Expected behavior

No response

docker version

Client: Docker Engine - Community
 Version:           26.0.0
 API version:       1.45
 Go version:        go1.21.8
 Git commit:        2ae903e
 Built:             Wed Mar 20 15:18:02 2024
 OS/Arch:           linux/arm64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          26.0.0
  API version:      1.45 (minimum version 1.24)
  Go version:       go1.21.8
  Git commit:       8b79278
  Built:            Wed Mar 20 15:18:02 2024
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.6.28
  GitCommit:        ae07eda36dd25f8a1b98dfbf587313b99c0190bb
 runc:
  Version:          1.1.12
  GitCommit:        v1.1.12-0-g51d5e94
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

docker info

Client: Docker Engine - Community
 Version:    26.0.0
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.13.1
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.25.0
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 5
  Running: 5
  Paused: 0
  Stopped: 0
 Images: 16
 Server Version: 26.0.0
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: ae07eda36dd25f8a1b98dfbf587313b99c0190bb
 runc version: v1.1.12-0-g51d5e94
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.1.21-v8+
 Operating System: Debian GNU/Linux 12 (bookworm)
 OSType: linux
 Architecture: aarch64
 CPUs: 4
 Total Memory: 3.705GiB
 Name: rpi4
 ID: 60af6eb1-813d-4d13-929e-23993c2a56dc
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No memory limit support
WARNING: No swap limit support

Additional Info

No response

jwfang added the kind/bug and status/0-triage labels Mar 23, 2024
jwfang (Author) commented Mar 23, 2024

I pinned docker-ce to 25.0.5 and my container works as before, so I guess this is caused by the recent 26 release.

robmry (Contributor) commented Mar 23, 2024

Hi @jwfang - thank you for narrowing down the issue and raising the clear report.

It's fallout from #47062 - in 0046b16 we moved some of the network configuration from a pre-start hook in the runtime to after the container task has been created.

As you suggest, that means the network interface renaming (moving one end of a veth device into the container namespace in Sandbox.populateNetworkResources, sb.osSbox.AddInterface) happens after sysctls are applied by the runtime.

cc @corhere - I think we'll need to go back to using the pre-start hook.

corhere (Contributor) commented Mar 25, 2024

Given how brittle it is to use --sysctl for per-iface config, having to predict the interface name and messing with the configuration of a libnetwork-managed interface, maybe this isn't something we should try to support going forward. What if instead we provided some affordance such as an endpoint option to set interface sysctls without having to predict the interface name? That way libnetwork could apply the sysctls itself after it has created and renamed the interfaces, and it could refuse to apply sysctls that would be incompatible with particular network drivers.
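
To make the idea concrete, here is a purely illustrative sketch of "set an interface sysctl without predicting the interface name". The `IFNAME` placeholder token and the `expand_iface_sysctl` helper are assumptions invented for this example; they are not taken from this thread and are not presented as libnetwork's design or as Docker syntax. The point is only that the real name gets substituted after the interface has been created and renamed:

```shell
#!/bin/sh
# Hypothetical helper: substitute the real interface name for an IFNAME
# placeholder in a sysctl setting, once that name is known.
#   $1 = template, e.g. net.ipv6.conf.IFNAME.accept_ra=2
#   $2 = actual interface name, e.g. eth0
expand_iface_sysctl() {
    case "$1" in
        *.IFNAME.*)
            # Keep everything before and after the placeholder component.
            printf '%s.%s.%s\n' "${1%%.IFNAME.*}" "$2" "${1#*.IFNAME.}"
            ;;
        *)
            # No placeholder: pass the setting through unchanged.
            printf '%s\n' "$1"
            ;;
    esac
}

expand_iface_sysctl net.ipv6.conf.IFNAME.accept_ra=2 eth0
# prints: net.ipv6.conf.eth0.accept_ra=2
```

Whoever applies the expanded setting would do so after the rename, and could reject settings that a given network driver cannot support.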

thaJeztah (Member) commented

@corhere have you been peeking into our internal Slack? 😂 We were discussing exactly that, and for the same reason (my choice of words, "network connection", was a bit poor, but the intent is the same):
[Screenshot 2024-03-26 at 08:52:55]

jjanowsk added commits to NordSecurity/libtelio that referenced this issue on Apr 4, 2024 and Apr 5, 2024, both with the message:
moby/moby#47619
Because of this bug, eth0 is not created before the sysctls are set, so
it is not possible to set any sysctls related to specific
interfaces.
We can work around this by not setting eth0. Instead we can set
default, so eth0 will have this value assigned later when it is created.
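
The workaround described in that commit message can be sketched as follows. The helper name `iface_sysctl_to_default` is made up for illustration, and whether eth0 actually inherits the default value is taken from the commit message rather than verified here. The script only prints the resulting command (a dry run) instead of invoking Docker:

```shell
#!/bin/sh
# Hypothetical helper: replace the interface component of a per-interface
# sysctl (net.ipv4.conf.<iface>.* or net.ipv6.conf.<iface>.*) with "default".
# Settings that are not per-interface pass through unchanged.
iface_sysctl_to_default() {
    printf '%s\n' "$1" | sed -E 's/^(net\.ipv[46]\.conf\.)[^.]+\./\1default./'
}

# Dry run: show the command that would be used instead of the failing one.
echo docker run --rm --sysctl "$(iface_sysctl_to_default net.ipv4.conf.eth0.forwarding=1)" alpine
# prints: docker run --rm --sysctl net.ipv4.conf.default.forwarding=1 alpine
```

Note that default affects every interface created later in that network namespace, not only eth0, so it is a broader setting than the one it replaces.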