Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Edit '/etc/hosts' to add/remove IPv6 records. #46908

Closed
wants to merge 2 commits into from

Conversation

robmry
Copy link
Contributor

@robmry robmry commented Dec 7, 2023

- What I did

Only add built-in records ('ip6-localhost', 'ip6-mcastprefix', ...) to a container's '/etc/hosts' when the container is connected to an IPv6 network - and remove them when it's no longer connected.

Fixes #35954

- How I did it

Use the existing etchosts functionality to add/remove IPv6-related built-in records from /etc/hosts when an endpoint is added/removed from a container's sandbox.

IPv6 records are always added at the top of the /etc/hosts file. This changes the order of entries, the IPv4 loopback host is currently always added so it now comes after the IPv6 - the ordering was checked by some tests, which I've updated.

A matching follow-up change will probably be needed to add/remove the IPv4 localhost record according to whether the container has IPv4 connectivity.

The built-in IPv6 hostnames may be DNS-resolvable when running in 'rootless' mode because, in that mode, entries in the host's /etc/hosts are used to respond to DNS requests from the container.

- How to verify it

New integration test, and updated unit tests.

With live-restore enabled, restarted the daemon, checked that previously-running containers were still updated as expected when IPv6 connectivity changed.

- Description for the changelog

Edit '/etc/hosts' to add/remove IPv6 records when IPv6 connectivity changes.

- A picture of a cute animal (not mandatory but encouraged)

@robmry robmry marked this pull request as draft December 7, 2023 15:34
@robmry robmry force-pushed the 35954-no_ip6_hosts_unless_ip6 branch 2 times, most recently from 003f8a4 to 10317f4 Compare December 11, 2023 13:45
@robmry robmry marked this pull request as ready for review December 11, 2023 14:33
@robmry robmry marked this pull request as draft December 11, 2023 15:08
@robmry robmry force-pushed the 35954-no_ip6_hosts_unless_ip6 branch from 10317f4 to 49efbcf Compare December 11, 2023 17:15
@robmry robmry marked this pull request as ready for review December 11, 2023 18:05
@thaJeztah thaJeztah added this to the 25.0.0 milestone Dec 11, 2023
Copy link
Member

@akerouanton akerouanton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty good. I just left two comments about code comments that need to be updated.

libnetwork/etchosts/etchosts.go Show resolved Hide resolved
@@ -1455,6 +1455,10 @@ func (n *Network) getSvcRecords(ep *Endpoint) []etchosts.Record {
return nil
}

// TODO(robmry) - should this also look at sr.svcIP6Map?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you noted in your comment, IPv6 addresses aren't removed explicitly but they're still removed. I think it's okay to keep this behavior but your comment is much needed to clarify what's going on. Can you remove this TODO and keep the rest of the comment please?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure - done.

Only add built-in records ('ip6-localhost', 'ip6-mcastprefix', ...) to a
container's '/etc/hosts' when the container is connected to an IPv6
network - and remove them when it's no longer connected.

Signed-off-by: Rob Murray <rob.murray@docker.com>
@robmry robmry force-pushed the 35954-no_ip6_hosts_unless_ip6 branch from 49efbcf to cc8f197 Compare December 15, 2023 15:06
@robmry
Copy link
Contributor Author

robmry commented Dec 15, 2023

For discussion ...

Editing /etc/hosts has risks because the "etchosts" code writes the file in-place, moving it into place isn't an option with the bind mount. Lots of discussion on that in #17190

This seems less-dangerous than that change, which edited the file for each new container, because:

  • Updates are serialised
    • But, could still collide with an edit inside the container, and a reader could see a partial file.
  • It'll only happen:
    • during container construction (no problem), and
    • when networks are connected/disconnected, only if they have different IPv6-ness.

Other ideas for this change weren't great, but maybe there's another way to do it? ...

  • a container-create, or engine-wide, option to say IPv6 will never be required
    • but the IPv6-related flags are already confusing
  • just enable '::1' unconditionally
    • but that's explicitly not what users are asking for
  • look at sysctl settings to work out whether IPv6 is disabled system-wide
    • but I'm not sure there's a reliable way to tell (?)

@akerouanton
Copy link
Member

akerouanton commented Dec 15, 2023

If we want to prevent concurrent writes to /etc/hosts from within the container, I think we need to look at fd leases. I think this is the right mechanism for what we want to do. The issue reported by Tim Hockin in this comment #17190 (comment) doesn't exist anymore -- I've tested it myself with the following Go code:

package main

import (
	"os"
	"time"

	"golang.org/x/sys/unix"
)

func main() {
	f, err := os.Open("hosts")
	if err != nil {
		panic(err)
	}
	_, err = unix.FcntlInt(f.Fd(), unix.F_SETLEASE, unix.F_WRLCK)
	if err != nil {
		panic(err)
	}
	time.Sleep(3 * time.Second)
}

Looking at the kernel's source code & git history, I think this was solved by either or both:

Based on my notes (although I'm not sure they're entirely exact / up-to-date), both fixes will be available on all distros we support on July 1st, once RHEL/CentOS 7.9 & 8 reach EOL 🎉 At this point, I think we shouldn't care much about these systems (ie. either use a file lock, use rename for atomic update, or do nothing special).

@robmry robmry force-pushed the 35954-no_ip6_hosts_unless_ip6 branch from 0d0ded5 to b8446ea Compare January 2, 2024 10:09
@robmry
Copy link
Contributor Author

robmry commented Jan 2, 2024

I've added a second commit that makes a best-effort attempt to acquire a write lease on /etc/hosts - it's only for a Linux host and, even on Linux, if it fails the file will be updated anyway.

If there's an open descriptor for the file, read or write, the request for a write lease fails immediately. If that happens this new code will retry once, after a 10ms sleep (an arbitrary but short time, plenty of time for a normal reader to have gone away, but not long enough to cause a noticeable delay if the file is held open).

If the lease is obtained, it will ensure that nothing in the container gets hold of the file while it's being updated. In other cases (as agreed in the maintainer's call on 21st Dec), it's unlikely to be a problem - the file is tiny, the change is infrequent and corresponds to a network config change that might be expected to cause disruption. And, the file was being updated anyway to add/remove the container name - although that's probably not necessary when the internal DNS resolver is running.

@robmry robmry force-pushed the 35954-no_ip6_hosts_unless_ip6 branch from b8446ea to bc04a91 Compare January 3, 2024 09:30
if attempt > 0 {
time.Sleep(opts.interval)
}
_, retErr = unix.FcntlInt(f.Fd(), unix.F_SETLEASE, unix.F_WRLCK)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you also need to return NotImplementedError if fnctl returns EINVAL (to correctly handle RHEL & CentOS 7 / 8).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, yes - that should stop it from logging anything on those platforms. I wasn't sure that was what we'd want, because on Linux we might assume no-log means it got a lease. But, it will prevent any unnecessary concern about what "opened without lease" means, and actual problems that need investigating are very unlikely.

libnetwork/internal/filelease/filelease.go Outdated Show resolved Hide resolved
When possible, on Linux, obtain a write lease while editing '/etc/hosts'
in the container - using fcntl(2)/F_SETLEASE.

Signed-off-by: Rob Murray <rob.murray@docker.com>
@robmry robmry force-pushed the 35954-no_ip6_hosts_unless_ip6 branch from bc04a91 to d11be4a Compare January 4, 2024 09:36
Copy link
Member

@akerouanton akerouanton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@robmry robmry closed this Jan 12, 2024
@robmry
Copy link
Contributor Author

robmry commented Jan 12, 2024

Closed in favour of #47062

@robmry robmry deleted the 35954-no_ip6_hosts_unless_ip6 branch March 27, 2024 16:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

/etc/hosts contains IPv6 entries even when IPv6 is not enabled
3 participants