Docker v26.0.0 breaks DNS #5800

Open
5 tasks done
dogsbody opened this issue Mar 22, 2024 · 16 comments

Comments

@dogsbody

Contribution guidelines

I've found a bug and checked that ...

  • ... I understand that not following the below instructions will result in immediate closure and/or deletion of my issue.
  • ... I have understood that this bug report is dedicated for bugs, and not for support-related inquiries.
  • ... I have understood that answers are voluntary and community-driven, and not commercial support.
  • ... I have verified that my issue has not been already answered in the past. I also checked previous issues.

Description

Since upgrading to Docker 26.0.0, Mailcow is producing lots of DNS errors.

This may be connected to the following change in Docker 26.0.0:

CVE-2024-29018: Do not forward requests to external DNS servers for a container that is only connected to an 'internal' network. Previously, requests were forwarded if the host's DNS server was running on a loopback address, like systemd's 127.0.0.53. moby/moby#47589

Source: https://docs.docker.com/engine/release-notes/26.0/#bug-fixes-and-enhancements



### Logs:

```plain text
Mar 22 08:02:03 olive dockerd[220407]: time="2024-03-22T08:02:03.235990882Z" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.253:44521" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.253:44521->172.22.1.254:53: i/o timeout" question=";133.138.180.139.bl.spamcop.net.\tIN\t A" spanID=6a533445a593b326 traceID=e8790345dcf9b4465b776ee33783b6cd
Mar 22 08:02:38 olive dockerd[220407]: time="2024-03-22T08:02:38.492917213Z" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.253:43701" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.253:43701->172.22.1.254:53: i/o timeout" question=";2.4.8.d.7.a.e.f.f.f.1.0.0.0.4.5.f.9.3.2.5.0.0.0.0.f.9.1.1.0.0.2.b.barracudacentral.org.\tIN\t A" spanID=af98360abc780b84 traceID=22daca411668dcd55f3f5d773b9b5c1e
Mar 22 08:02:38 olive dockerd[220407]: time="2024-03-22T08:02:38.500301774Z" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.253:42026" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.253:42026->172.22.1.254:53: i/o timeout" question=";2.4.8.d.7.a.e.f.f.f.1.0.0.0.4.5.f.9.3.2.5.0.0.0.0.f.9.1.1.0.0.2.bl.spamcop.net.\tIN\t A" spanID=d18c21e1eb7f6063 traceID=531742d1fafa14d82b97d8b1e3963bf2
Mar 22 08:02:42 olive dockerd[220407]: time="2024-03-22T08:02:42.501940834Z" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.253:45499" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.253:45499->172.22.1.254:53: i/o timeout" question=";2.4.8.d.7.a.e.f.f.f.1.0.0.0.4.5.f.9.3.2.5.0.0.0.0.f.9.1.1.0.0.2.bl.spamcop.net.\tIN\t A" spanID=d1f1e178d36c79f5 traceID=b1e263f5139bbd02f59b05b89f0926e6
Mar 22 08:03:02 olive dockerd[220407]: time="2024-03-22T08:03:02.268089673Z" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.253:39105" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.253:39105->172.22.1.254:53: i/o timeout" question=";11.16.227.165.bl.spamcop.net.\tIN\t A" spanID=d904f6f2b08ce936 traceID=3d00cbfc2ba83136d65b88c0dfc68b85
```

Steps to reproduce:

Run Mailcow on Docker v26.0.0

Which branch are you using?

master

Which architecture are you using?

x86

Operating System:

Ubuntu 22.04 LTS

Server/VM specifications:

2 cores, 4GB RAM

Is Apparmor, SELinux or similar active?

no

Virtualization technology:

KVM I think, it's a VPS server

Docker version:

v26.0.0

docker-compose version or docker compose version:

v2.25.0

mailcow version:

2024-02

Reverse proxy:

None

Logs of git diff:

N/A

Logs of iptables -L -vn:

N/A

Logs of ip6tables -L -vn:

N/A

Logs of iptables -L -vn -t nat:

N/A

Logs of ip6tables -L -vn -t nat:

N/A

DNS check:

104.18.32.7
172.64.155.249

@dogsbody
Author

Any update on this please? People on Docker v26 will have no RBL functionality on their server until we find a fix. Thank you

@mbu147

mbu147 commented Mar 27, 2024

@dogsbody where exactly are you seeing these log messages?
I upgraded to Docker v26.0.0 this morning and mailcow runs for several hours without any issues.

@dogsbody-josh

With either of these commands

grep "failed to query external DNS" /var/log/syslog
journalctl --since today | grep "failed to query external DNS"
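For anyone who wants more than a raw grep, here is a minimal Python sketch that splits these dockerd lines into fields and counts failures per query type and error kind. It assumes the `key="value"` layout seen in the log lines posted in this thread; the function names `parse_resolver_error` and `summarize` are made up for illustration and are not part of Docker or mailcow.

```python
import re
from collections import Counter

# Matches key="value" pairs as they appear in the dockerd lines in this thread.
# Escaped characters inside a value (for example \t) are consumed as a pair.
PAIR_RE = re.compile(r'(\w[\w-]*)="((?:[^"\\]|\\.)*)"')


def parse_resolver_error(line):
    """Return the quoted fields of a resolver error line, or None if it is unrelated."""
    if "failed to query external DNS server" not in line:
        return None
    fields = dict(PAIR_RE.findall(line))
    # The question field looks like ';name.\tIN\t TYPE' with a literal backslash-t.
    question = fields.get("question", "")
    parts = question.lstrip(";").replace("\\t", " ").split()
    if len(parts) == 3:
        fields["qname"], fields["qclass"], fields["qtype"] = parts
    return fields


def summarize(lines):
    """Count resolver failures per (query type, error kind)."""
    counts = Counter()
    for line in lines:
        fields = parse_resolver_error(line)
        if fields is None:
            continue
        error = fields.get("error", "")
        if "i/o timeout" in error:
            kind = "timeout"
        elif "connection refused" in error:
            kind = "refused"
        else:
            kind = "other"
        counts[(fields.get("qtype", "?"), kind)] += 1
    return counts
```

The output of either command above can be fed to `summarize` line by line, for example by iterating over `sys.stdin`.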

Here's another line from another Mailcow server from a few minutes ago

Mar 27 09:25:50 mail01 dockerd[355773]: time="2024-03-27T09:25:50.895007331Z" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.253:44732" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.253:44732->172.22.1.254:53: i/o timeout" question=";169.33.6.112.in-addr.arpa.\tIN\t PTR" spanID=ca8733dffb2eeeb8 traceID=2e6f7ee85a991ced1b514afd24216fd9

@mbu147

mbu147 commented Mar 27, 2024

Oh - I only checked the container logs. Thanks!
I can also find these messages in the systemd logs on my system (AlmaLinux 9.3)

@Cisco30

Cisco30 commented Mar 30, 2024

hi, I also see errors

mars 30 13:11:22 Dell-9010 dockerd[536]: time="2024-03-30T13:11:22.951801038+01:00" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.253:49241" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.253:49241->172.22.1.254:53: i/o timeout" question=";254.1.168.192.dnsbl.sorbs.net.\tIN\t A" spanID=74c0fb8ccfe6e7b6 traceID=bf95bdcf3b6cb5edf720eb2781f94ffc
mars 30 13:18:04 Dell-9010 dockerd[536]: time="2024-03-30T13:18:04.983303145+01:00" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.14:60250" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.14:60250->172.22.1.254:53: i/o timeout" question=";_25._tcp.mail.datanetwork.cloud.\tIN\t ANY" spanID=14de718d26666066 traceID=aab42a4edc17280d27b4028b49f42ddd
mars 30 13:18:07 Dell-9010 dockerd[536]: time="2024-03-30T13:18:07.486205113+01:00" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.14:40350" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.14:40350->172.22.1.254:53: i/o timeout" question=";_25._tcp.mail.datanetwork.cloud.\tIN\t ANY" spanID=54772f74dbbea61a traceID=ff8ad12e7e09713747d856a566f57cbd
mars 30 13:18:08 Dell-9010 dockerd[536]: time="2024-03-30T13:18:08.984187656+01:00" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.14:60524" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.14:60524->172.22.1.254:53: i/o timeout" question=";_25._tcp.mail.datanetwork.cloud.\tIN\t ANY" spanID=fd9cc2cde9d88045 traceID=1677e97317a7f6226bff8aed526956c9

@MatthieuLeboeuf
Contributor

MatthieuLeboeuf commented Apr 1, 2024

Hey, I'm seeing the same thing:

Apr 01 12:06:42 mail dockerd[2352174]: time="2024-04-01T12:06:42.544981137+02:00" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.3:50070" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.3:50070->172.22.1.254:53: read: connection refused" question=";current.cvd.clamav.net.\tIN\t TXT" spanID=846ed9332c131ab1 traceID=efeb68ffb83da1ed5362b57509568981
Apr 01 12:06:42 mail dockerd[2352174]: time="2024-04-01T12:06:42.545918098+02:00" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.3:39748" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.3:39748->172.22.1.254:53: read: connection refused" question=";current.cvd.clamav.net.\tIN\t TXT" spanID=cd72e472b9f3cdb5 traceID=f7339a81051c257e4dc10edea21e0037
Apr 01 12:06:42 mail dockerd[2352174]: time="2024-04-01T12:06:42.546740518+02:00" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.3:48636" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.3:48636->172.22.1.254:53: read: connection refused" question=";current.cvd.clamav.net.\tIN\t TXT" spanID=9810963c5c9d81c5 traceID=2b2185cc5c91224124cd0eb2e6467d61

@dogsbody
Author

dogsbody commented Apr 10, 2024

It seems others are having this issue as well :-(

I believe that those of us on Docker v26.0 no longer have DNS RBL protection for our mailcow instances.

@DerLinkman
Member

What do the Postfix logs say? If they contain Spamhaus entries reporting an address as listed as 127.0.0.X, it is working as expected.

@Cisco30

Cisco30 commented Apr 11, 2024

actually I see these entries in the postfix log...


postfix-mailcow-1  | Apr 11 07:18:39 70aa683f8222 postfix/smtps/smtpd[630]: timeout after AUTH from unknown[80.244.11.199]
postfix-mailcow-1  | Apr 11 07:18:39 70aa683f8222 postfix/smtps/smtpd[630]: disconnect from unknown[80.244.11.199] ehlo=1 auth=0/1 rset=1 commands=2/3
postfix-mailcow-1  | Apr 11 07:19:17 70aa683f8222 postfix/postscreen[648]: CONNECT from [80.94.92.112]:59173 to [172.22.1.253]:25
postfix-mailcow-1  | Apr 11 07:19:17 70aa683f8222 whitelist_forwardinghosts: Look up 80.94.92.112 on whitelist, result 200 DUNNO
postfix-mailcow-1  | Apr 11 07:19:17 70aa683f8222 postfix/dnsblog[658]: addr 80.94.92.112 listed by domain bl.mailspike.net as 127.0.0.2
postfix-mailcow-1  | Apr 11 07:19:17 70aa683f8222 postfix/dnsblog[661]: addr 80.94.92.112 listed by domain zen.spamhaus.org as 127.0.0.9
postfix-mailcow-1  | Apr 11 07:19:17 70aa683f8222 postfix/dnsblog[661]: addr 80.94.92.112 listed by domain zen.spamhaus.org as 127.0.0.4
postfix-mailcow-1  | Apr 11 07:19:17 70aa683f8222 postfix/dnsblog[661]: addr 80.94.92.112 listed by domain zen.spamhaus.org as 127.0.0.2
postfix-mailcow-1  | Apr 11 07:19:18 70aa683f8222 postfix/dnsblog[666]: addr 80.94.92.112 listed by domain hostkarma.junkemailfilter.com as 127.0.0.2
postfix-mailcow-1  | Apr 11 07:19:20 70aa683f8222 postfix/postscreen[648]: DNSBL rank 17 for [80.94.92.112]:59173
postfix-mailcow-1  | Apr 11 07:19:20 70aa683f8222 postfix/postscreen[648]: HANGUP after 0.14 from [80.94.92.112]:59173 in tests after SMTP handshake
postfix-mailcow-1  | Apr 11 07:19:20 70aa683f8222 postfix/postscreen[648]: DISCONNECT [80.94.92.112]:59173
postfix-mailcow-1  | Apr 11 07:19:25 70aa683f8222 postfix/dnsblog[664]: warning: dnsblog_query: lookup error for DNS query 112.92.94.80.dnsbl.sorbs.net: Host or domain name not found. Name service error for name=112.92.94.80.dnsbl.sorbs.net type=A: Host not found, try again
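To tell at a glance whether the blocklists still answer, a rough Python sketch like the one below can count `listed by domain` hits per DNSBL and collect the names that ended in a lookup error. The patterns are written against the `postfix/dnsblog` lines shown above and are an assumption about their general shape; `summarize_dnsblog` is a hypothetical helper name.

```python
import re
from collections import Counter

# Example: "postfix/dnsblog[661]: addr 80.94.92.112 listed by domain zen.spamhaus.org as 127.0.0.2"
LISTED_RE = re.compile(r"dnsblog\[\d+\]: addr (\S+) listed by domain (\S+) as (\S+)")
# Example: "warning: dnsblog_query: lookup error for DNS query 112.92.94.80.dnsbl.sorbs.net: ..."
ERROR_RE = re.compile(r"dnsblog_query: lookup error for DNS query (\S+?):")


def summarize_dnsblog(lines):
    """Return (hits per DNSBL domain, lookup errors per queried name)."""
    listed = Counter()
    errors = Counter()
    for line in lines:
        match = LISTED_RE.search(line)
        if match:
            listed[match.group(2)] += 1
            continue
        match = ERROR_RE.search(line)
        if match:
            errors[match.group(1)] += 1
    return listed, errors
```

If `listed` has entries for most blocklists and `errors` only names one or two zones, the DNSBL checks are being answered and the failures are limited to specific upstream zones.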

@DerLinkman
Member

> actually I see these entries in the postfix log...

Yeah that is the expected behaviour.

@mrclschstr

mrclschstr commented Apr 11, 2024

I'm following this issue with interest, but I'm not sure what the status is now. There are error messages in the journal, but the DNS blocklists still work? I'm a bit confused...

I have seen that there is now a v26.0.1 release that has changed something in the DNS resolution again:

Does that change anything?

EDIT: Sorry, I'm just now realizing that the changes only affect ipvlan interfaces. I assume that the v26.0.1 update will not change anything.

@dogsbody
Author

Confirmed. I updated to v26.0.1 last night and got all the same errors overnight :-(

@kilo666mj

kilo666mj commented Apr 12, 2024

Digging into this a little, I noticed that the error happens on the initial connect and not during the DNSBL lookup. It seems that when a reverse (PTR) lookup cannot be completed (which doesn't happen every time), Docker produces the timeout error we are seeing.

If you have verbose logging enabled on unbound, it eventually produces an error like the following:
unbound: [3004587:1] error: SERVFAIL <49.174.15.111.in-addr.arpa. PTR IN>: all servers for this domain failed, at zone 174.15.111.in-addr.arpa. from (inet_ntop_error) upstream server timeout

So, this seems to be an upstream DNS failure that is now being reported differently by docker v26.
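To match a failing `question=` in the Docker log to the client address behind it, it helps to know how the query names are built. The Python sketch below uses the standard `ipaddress` module to produce the reverse (PTR) name and the DNSBL name for an address; `ptr_name` and `dnsbl_name` are hypothetical helper names, and the zone is whatever blocklist you want to check.

```python
import ipaddress


def ptr_name(address):
    """Name queried for a reverse (PTR) lookup of an IPv4 or IPv6 address."""
    return ipaddress.ip_address(address).reverse_pointer


def dnsbl_name(address, zone):
    """Name queried to check an address against a DNSBL zone."""
    pointer = ipaddress.ip_address(address).reverse_pointer
    # Drop the trailing 'in-addr.arpa' or 'ip6.arpa' labels, keeping only
    # the reversed octets (IPv4) or reversed nibbles (IPv6).
    reversed_labels = pointer.rsplit(".", 2)[0]
    return f"{reversed_labels}.{zone}"
```

For example, `dnsbl_name("80.94.92.112", "dnsbl.sorbs.net")` gives the same name that appears in the Postfix lookup error earlier in this thread.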

EDIT:
Hmm, actually I was able to reproduce the error with docker v25. But the error message did change:

Docker v26:
dockerd[4185627]: time="2024-04-12T10:50:57.097091978+02:00" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.14:33743" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.14:33743->172.22.1.254:53: i/o timeout" question=";49.174.15.111.in-addr.arpa.\tIN\t PTR" spanID=43318bd4d322b968 traceID=bd4acefe2bd82f26128ec09812f4a5f2
Docker v25:
dockerd[5204]: time="2024-04-12T11:05:51.129790187+02:00" level=error msg="[resolver] failed to query DNS server: 172.22.1.254:53, query: ;49.174.15.111.in-addr.arpa.\tIN\t PTR" error="read udp 172.22.1.14:52220->172.22.1.254:53: i/o timeout

@mrclschstr

mrclschstr commented Apr 13, 2024

I can confirm the tests of @kilo666mj with Docker v25. What I still wonder: Are the error messages now works-as-designed or is there really a bug here?

@dogsbody
Author

Interestingly, since upgrading from Docker v26.0.0 to v26.0.1 I have also started getting this additional message (logged at level=info rather than error):
dockerd[468953]: time="2024-04-16T02:55:02.478443307Z" level=info msg="No non-localhost DNS nameservers are left in resolv.conf. Using default external servers"
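To see whether a host is in the situation that message describes, a small Python sketch can split the `nameserver` entries of a resolv.conf into loopback and non-loopback addresses. This is only an illustration of the check, not Docker's own logic, and it assumes that "localhost" in the message means loopback addresses such as the 127.0.0.53 mentioned in the release note; `split_nameservers` is a made-up name.

```python
import ipaddress


def split_nameservers(resolv_conf_text):
    """Split 'nameserver' entries into (loopback, non_loopback) lists."""
    loopback, other = [], []
    for raw in resolv_conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", ";")):
            continue
        parts = line.split()
        if len(parts) < 2 or parts[0] != "nameserver":
            continue
        # Drop an IPv6 zone suffix such as '%eth0' before parsing.
        host = parts[1].split("%", 1)[0]
        try:
            address = ipaddress.ip_address(host)
        except ValueError:
            continue  # not a valid IP address, ignore the entry
        (loopback if address.is_loopback else other).append(parts[1])
    return loopback, other
```

Reading `/etc/resolv.conf` on the host and passing its text to `split_nameservers` shows whether any non-loopback nameserver is configured; an empty second list would be consistent with the message above.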

@dogsbody
Author

We have come to the conclusion that nothing is actually broken. Docker is now just more verbose about DNS queries that don't resolve (NXDOMAIN).

We have tested from both inside and outside the Docker containers, and DNS lookups work fine. Only lookups that result in an NXDOMAIN produce the log entry.
