Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

h3 URLs no longer work #6335

Closed
4 tasks done
timkgh opened this issue Oct 20, 2023 · 25 comments
Closed
4 tasks done

h3 URLs no longer work #6335

timkgh opened this issue Oct 20, 2023 · 25 comments
Assignees
Labels
bug compatibility external libs Issues that require changes in external libraries. P3: Medium
Milestone

Comments

@timkgh
Copy link

timkgh commented Oct 20, 2023

Prerequisites

Platform (OS and CPU architecture)

Linux, mips64_softfloat

Installation

GitHub releases or script from README

Setup

On a router, DHCP is handled by the router

AdGuard Home version

v0.108.0-b.48

Action

h3 URLs no longer work and there is no indication in the logs they are broken.

Config:

h3://1.1.1.2/dns-query
[/example.com/]h3://8.8.8.8/dns-query

Errors:

image

Expected result

h3 URLs should continue to work or be updated to https and please notify the user, log an error message, etc.

Not sure how long this has been broken for.

Actual result

h3 URLs do not work.

Additional information and/or screenshots

Please log error messages. I use a mix of DNS53 and h3 URLs and only discovered in the UI by accident that h3 doesn't work anymore. Not sure when it stopped working.

@ainar-g
Copy link
Contributor

ainar-g commented Oct 23, 2023

We cannot reproduce this. What do the verbose logs show when you test the upstreams?

@ainar-g ainar-g added the waiting for data Waiting for users to provide more data. label Oct 23, 2023
@timkgh
Copy link
Author

timkgh commented Oct 23, 2023

Not sure how to read the error, does it try to bind to IPv6? I do not use IPv6.

INTERNAL_ERROR (local): write udp [::]:55541->8.8.8.8:443: sendmsg: invalid argument

Oct 23 08:58:10 router AdGuardHome[24249]: 2023/10/23 08:58:10.070243 24249#1825 [debug] bootstrap: dialing 8.8.8.8:443 (1/1)
Oct 23 08:58:10 router AdGuardHome[24249]: 2023/10/23 08:58:10.070910 24249#1825 [debug] bootstrap: connection to 8.8.8.8:443 succeeded in 205.154µs
Oct 23 08:59:11 router AdGuardHome[24249]: 2023/10/23 08:59:11.196198 24249#2029 [debug] dnsforward: checking if upstream "h3://8.8.8.8/dns-query" works
Oct 23 08:59:11 router AdGuardHome[24249]: 2023/10/23 08:59:11.197360 24249#2029 [debug] bootstrap: dialing 8.8.8.8:443 (1/1)
Oct 23 08:59:11 router AdGuardHome[24249]: 2023/10/23 08:59:11.198234 24249#2029 [debug] bootstrap: connection to 8.8.8.8:443 succeeded in 209.845µs
Oct 23 08:59:11 router AdGuardHome[24249]: 2023/10/23 08:59:11.199118 24249#2029 [debug] https://8.8.8.8:443/dns-query: sending request over udp: A test.
Oct 23 08:59:11 router AdGuardHome[24249]: 2023/10/23 08:59:11.290027 24249#2029 [debug] https://8.8.8.8:443/dns-query: response received over udp: requesting https://8.8.8.8:443/dns-query: Get_0rtt "https://8.8.8.8:443/dns-query?dns=AAABAAABAAAAAAAABHRlc3QAAAEAAQ": INTERNAL_ERROR (local): write udp [::]:46828->8.8.8.8:443: sendmsg: invalid argument
Oct 23 08:59:11 router AdGuardHome[24249]: 2023/10/23 08:59:11.290407 24249#2029 [debug] re-creating the http client due to requesting https://8.8.8.8:443/dns-query: Get_0rtt "https://8.8.8.8:443/dns-query?dns=AAABAAABAAAAAAAABHRlc3QAAAEAAQ": INTERNAL_ERROR (local): write udp [::]:46828->8.8.8.8:443: sendmsg: invalid argument
Oct 23 08:59:11 router AdGuardHome[24249]: 2023/10/23 08:59:11.290839 24249#2029 [debug] bootstrap: dialing 8.8.8.8:443 (1/1)
Oct 23 08:59:11 router AdGuardHome[24249]: 2023/10/23 08:59:11.291495 24249#2029 [debug] bootstrap: connection to 8.8.8.8:443 succeeded in 193.51µs

@timkgh
Copy link
Author

timkgh commented Oct 23, 2023

If I use https URLs instead and set use_http3_upstreams: true I don't get any errors, but I don't think HTTP/3 works either.

using HTTP/2 for this upstream: HTTP3 support is not enabled

Oct 23 09:13:03 router AdGuardHome[25072]: 2023/10/23 09:13:03.894779 25072#234 [debug] creating a new http client
Oct 23 09:13:03 router AdGuardHome[25072]: 2023/10/23 09:13:03.895258 25072#234 [debug] using HTTP/2 for this upstream: HTTP3 support is not enabled
Oct 23 09:13:03 router AdGuardHome[25072]: 2023/10/23 09:13:03.895863 25072#234 [debug] https://1.1.1.2:443/dns-query: sending request over tcp: A test.

Something seems to be broken with HTTP/3.

@ainar-g
Copy link
Contributor

ainar-g commented Oct 23, 2023

sendmsg: invalid argument looks like the same error as in #6301. Unless you're using an older version on FreeBSD, this is rather weird, but could also be an issue with quic-go on Linux/MIPS. Does rolling back to v0.107.39 fix the QUIC issue?

@timkgh
Copy link
Author

timkgh commented Oct 23, 2023

I'm using v0.108.0-b.48 on mips64_softfloat (EdgeOS, Edgerouter ER-12)
Which 108 version should I try?

@ainar-g
Copy link
Contributor

ainar-g commented Oct 23, 2023

b.46 should contain an older version of quic-go.

Before you do that, you can also try launching AGH with QUIC_GO_DISABLE_ECN=true. It seems like ECN support is the feature that likely broke the FreeBSD one last time.

@timkgh
Copy link
Author

timkgh commented Oct 23, 2023

I'll try it. Any explanation why there are no errors when using https with use_http3_upstreams: true? Isn't that equivalent to using h3?

@ainar-g
Copy link
Contributor

ainar-g commented Oct 23, 2023

Isn't that equivalent to using h3?

Not really. h3 is a way to force HTTP/3 immediately, while use_http3_upstreams merely allows https upstreams to upgrade to HTTP/3 when it proves faster.

@timkgh
Copy link
Author

timkgh commented Oct 23, 2023

Setting QUIC_GO_DISABLE_ECN=true makes the errors go away in b.48

@ainar-g
Copy link
Contributor

ainar-g commented Oct 23, 2023

Thanks for testing!

@marten-seemann, seems like some assumptions about the ECN control messages on Linux may be wrong as well. Perhaps it's because of sizeof(int), or maybe not all Linux kernels support it? Is there anything that can be fixed/patched on quic-go side?

@ainar-g ainar-g added bug P3: Medium compatibility external libs Issues that require changes in external libraries. and removed waiting for data Waiting for users to provide more data. labels Oct 23, 2023
@ainar-g ainar-g self-assigned this Oct 23, 2023
@ainar-g ainar-g added this to the v0.107.41 milestone Oct 23, 2023
@marten-seemann
Copy link

Any chance you could try out what the right size for the control message would be on that system?

@ainar-g
Copy link
Contributor

ainar-g commented Oct 23, 2023

@marten-seemann, assuming that it is really a size problem, I could probably make a test program with various sizes a bit later. Another issue that could cause this is endianness, I think.

@ainar-g
Copy link
Contributor

ainar-g commented Oct 23, 2023

@marten-seemann, just to clarify, where did the number 4 come from? I've been looking through documentation, and it seems like an IP_TOS message on Linux is supposed to be one byte. See man 7 ip, which says TOS is a byte. as well as this package used by Tailscale.

@timkgh
Copy link
Author

timkgh commented Oct 23, 2023

FWIW, my router runs an old kernel (EdgeOS v2.0.9-hotfix.7 on Edgerouter ER-12):
Linux router 4.9.79-UBNT #1 SMP Thu Jun 15 11:34:36 UTC 2023 mips64 GNU/Linux
The AGH arch that I install is mips64_softfloat

@marten-seemann
Copy link

@marten-seemann, just to clarify, where did the number 4 come from? I've been looking through documentation, and it seems like an IP_TOS message on Linux is supposed to be one byte. See man 7 ip, which says TOS is a byte. as well as this package used by Tailscale.

I tested it using https://gist.github.com/marten-seemann/31f20a7da8e6ad69c950af93f6d9a55b. Looks like on Linux it works with both 1 and 4 (Ubuntu VM, arm64). I also tested what happens if I serialize the ECN cmsg first, and then the GSO cmsg (to see if it's because the remainder is cut off), and it still works. Can you make sense of this?

@ainar-g
Copy link
Contributor

ainar-g commented Oct 24, 2023

@marten-seemann, that C code gives me sendmsg: Invalid argument if I use 4 instead of the computed 1, but not always. Try:

for i in $( seq 100 ) ; do printf "${i}: $( ./ecn.tmp.bin )\n" ; done

For me it's approximately every fourth or fifth call.

That isn't the case when 1 is used.

@marten-seemann
Copy link

Interesting. It works on my machine. Presumably some platform-dependence?

I created quic-go/quic-go#4127, would you mind trying it out?

adguard pushed a commit that referenced this issue Oct 24, 2023
Updates #6335.

Squashed commit of the following:

commit b76b558
Merge: dda2c9f e3cc3b0
Author: Ainar Garipov <A.Garipov@AdGuard.COM>
Date:   Tue Oct 24 19:43:03 2023 +0300

    Merge branch 'master' into 6335-try-ecn-fix

commit dda2c9f
Author: Ainar Garipov <A.Garipov@AdGuard.COM>
Date:   Tue Oct 24 19:32:32 2023 +0300

    all: upd quic-go
@ainar-g
Copy link
Contributor

ainar-g commented Oct 24, 2023

@timkgh, can you try v0.108.0-a.739+f6e34ade on the Edge channel without the env? It contains the patch from @marten-seemann.

@timkgh
Copy link
Author

timkgh commented Oct 24, 2023

can you try v0.108.0-a.739+f6e34ade on the Edge channel without the env

How do I download the executable or tar? I don't use the updater, I manually put it on the router.

@ainar-g
Copy link
Contributor

ainar-g commented Oct 24, 2023

@timkgh, the links for all platforms and release channels are on our Wiki. (Make sure that you choose the correct MIPS variant.)

@timkgh
Copy link
Author

timkgh commented Oct 24, 2023

v0.108.0-a.739+f6e34ade works! Thank you.

@ainar-g
Copy link
Contributor

ainar-g commented Oct 24, 2023

@marten-seemann, seems like the fix works. Can we expect v0.39.2 soon?

@marten-seemann
Copy link

Done: https://github.com/quic-go/quic-go/releases/tag/v0.39.2

@ainar-g
Copy link
Contributor

ainar-g commented Oct 26, 2023

@timkgh, the final fix should be in the latest Edge releases.

@ainar-g ainar-g closed this as completed Oct 31, 2023
@ainar-g ainar-g modified the milestones: v0.107.42, v0.107.41 Nov 13, 2023
annguyen0 pushed a commit to annguyen0/AdGuardHome that referenced this issue Nov 27, 2023
Updates AdguardTeam#6335.

Squashed commit of the following:

commit b76b558
Merge: dda2c9f e3cc3b0
Author: Ainar Garipov <A.Garipov@AdGuard.COM>
Date:   Tue Oct 24 19:43:03 2023 +0300

    Merge branch 'master' into 6335-try-ecn-fix

commit dda2c9f
Author: Ainar Garipov <A.Garipov@AdGuard.COM>
Date:   Tue Oct 24 19:32:32 2023 +0300

    all: upd quic-go
@marten-seemann
Copy link

It looks like this change is causing problems on older Linux kernel versions, see #6422 (comment).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug compatibility external libs Issues that require changes in external libraries. P3: Medium
Projects
None yet
Development

No branches or pull requests

4 participants
@marten-seemann @ainar-g @timkgh and others