Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

don't close established connections on Listener.Close, when using a Transport #4072

Merged
merged 2 commits into from Oct 27, 2023

Conversation

marten-seemann
Copy link
Member

@marten-seemann marten-seemann commented Sep 6, 2023

Part of #3962.

When Listener.Close is called, now:

  • Accept will return ErrServerClosed as soon as all connections in the accept queue have been accepted, and
  • QUIC handshakes that are still in flight will be rejected with a CONNECTION_REFUSED error.

Closing the listener doesn't have any effect on already established connections anmore.

@marten-seemann marten-seemann force-pushed the server-close branch 2 times, most recently from 3248e37 to ccff7b9 Compare September 25, 2023 11:25
@marten-seemann marten-seemann force-pushed the server-close branch 2 times, most recently from ac87a49 to 2382f12 Compare October 21, 2023 07:48
@marten-seemann marten-seemann marked this pull request as ready for review October 21, 2023 07:49
@marten-seemann marten-seemann force-pushed the server-close branch 3 times, most recently from 503a84c to a550063 Compare October 21, 2023 08:20
@codecov
Copy link

codecov bot commented Oct 21, 2023

Codecov Report

Merging #4072 (8f0b452) into master (a263164) will increase coverage by 0.02%.
Report is 10 commits behind head on master.
The diff coverage is 100.00%.

❗ Current head 8f0b452 differs from pull request most recent head 5f2f371. Consider uploading reports for the commit 5f2f371 to get more accurate results

@@            Coverage Diff             @@
##           master    #4072      +/-   ##
==========================================
+ Coverage   83.71%   83.73%   +0.02%     
==========================================
  Files         150      150              
  Lines       15415    15397      -18     
==========================================
- Hits        12904    12892      -12     
+ Misses       2012     2008       -4     
+ Partials      499      497       -2     
Files Coverage Δ
packet_handler_map.go 83.23% <ø> (-1.30%) ⬇️
server.go 76.99% <100.00%> (+0.37%) ⬆️
transport.go 72.35% <100.00%> (-0.09%) ⬇️

... and 1 file with indirect coverage changes

@marten-seemann marten-seemann force-pushed the server-close branch 2 times, most recently from 4eba196 to 402234a Compare October 21, 2023 09:04
server.go Show resolved Hide resolved
transport.go Show resolved Hide resolved
transport.go Show resolved Hide resolved
@marten-seemann marten-seemann changed the title don't close established connections on Listener.Close don't close established connections on Listener.Close, when using a Transport Oct 27, 2023
@marten-seemann marten-seemann merged commit dda63b9 into master Oct 27, 2023
32 checks passed
@marten-seemann marten-seemann deleted the server-close branch October 27, 2023 06:11
bassosimone added a commit to ooni/probe-cli that referenced this pull request Dec 13, 2023
Upgrade to quic-go@v0.40.1 and adapt closing policy after [changes in
quic-go@v0.40.0](https://github.com/quic-go/quic-go/releases/tag/v0.40.0)
and quic-go/quic-go#4072 in particular. We're
creating an UDP conn and [passing it to
server.Serve](https://github.com/ooni/probe-cli/pull/1428/files#diff-efda3daa51e9aed0b3444a327e64b7e5c412938a1fe894a3c850d533179c2425R105),
which [calls
serveConn](https://github.com/quic-go/quic-go/blob/v0.40.1/http3/server.go#L242),
which [calls
quicListen](https://github.com/quic-go/quic-go/blob/v0.40.1/http3/server.go#L316)
and then
[ServeListener](https://github.com/quic-go/quic-go/blob/v0.40.1/http3/server.go#L321).
In turn, ServeListener [is interrupted by
ErrServerClosed](https://github.com/quic-go/quic-go/blob/v0.40.1/http3/server.go#L268),
which seems to be generated by [server.Close calling Close for each
listener](https://github.com/quic-go/quic-go/blob/v0.40.1/http3/server.go#L657).
The following is what happened before updating the shutdown protocol:

```
goroutine 247 [sync.Mutex.Lock]:

[...]

github.com/quic-go/quic-go.(*Transport).closeServer(0x1400054a000?)
	/Users/sbs/go/pkg/mod/github.com/quic-go/quic-go@v0.40.0/transport.go:298 +0x90

github.com/quic-go/quic-go.(*baseServer).close.func1()
	/Users/sbs/go/pkg/mod/github.com/quic-go/quic-go@v0.40.0/server.go:344 +0x84

[...]

github.com/quic-go/quic-go.(*baseServer).close(0x14000cc1c10?, {0x102faa7a0?, 0x140002557b0?}, 0xec?)
	/Users/sbs/go/pkg/mod/github.com/quic-go/quic-go@v0.40.0/server.go:338 +0x64

github.com/quic-go/quic-go.(*baseServer).Close(...)
	/Users/sbs/go/pkg/mod/github.com/quic-go/quic-go@v0.40.0/server.go:333

github.com/quic-go/quic-go.(*EarlyListener).Close(0x14000120700?)
	/Users/sbs/go/pkg/mod/github.com/quic-go/quic-go@v0.40.0/server.go:165 +0x34

github.com/quic-go/quic-go/http3.(*Server).Close(0x140007344d0)
	/Users/sbs/go/pkg/mod/github.com/quic-go/quic-go@v0.40.0/http3/server.go:657 +0xe8

github.com/ooni/probe-cli/v3/internal/netemx.(*http3Server).Close(0x140001345a0)
	/Users/sbs/src/github.com/ooni/probe-cli/internal/netemx/http3.go:66 +0xe4

github.com/ooni/probe-cli/v3/internal/netemx.(*QAEnv).Close.func1()
	/Users/sbs/src/github.com/ooni/probe-cli/internal/netemx/qaenv.go:291 +0x4c

[...]

github.com/ooni/probe-cli/v3/internal/netemx.(*QAEnv).Close(0x140001e5570?)
	/Users/sbs/src/github.com/ooni/probe-cli/internal/netemx/qaenv.go:288 +0x48

goroutine 136 [sync.Mutex.Lock]:

[...]

github.com/quic-go/quic-go.(*baseServer).close(0x14000127200?, {0x102faa7a0?, 0x140001920f0?}, 0xd8?)
	/Users/sbs/go/pkg/mod/github.com/quic-go/quic-go@v0.40.0/server.go:338 +0x64

github.com/quic-go/quic-go.(*Transport).close(0x1400054a000, {0x102faa7a0, 0x140001920f0})
	/Users/sbs/go/pkg/mod/github.com/quic-go/quic-go@v0.40.0/transport.go:325 +0x110

github.com/quic-go/quic-go.(*Transport).listen(0x1400054a000, {0x102fbd3a8, 0x14000526108})
	/Users/sbs/go/pkg/mod/github.com/quic-go/quic-go@v0.40.0/transport.go:358 +0x364

created by github.com/quic-go/quic-go.(*Transport).init.func1
	/Users/sbs/go/pkg/mod/github.com/quic-go/quic-go@v0.40.0/transport.go:242 +0x45c
```

Looking at the above traces, both end up at server.go:338 locking the
same sync.Once. It seems the structures are such that attempting to
close both the server and the listener leads to self locking in
v0.40.{0,1}. The original expectation was that the server closed the
connection used for listening anyway, and
ooni/probe#2527 documented how that was not
the case. It seems that now this is the case, so we can comment out the
original ooni/probe#2527 fix without any test
hangs. Also, if the original bug was indeed that the server did not own
the listener, and considering that now it seems the server owns the
listener, it makes sense that the fix for v0.40.1 is to revert the
ooni/probe#2527 fix.

See ooni/probe#2556
Murphy-OrangeMud pushed a commit to Murphy-OrangeMud/probe-cli that referenced this pull request Feb 13, 2024
Upgrade to quic-go@v0.40.1 and adapt closing policy after [changes in
quic-go@v0.40.0](https://github.com/quic-go/quic-go/releases/tag/v0.40.0)
and quic-go/quic-go#4072 in particular. We're
creating an UDP conn and [passing it to
server.Serve](https://github.com/ooni/probe-cli/pull/1428/files#diff-efda3daa51e9aed0b3444a327e64b7e5c412938a1fe894a3c850d533179c2425R105),
which [calls
serveConn](https://github.com/quic-go/quic-go/blob/v0.40.1/http3/server.go#L242),
which [calls
quicListen](https://github.com/quic-go/quic-go/blob/v0.40.1/http3/server.go#L316)
and then
[ServeListener](https://github.com/quic-go/quic-go/blob/v0.40.1/http3/server.go#L321).
In turn, ServeListener [is interrupted by
ErrServerClosed](https://github.com/quic-go/quic-go/blob/v0.40.1/http3/server.go#L268),
which seems to be generated by [server.Close calling Close for each
listener](https://github.com/quic-go/quic-go/blob/v0.40.1/http3/server.go#L657).
The following is what happened before updating the shutdown protocol:

```
goroutine 247 [sync.Mutex.Lock]:

[...]

github.com/quic-go/quic-go.(*Transport).closeServer(0x1400054a000?)
	/Users/sbs/go/pkg/mod/github.com/quic-go/quic-go@v0.40.0/transport.go:298 +0x90

github.com/quic-go/quic-go.(*baseServer).close.func1()
	/Users/sbs/go/pkg/mod/github.com/quic-go/quic-go@v0.40.0/server.go:344 +0x84

[...]

github.com/quic-go/quic-go.(*baseServer).close(0x14000cc1c10?, {0x102faa7a0?, 0x140002557b0?}, 0xec?)
	/Users/sbs/go/pkg/mod/github.com/quic-go/quic-go@v0.40.0/server.go:338 +0x64

github.com/quic-go/quic-go.(*baseServer).Close(...)
	/Users/sbs/go/pkg/mod/github.com/quic-go/quic-go@v0.40.0/server.go:333

github.com/quic-go/quic-go.(*EarlyListener).Close(0x14000120700?)
	/Users/sbs/go/pkg/mod/github.com/quic-go/quic-go@v0.40.0/server.go:165 +0x34

github.com/quic-go/quic-go/http3.(*Server).Close(0x140007344d0)
	/Users/sbs/go/pkg/mod/github.com/quic-go/quic-go@v0.40.0/http3/server.go:657 +0xe8

github.com/ooni/probe-cli/v3/internal/netemx.(*http3Server).Close(0x140001345a0)
	/Users/sbs/src/github.com/ooni/probe-cli/internal/netemx/http3.go:66 +0xe4

github.com/ooni/probe-cli/v3/internal/netemx.(*QAEnv).Close.func1()
	/Users/sbs/src/github.com/ooni/probe-cli/internal/netemx/qaenv.go:291 +0x4c

[...]

github.com/ooni/probe-cli/v3/internal/netemx.(*QAEnv).Close(0x140001e5570?)
	/Users/sbs/src/github.com/ooni/probe-cli/internal/netemx/qaenv.go:288 +0x48

goroutine 136 [sync.Mutex.Lock]:

[...]

github.com/quic-go/quic-go.(*baseServer).close(0x14000127200?, {0x102faa7a0?, 0x140001920f0?}, 0xd8?)
	/Users/sbs/go/pkg/mod/github.com/quic-go/quic-go@v0.40.0/server.go:338 +0x64

github.com/quic-go/quic-go.(*Transport).close(0x1400054a000, {0x102faa7a0, 0x140001920f0})
	/Users/sbs/go/pkg/mod/github.com/quic-go/quic-go@v0.40.0/transport.go:325 +0x110

github.com/quic-go/quic-go.(*Transport).listen(0x1400054a000, {0x102fbd3a8, 0x14000526108})
	/Users/sbs/go/pkg/mod/github.com/quic-go/quic-go@v0.40.0/transport.go:358 +0x364

created by github.com/quic-go/quic-go.(*Transport).init.func1
	/Users/sbs/go/pkg/mod/github.com/quic-go/quic-go@v0.40.0/transport.go:242 +0x45c
```

Looking at the above traces, both end up at server.go:338 locking the
same sync.Once. It seems the structures are such that attempting to
close both the server and the listener leads to self locking in
v0.40.{0,1}. The original expectation was that the server closed the
connection used for listening anyway, and
ooni/probe#2527 documented how that was not
the case. It seems that now this is the case, so we can comment out the
original ooni/probe#2527 fix without any test
hangs. Also, if the original bug was indeed that the server did not own
the listener, and considering that now it seems the server owns the
listener, it makes sense that the fix for v0.40.1 is to revert the
ooni/probe#2527 fix.

See ooni/probe#2556
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants