linkcheck builder: begin using requests.Session functionality during linkchecking #11324

Closed
jayaddison opened this issue Apr 14, 2023 · 24 comments · Fixed by #11402 or #11503

@jayaddison
Contributor

Is your feature request related to a problem? Please describe.
At the moment, the linkcheck builder performs individual requests.get (or similar HTTP request method) calls during linkchecking, without any explicit connection or session pooling.

This may be inefficient, because it seems likely that for many use cases, linkchecking will make multiple requests to the same host (because documentation references are likely to have host-locality).

Describe the solution you'd like
Confirmation that connection pooling is not currently in use would be a good starting point; in other words: we should confirm that linkchecking of multiple URLs on a single host results in multiple TCP connections. Ideally this should be written as a test case.

If we can confirm that the problem exists, then we may be able to use some of the Session object functionality from the requests library that's already in use here to enable connection pooling.
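
For illustration, a minimal sketch of the direction this suggests (not the existing linkcheck code; the URLs are placeholders):

# Illustrative sketch only: a shared requests.Session lets urllib3 pool
# keep-alive connections per host, so multiple checks against the same host
# can reuse one TCP connection instead of opening a fresh one each time.
import requests

uris = ["https://example.org/page-a", "https://example.org/page-b"]

with requests.Session() as session:
    for uri in uris:
        # subsequent requests to example.org can reuse the pooled connection
        response = session.get(uri, timeout=5)
        print(uri, response.status_code)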

Describe alternatives you've considered
None so far, although I'm open to suggestions (and to improvements on the definition of this feature request).

Additional context

@jayaddison
Contributor Author

Confirmation that connection pooling is not currently in use would be a good starting point; in other words: we should confirm that linkchecking of multiple URLs on a single host results in multiple TCP connections. Ideally this should be written as a test case.

I think that the mocket socket-mocking library could be a very nice match for this testing use case.

I'm experimenting with it locally at the moment; initially it seems to be good at mocking the client-side sockets, but when mocking the communication between the linkcheck (client) and test HTTP server, something isn't working as expected.

cc @mindflayer (I'll likely open an issue report or two for mocket when I can figure out what is going on here)

@mindflayer

Remember that using mocket means mocking socket-based modules, and it has to be done in the same thread. What you can do, and what works best for many testing use cases, is to use its recording mode. You'll end up with a bunch of fixtures that you can add to your repository.
See https://github.com/mindflayer/python-mocket#example-of-how-to-record-real-socket-traffic.
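
Based on the README example linked above, a recording-mode test might look roughly like this (the target URL and fixture directory are illustrative):

# Rough sketch adapted from the mocket README's recording example. On the first
# run, real socket traffic is recorded into JSON fixtures under
# truesocket_recording_dir; later runs replay those fixtures without the network.
import os

import requests
from mocket import mocketize


@mocketize(truesocket_recording_dir=os.path.dirname(os.path.abspath(__file__)))
def test_recorded_request():
    response = requests.get("https://httpbin.org/ip")
    assert response.status_code == 200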

@jayaddison
Contributor Author

Thanks! The test HTTP server does run on a separate thread, so that's likely the problem. The recording feature looks perfect to determine whether the use of session-based requests has the intended effect.

@mindflayer

mindflayer commented Apr 16, 2023

The test HTTP server does run on a separate thread, so that's likely the problem.

That's exactly why I mentioned that. Let me know if you need more support; I'm always happy to help.

@jayaddison
Contributor Author

Thanks @mindflayer - I'd be grateful for your help.

I'd been planning to try switching to aiohttp as the test HTTP server, with the aim of handling HTTP server-and-client interaction within a single thread (and simplifying socket mocking).

Does that sound like a sensible approach, and/or do you have other suggestions?

@mindflayer

You won't need to reach the actual server when introducing mocket, but aiohttp should work effectively as a client.

@jayaddison
Contributor Author

(a shorter summary of a longer draft message) Are you suggesting that instantiating a server is not necessary during these tests?

@mindflayer

Correct; mocket will be responsible for fooling the client and providing the response you registered.

@jayaddison
Contributor Author

Ok, thanks. If possible I would like to retain the existing client/server communication during the tests, so I'm weighing up the options available.

I ran into problems using the recording mode of mocket with the tests as they exist today: it seems that the server's use of sockets is interfered with once mocket is enabled (I can see the listen, bind, and accept calls arriving on MocketSocket, but clearly they are not going to behave correctly for a server).

During that same attempt, there was also a problem with a call to getsockname before the self._address attribute had been assigned; I've temporarily updated this line locally to match this one (and I've read some discussion about this on your issue tracker, although I'm not yet sure whether to report this, particularly if the server-side use case is not supported).

@mindflayer

I'm not sure why you would want to use mocket while maintaining a client/server setup.
The main reason for using it is to decouple tests from having to talk to a "real" server (i.e. mocking). Some mocking libraries do that by providing a fake client; mocket lets you use the original client you'd use in production, and it achieves this by monkey-patching the socket module.

@jayaddison
Contributor Author

jayaddison commented Apr 17, 2023

re: socket mocks

Ok, it seems I didn't match the use-case for mocket correctly here, apologies for the distraction.

@mindflayer: I have some ideas about how mocket might be able to support recording mode when both the client and the server are within the same process (as here). Is that worth reporting as a feature request, or is it too obscure a use case? I could do some more development testing locally before creating that, to check whether the ideas are valid.

re: connection pool tracing

I have been able to (hackily) confirm the expected result locally by collecting a set of distinct connections retrieved from urllib3.poolmanager.connection_from_url with and without using requests.Session -- as expected (and assertable), the connection count is lower with sessions.

This did require use of (upgrade to) an HTTP/1.1 protocol server during testing -- HTTP/1.0 doesn't keep connections open by default, so the connection count does not change under HTTP/1.0.
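
For illustration, a sketch of one way to observe that difference (hypothetical instrumentation, not the exact code used; it compares the distinct urllib3 connection pools that requests selects, rather than counting raw TCP sockets):

# Hypothetical sketch: count the distinct urllib3 connection pools used for a
# set of same-host URLs, with and without a shared requests.Session.
import requests

urls = ["http://localhost:7777/one", "http://localhost:7777/two"]

# Without a shared session: each request builds its own adapters/pool manager,
# which is effectively what a bare requests.get() call does internally.
pools_without = set()
for url in urls:
    with requests.Session() as session:
        pools_without.add(session.get_adapter(url).poolmanager.connection_from_url(url))

# With one shared session, URLs on the same host map to the same pool.
pools_with = set()
with requests.Session() as session:
    for url in urls:
        pools_with.add(session.get_adapter(url).poolmanager.connection_from_url(url))

assert len(pools_with) < len(pools_without)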

@mindflayer

Is that worth reporting as a feature request, or is it too obscure a use case?

Feel free to open an issue on the mocket side after you've done your evaluation. I'll do my best to understand the context and give you my point of view on the topic.

@picnixz
Member

picnixz commented Apr 17, 2023

For the linkcheck builder, since each check is a task performed by a HyperlinkAvailabilityCheckWorker and there is a fixed and known number of workers at the beginning of the check, one should:

  • order the links by domain before submitting them to the workers. That way, we would process links in "chunks" instead of checking them one by one.
  • each worker has a dedicated session object instance (a sketch of this follows below). That way, the number of sessions existing simultaneously is upper-bounded by the maximum number of workers (5 by default).
  • instead of submitting a single link to a worker, split the full set of links to check into a list of chunks, each chunk representing a domain with some links to check within that specific domain. Depending on the number of links in the domain, a single worker can either process the whole domain, or multiple workers (possibly all of them) can share the burden of processing it. Once the tasks for a domain have been submitted, we move to the next domain and do the same with the remaining available workers.

That said, if there are a lot of domains with only a single link to check in each, the chunk approach is not a good fit and we should fall back to the current per-link behaviour. If there are a lot of links for a single domain, this would reduce the number of sessions being created.
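
A minimal sketch of the per-worker-session part of this idea (the class and queue handling are illustrative, not Sphinx's actual HyperlinkAvailabilityCheckWorker implementation):

# Illustrative sketch of "one session per worker": at most num_workers sessions
# exist at once, and connections to a host are reused across the checks that a
# given worker performs.
import queue
import threading

import requests


class LinkCheckWorker(threading.Thread):
    def __init__(self, wqueue: queue.Queue) -> None:
        super().__init__(daemon=True)
        self.wqueue = wqueue
        self.session = requests.Session()  # dedicated session for this worker

    def run(self) -> None:
        while (uri := self.wqueue.get()) is not None:  # None is a shutdown sentinel
            try:
                response = self.session.head(uri, timeout=5, allow_redirects=True)
                print(uri, response.status_code)
            except requests.RequestException as exc:
                print(uri, "broken:", exc)
        self.session.close()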

@jayaddison
Contributor Author

I have a branch in progress at https://github.com/jayaddison/sphinx/tree/issue-11324/linkcheck-sessioned-requests that:

  • Updates the test HTTP servers to HTTP/1.1 (enabling connection keepalive, and -- I expect -- more closely resembling the webservers that most Sphinx application instances linkcheck)
  • Begins using requests.Session functionality
  • Removes the linkcheck_timeout configuration (see notes below)

My changes seem naive and don't function particularly well when the tests run.

In particular: without adjusting the timeout, many of the tests fail due to timeout errors that appear in the report results:

(           links: line    3) broken    http://localhost:7777/ - HTTPConnectionPool(host='localhost', port=7777): Read timed out. (read timeout=0.05)

With the timeouts removed, most of the tests succeed, but they run extremely slowly (20 seconds or so per unit test).

Running strace on the pytest process shows repeated calls to futex during the waiting time:

futex(0xa5b8d0, FUTEX_WAKE_PRIVATE, 1)  = 1
futex(0xa5b8d8, FUTEX_WAKE_PRIVATE, 1)  = 1
newfstatat(AT_FDCWD, "/tmp/pytest-of-jka/pytest-5/linkcheck/links.rst", {st_mode=S_IFREG|0644, st_size=648, ...}, 0) = 0
getpid()                                = 5395
getpid()                                = 5395
futex(0xa5b8d4, FUTEX_WAKE_PRIVATE, 1)  = 1
futex(0xa5b8d8, FUTEX_WAKE_PRIVATE, 1)  = 1
newfstatat(AT_FDCWD, "/tmp/pytest-of-jka/pytest-5/linkcheck/links.rst", {st_mode=S_IFREG|0644, st_size=648, ...}, 0) = 0

Socket-level networking and syscall tracing aren't areas I have much expertise in, but strace often helps me to get a sense of where a userspace program is running into problems.

It's interesting to me that the tests do eventually pass -- that seems to indicate that both consumer (client) and producer (server) threads are co-operating; my guesses are:

  • that there are some extremely inefficient tight loops along the way that near-block the threads until one or both of them receive enough CPU time to process the relevant events
    and/or
  • that the test server's use of a limited number of sockets (todo: how many?) results in resource contention

Perhaps there's something unusual going on in the socket block/timeout configuration, and/or polling.

It feels like it should be possible to get these two threads to co-operate without these performance penalties, and then to restore the timeout configuration to ensure that the tests run quickly (in terms of elapsed time).

@jayaddison
Contributor Author

I have a branch in progress at https://github.com/jayaddison/sphinx/tree/issue-11324/linkcheck-sessioned-requests that:
...
With the timeouts removed, most of the tests succeed, but they run extremely slowly (20 seconds or so per unit test).

An additional finding: sending the Connection: close header as part of each response from the test HTTP/1.1 server(s) seems to achieve a large speedup in the unit test duration.
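
For context, roughly what that change looks like with a stdlib-style test server (an illustrative sketch, not the actual Sphinx test fixture):

# Illustrative sketch: an HTTP/1.1 handler that sends "Connection: close" on
# every response; http.server honours the header and tears down the connection
# after each request instead of keeping it alive.
from http.server import BaseHTTPRequestHandler, HTTPServer


class ClosingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.send_header("Connection", "close")  # marks the connection for closure
        self.end_headers()
        self.wfile.write(body)


with HTTPServer(("localhost", 7777), ClosingHandler) as server:
    server.handle_request()  # serve a single request, then exit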

@jayaddison
Contributor Author

Could it be that the HTTP requests opened in streaming mode (stream=True) are holding a socket open until timeout? An HTTP/1.1 server would stop sending data, but does not close the connection unless requested by the client (and, to achieve the overall network traffic goals here, we preferably don't want to do that).

@jayaddison
Contributor Author

Could it be that the HTTP requests opened in streaming mode (stream=True) are holding a socket open until timeout? An HTTP/1.1 server would stop sending data, but does not close the connection unless requested by the client (and, to achieve the overall network traffic goals here, we preferably don't want to do that).

Disabling streaming requests from the HTTP client-side here doesn't appear to affect the duration for completion of the tests.

@picnixz
Member

picnixz commented Apr 18, 2023

Could it be that the HTTP requests opened in streaming mode (stream=True) are holding a socket open until timeout?

Depends on how your server responds. If the server does not close the connection or does not send some EOF sentinel, both ends will hang until timing out (metaphorically speaking, both participants are staring at each other, waiting for the other to do something).

Concerning requests:

If you set stream to True when making a request, Requests cannot release the connection back to the pool unless you consume all the data or call Response.close. This can lead to inefficiency with connections. If you find yourself partially reading request bodies (or not reading them at all) while using stream=True, you should make the request within a with statement to ensure it’s always closed.
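
Following that guidance, the pattern would look roughly like this (a sketch; the URL is a placeholder):

# Sketch of the pattern the quoted documentation describes: with stream=True,
# use context managers (or call Response.close) so the connection is released
# back to the pool even when the body is never fully read.
import requests

with requests.Session() as session:
    with session.get("http://localhost:7777/", stream=True, timeout=5) as response:
        status = response.status_code  # headers are available; body left unread
# leaving the inner block closes the response and frees the connection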

By the way, since we want to speed up the tests, can you profile what is doing most of the work (e.g., using kernprof and decorating functions with @profile)?
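
For reference, a rough sketch of how such a profiling run could look (the decorated function and invocation are illustrative, not a prescribed setup):

# kernprof injects `profile` into builtins when invoked with -l; the fallback
# below keeps this module runnable without kernprof as well. Example run:
#   pip install line_profiler
#   kernprof -l -v "$(command -v pytest)" tests/test_build_linkcheck.py
try:
    profile
except NameError:
    def profile(func):  # no-op stand-in when not running under kernprof
        return func


@profile
def check_uri(uri):
    print("checking", uri)


check_uri("https://example.org/")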

@jayaddison
Contributor Author

Thanks @picnixz - I'll take a look at kernprof soon.

Introducing Python's socketserver.ThreadingMixIn to the test HTTP servers with its daemon_threads mode enabled gets most of the tests passing within an acceptable duration again (jayaddison@cbc43ac). However, there is one remaining timeout that causes a test failure: test_follows_redirects_on_GET.

>       assert stderr == textwrap.dedent(
            """\
            127.0.0.1 - - [] "HEAD / HTTP/1.1" 405 -
            127.0.0.1 - - [] "GET / HTTP/1.1" 302 -
            127.0.0.1 - - [] "GET /?redirected=1 HTTP/1.1" 204 -
            """,
        )
E       assert '127.0.0.1 - - [] "HEAD / HTTP/1.1" 405 -\n127.0.0.1 - - [] "GET / HTTP/1.1" 302 -\n127.0.0.1 - - [] Request timed out: TimeoutError(\'timed out\')\n127.0.0.1 - - [] "GET /?redirected=1 HTTP/1.1" 204 -\n' == '127.0.0.1 - - [] "HEAD / HTTP/1.1" 405 -\n127.0.0.1 - - [] "GET / HTTP/1.1" 302 -\n127.0.0.1 - - [] "GET /?redirected=1 HTTP/1.1" 204 -\n'
E           127.0.0.1 - - [] "HEAD / HTTP/1.1" 405 -
E           127.0.0.1 - - [] "GET / HTTP/1.1" 302 -
E         + 127.0.0.1 - - [] Request timed out: TimeoutError('timed out')
E           127.0.0.1 - - [] "GET /?redirected=1 HTTP/1.1" 204 -

tests/test_build_linkcheck.py:363: AssertionError

jayaddison added a commit to jayaddison/sphinx that referenced this issue Apr 18, 2023
…st_defaults' test case

With thanks to @picnixz for mentioning this configuration setting: sphinx-doc#11324 (comment)

@jayaddison
Contributor Author

With the timeouts removed, most of the tests succeed, but they run extremely slowly (20 seconds or so per unit test).

FWIW: switching from the http.server.HTTPServer implementation to http.server.ThreadingHTTPServer appeared to clear up the performance issues here.
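
For reference, the swap amounts to something like this (an illustrative sketch, not the actual test-suite change): ThreadingHTTPServer is HTTPServer combined with socketserver.ThreadingMixIn (with daemon_threads=True), so a client holding a keep-alive connection no longer blocks the single serving thread.

# Illustrative sketch of the threaded test-server shape.
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        self.send_response(204)
        self.send_header("Content-Length", "0")
        self.end_headers()


with ThreadingHTTPServer(("localhost", 7777), Handler) as server:
    server.handle_request()  # each request is handled in its own thread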

@jayaddison
Contributor Author

  • each worker has a dedicated session object instance. That way, the number of sessions existing simultaneously is upper-bounded by the maximum number of workers (5 by default)

@picnixz thank you for this suggestion - I've applied it using the conf.py configuration in one of the test cases in the branch I'm working on, where it's configured down to one (1) - that's extremely helpful for asserting on the expected number of connections.

I'm a little more reserved/cautious about the sorting and chunking suggestions. Certainly it could make sense in terms of resource usage to group similar requests together and to make them around the same time, but I think that should be a separate changeset. Would you like to add any more detail, or should I open that as a follow-up issue?

@picnixz
Member

picnixz commented Apr 19, 2023

I'll write a separate issue this weekend. I will be offline for a few days.

@jayaddison
Contributor Author

Sorry @AA-Turner - #11402 doesn't complete this; could we reopen it? It's my mistake for using the phrase resolve #11324 in that pull request's description.

AA-Turner reopened this Jul 23, 2023

@jayaddison
Contributor Author

Thanks!

Most of the groundwork should be in place for this migration/feature now, I think.
