
RFC: what do you think of this branch adding async support to urllib3? #1323

Closed

njsmith opened this issue Feb 1, 2018 · 63 comments

@njsmith
Contributor

njsmith commented Feb 1, 2018

We (mostly @pquentin and I) have been working on a proof of concept for adding pluggable async support to urllib3, with the hope of eventually getting it into upstream urllib3. It's reached the point where, although lots of bits are still missing, an end-to-end demo works and we don't see any major obstacles to getting everything else working. So I wanted to start getting feedback from the urllib3 maintainers about whether this looks like something you'd be interested in eventually merging, and what it would take to get there.

Demo

So: hi! Check it out – a single py2.py3 wheel that keeps the classic synchronous API, and on python 3.6+ it can also run in async mode on both Trio and Twisted:

# Demo from commit 1ca67ee53e18f823d0cb in the python-trio/urllib3 bleach-spike branch
$ curl -O https://vorpus.org/~njs/tmp/async-urllib3-demo.zip
$ unzip async-urllib3-demo.zip
$ cd async-urllib3-demo
$ ls
sync-demo.py
async-demo.py
urllib3-2.0.dev0+bleach.spike.proof.of.concept.dont.use-py2.py3-none-any.whl

$ virtualenv -p python3.6 py36-venv
$ py36-venv/bin/pip install trio twisted[tls] urllib3-2.0.dev0+bleach.spike.proof.of.concept.dont.use-py2.py3-none-any.whl

$ py36-venv/bin/python sync-demo.py
--- urllib3 using synchronous sockets ---
URL: http://httpbin.org/uuid
Status: 200
Data: b'{\n  "uuid": "a2c28245-47b8-4a50-b64c-da09d27bf626"\n}\n'

$ py36-venv/bin/python async-demo.py
--- urllib3 using Trio ---
URL: http://httpbin.org/uuid
Status: 200
Data: b'{\n  "uuid": "dab50c1a-1b20-483f-903e-fe74494629f2"\n}\n'

--- urllib3 using Twisted ---
URL: http://httpbin.org/uuid
Status: 200
Data: b'{\n  "uuid": "72196b66-7caa-40dd-9c7c-af65bb4f7fb6"\n}\n'

$ virtualenv -p python2 py2-venv
$ py2-venv/bin/pip install urllib3-2.0.dev0+bleach.spike.proof.of.concept.dont.use-py2.py3-none-any.whl
$ py2-venv/bin/python sync-demo.py
--- urllib3 using synchronous sockets ---
URL: http://httpbin.org/uuid
Status: 200
Data: '{\n  "uuid": "2a2fce90-d853-4940-b111-0969f92b7678"\n}\n'

Things that (probably) don't work yet include: HTTPS, timeouts, proper connection reuse, using python 2 to run setup.py bdist_wheel, running the test suite. OTOH some non-trivial things do work, like chunked transfer encoding, and there's at least code in there for proxy support and handling early responses from the server.

what sorcery is this

This is based on @Lukasa's "v2" branch, so it's using h11. (Goodbye httplib 👋.) The v2 branch is currently stalled in a pretty half-finished state (e.g. the I/O layer is a bunch of raw select loops open-coded everywhere that IIRC don't work; I'm sure @Lukasa would have cleaned it up if he had more time); we cleaned all that up and made it work. The major new code is here: https://github.com/python-trio/urllib3/blob/bleach-spike/urllib3/_async/connection.py

And then we made the low-level I/O pluggable – there's a small-ish ad hoc API providing the set of I/O operations that urllib3 actually needs, and several implementations using different backends. Note that the I/O backend interface is internal, so we can adjust it as needed; there's no attempt to provide a generic abstract I/O interface that would work for anyone else. The code for the backends is here: https://github.com/python-trio/urllib3/tree/bleach-spike/urllib3/_backends
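To give a feel for the shape of that internal API, here is a rough sketch; the class and method names are illustrative, not the branch's actual interface:

class AsyncBackend:
    async def connect(self, host, port, connect_timeout=None):
        """Open a TCP connection and return a socket-like object."""
        raise NotImplementedError

class AsyncSocket:
    """The handful of operations urllib3 actually needs from a socket."""

    async def start_tls(self, server_hostname, ssl_context):
        raise NotImplementedError

    async def send_all(self, data):
        raise NotImplementedError

    async def receive_some(self, max_bytes):
        raise NotImplementedError

    def forceful_close(self):
        raise NotImplementedError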

Then we started adding async/await annotations to the rest of urllib3's code. That gave us a version that could work on Trio and Twisted. But wait, we also want to support python 2! And everyone using the existing synchronous API on python 3 too, for that matter. But we don't want to maintain two copies of the code. Unfortunately, Python absolutely insists that async APIs and synchronous APIs be loaded from different copies of the code; there's no way around this. (Mayyybe if we dropped python 2 support and were willing to maintain a giant shim layer then it would be possible, but I don't think either of those things is true.)

Solution: we maintain one copy of the code – the version with async/await annotations – and then a little script maintains the synchronous copy by automatically stripping them out again. It's not beautiful, but as far as I can tell all the alternatives are worse.

Currently the async version (source of truth) lives in urllib3/_async/, and then setup.py has the code to automatically generate urllib3/_sync/... at build time. (This script should probably get factored out into a separate project.) The script is not at all clever; you can see the exact set of transformations in the file, but basically it just tokenizes the source, deletes async and await tokens, and renames a few other tokens like __aenter__ → __enter__. There's no complicated AST manipulation and comments are preserved. Then urllib3/__init__.py imports things from urllib3._sync and (if it's running on a new enough Python) urllib3._async.
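A deliberately simplified sketch of that transformation, using only the standard library (unlike the real script, this version does not preserve the original spacing):

import io
import tokenize

RENAMES = {
    "__aenter__": "__enter__",
    "__aexit__": "__exit__",
    "__aiter__": "__iter__",
    "__anext__": "__next__",
    "StopAsyncIteration": "StopIteration",
}

def strip_async(source):
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string in ("async", "await"):
            continue  # delete the keyword entirely
        tokens.append((tok.type, RENAMES.get(tok.string, tok.string)))
    return tokenize.untokenize(tokens)

print(strip_async("async def fetch():\n    return await get()\n"))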

The resulting dev experience is sort of half-way between that of a pure-python project and one with a C extension: instead of an edit/run cycle, you need to do an edit/compile/run cycle. But you still end up with a nice py2.py3-none-any wheel at the end, so you don't need to stress out about providing binary builds for different platforms, and the builds are extremely fast because it's just some shallow text manipulation, not invoking a C compiler.

Oh, and for backcompat we also added some shim files like urllib3/connectionpool.py, that just re-export stuff from the corresponding files in urllib3/_sync/..., since these submodules are documented as part of the API. This also has the benefit of making it easier to avoid accidentally exporting things in the future; urllib3's public API is perhaps larger than it should be.
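Such a shim can be as small as a re-export module, along these lines (the exact re-exported names here are illustrative):

# urllib3/connectionpool.py -- backwards-compatibility shim
from urllib3._sync.connectionpool import (
    HTTPConnectionPool,
    HTTPSConnectionPool,
    connection_from_url,
)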

Importantly, this basic strategy would also work for libraries that use urllib3, so it provides a path forward for higher-level projects like requests, botocore, etc. to provide dual sync/async APIs while supporting python 2 + multiple async backends.

Backwards compatibility

So far this is looking surprisingly good. Switching to h11 means losing most of urllib3.connection, since that's directly exposing httplib APIs. In async mode we can't support lazily loading response.data, but that's fine, and we can still support it in sync mode if we want. Right now the branch is keeping the "close" operations synchronous, but we might want to switch to making them async, because it might make things easier for HTTP/2 later. If we do switch to async close, then that will force some broader changes – in particular RecentlyUsedContainer assumes that it can close things from __setitem__ and __delitem__, which can't be made async. It wouldn't be hard to rewrite RecentlyUsedContainer, but right now technically it's a public API.

I'm not sure about some of the more obscure stuff like urllib3.contrib. In particular urllib3.contrib.pyopenssl, urllib3.contrib.socks, and urllib3.contrib.securetransport seem difficult to handle in an I/O-backend-agnostic way – maybe we could hack them to keep working on the sync backend only? And I don't know what to think about the appengine support. Maybe it's fine because it doesn't actually use urllib3 internals at all?

But overall, my impression is that we could keep quite a high degree of backwards compatibility if we want. Perhaps we'd want to break things for other reasons if we're doing a v2, but that's a separate discussion.

Questions

Basically at this point it seems clear that this approach can work. And what's left is just straightforward work. So:

  • Does this seem like the direction we want urllib3 to go?

  • If so, then how would you like that process to go?

I know @Lukasa is wary of the code generation approach, but the alternatives seem to be (a) maintaining a ton of redundant code, or (b) not supporting async in urllib3, in which case the community will... end up maintaining a ton of redundant code anyway, as each I/O library separately implements its own fake copy of requests. And I don't believe that we have enough HTTP experts in the community to do that well; I think there are huge benefits to concentrating all our maintainers on a single library if we can.

CC: @kennethreitz @markrwilliams

@njsmith
Contributor Author

njsmith commented Feb 1, 2018

I should probably also note that it's easy to add more backends; we only started with trio and twisted because we figured if we could get those working then we were flexible enough to handle anything else. (And we're using twisted instead of asyncio because to handle proxies we need starttls support, which is possible in asyncio right now but requires gross hacks.)

@1st1

1st1 commented Feb 2, 2018

And we're using twisted instead of asyncio because to handle proxies we need starttls support, which is possible in asyncio right now but requires gross hacks.

This is all you need to implement starttls on asyncio in Python 3.6 (it will probably work on later 3.5.x releases too):

import asyncio

try:
    import ssl
except ImportError:  # some Python builds ship without ssl support
    ssl = None

async def start_tls(loop, transport, protocol, sslcontext, *,
                    server_side=False,
                    server_hostname=None):
    """Upgrade transport to TLS.

    Return a new transport that *protocol* should start using
    immediately.
    """
    if ssl is None:
        raise RuntimeError('Python ssl module is not available')

    if not isinstance(sslcontext, ssl.SSLContext):
        raise TypeError(
            f'sslcontext is expected to be an instance of ssl.SSLContext, '
            f'got {sslcontext!r}')

    waiter = loop.create_future()
    ssl_protocol = asyncio.sslproto.SSLProtocol(
        loop, protocol, sslcontext, waiter,
        server_side, server_hostname,
        call_connection_made=False)

    transport.set_protocol(ssl_protocol)
    loop.call_soon(ssl_protocol.connection_made, transport)
    if not transport.is_reading():
        loop.call_soon(transport.resume_reading)

    await waiter
    return ssl_protocol._app_transport

(This is a slightly simplified version of my start-tls 3.7 commit: python/cpython@f111b3d)
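For context, a hypothetical usage sketch (the proxy host, target host, and the bare asyncio.Protocol are placeholders), upgrading a plaintext connection after, say, an HTTP CONNECT exchange with a proxy:

import asyncio
import ssl

async def fetch_through_proxy(loop):
    transport, protocol = await loop.create_connection(
        asyncio.Protocol, "proxy.example.com", 8080)
    # ... speak plaintext HTTP to the proxy here (e.g. send CONNECT) ...
    tls_transport = await start_tls(
        loop, transport, protocol, ssl.create_default_context(),
        server_hostname="httpbin.org")
    # *protocol* now sends and receives decrypted bytes via tls_transport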

@1st1

1st1 commented Feb 2, 2018

So it would be interesting to see if this works with asyncio too.

@njsmith
Contributor Author

njsmith commented Feb 2, 2018

I'm sure it can, though it might be better to wait a bit to make sure the internal APIs have stabilized -- the twisted and asyncio backends will probably be very similar, and the twisted version is by far the most non-trivial (because it has to invert the inversion of control), so maintaining two similar versions for now might be churn-y. OTOH if you want to do the maintaining I guess that'd be fine :-)

@1st1

1st1 commented Feb 2, 2018

It's really up to you ;) I just wanted to note that "which is possible in asyncio right now but requires gross hacks" is a bit exaggerated and shouldn't be a showstopper for having asyncio support from the beginning if this gets merged.

@njsmith
Contributor Author

njsmith commented Feb 2, 2018

@1st1 yeah, I only knew of the trick aiohttp uses. Does the await waiter there cause it to wait for the handshake to finish?

@1st1

1st1 commented Feb 2, 2018

@1st1 yeah, I only knew of the trick aiohttp uses. Does the await waiter there cause it to wait for the handshake to finish?

Yes. I was going to write a few more tests before beta-2 to see if the cancellation is handled correctly (IOW if we need try..except around that await). In any case, if this gets a green light, I'd appreciate this snippet being used to enable asyncio support in urllib3 so that we can test how well it works before 3.7 is released.

@njsmith
Contributor Author

njsmith commented Feb 14, 2018

Ping!

@sigmavirus24
Contributor

I'm generally wholly in favor of this, but it's harder to review without a pull request, which is why I haven't.

@njsmith
Contributor Author

njsmith commented Feb 15, 2018

I'm generally wholly in favor of this

That's great to hear!

it's harder to review without a pull request, which is why I haven't

Well, this is why I'm asking about process above :-). The problem is that this is touching a huge proportion of the code. @Lukasa's v2 branch is already "2,431 additions and 2,193 deletions", and then this is a bunch more stuff on top of that. And unfortunately, the files have to be renamed in a way that confuses git. E.g. urllib3/connection.py is moved to urllib3/_async/connection.py, and then a new urllib3/connection.py is added as a shim to re-export the old API. Apparently if you do this it is simply not possible to keep git diff working. (I tried!) We've managed to preserve git blame, but the renames mean that git diff and Github's comparison view are a mess – you can see it here if you want. And the branch is still broken in lots of ways; currently we're focusing on getting the test harness running with lots of tests marked xfail or skip, so we can start running CI and start ratcheting up the number of passing tests...

Basically this isn't something that can be treated as a simple feature branch. Which is not surprising for a v2 branch, but if we're going to go this way we do need to figure out what our strategy is.

One option would be to keep developing this branch, possibly after merging it to the v2 branch in the main repo so that you all can watch changes going into it. (If we did that then it'd probably be easier if I had a commit bit so I could merge PRs into the v2 branch.) This certainly seems doable. Given the file renames, keeping up with master will be a bit annoying – in particular, it doesn't seem possible to convince git to automatically figure out that changes to urllib3/connectionpool.py, urllib3/poolmanager.py, and urllib3/response.py should be applied to urllib3/_async/*.py instead. OTOH the unmerged changes in master to these files are currently "3 files changed, 55 insertions(+), 35 deletions(-)", so manually merging them doesn't look like a big deal currently.

Another option would be to throw away the current branch, and start reconstructing the changes as a series of individual PRs on top of master. I'm honestly not sure how hard this would be. The httplib→h11 changes are largely self-contained, so I guess they could be reconstructed fairly easily; the big change moving files around is annoying, but again easy to reconstruct. It's always going to be the case, though, that (a) dropping httplib and (b) making everything async are both going to be giant big-bang changes. It would be very difficult to make these changes as a series of incremental tweaks where everything keeps working after each change.

@njsmith
Contributor Author

njsmith commented Feb 19, 2018

Update: I was able to successfully merge from master.

@haikuginger
Contributor

Sorry for the late response. I've been pondering this for a while, but should have posted an initial comment sooner.

I'm -1 on the code generation approach. While async IO is gaining in popularity, especially in situations where scalability is paramount, most of our users are still going to be using the synchronous API. By doing code generation, we're demoting that API, which most users use, to being not even present in the canonical source tree.

What's more, it introduces substantial technical debt. Do we expect contributors to regenerate the synchronous API every time they make a change? If so, that leads to a substantially complicated code review process (both because it's an extra step for the contributor and because the maintainer will need to keep an eye out for changes of this kind). Do we expect the maintainer to regenerate it before pushing to PyPI? If so, that's an extra step that could go wrong, and means that someone using urllib3 for the synchronous API can no longer simply clone the repository and use it. And, it introduces a trust issue in that there's an artifact of the release that doesn't exist canonically anywhere else.

I'm +1 on adding an asynchronous API in general. Having a robust async API that can be used by the major packages will likely substantially reduce our support burden.

I'm +1 on moving to h11. This is a change I would like to see as a separate pull request on master, although I'd probably want to postpone it to a future release where it's the only major change, to ensure that we provide an easy backwards path for people who may encounter issues with it. For this change, it would be nice if the internal API were built in a way that encourages modularity, so that we can easily hook into it to add HTTP/2 support with hyper-h2 later on.

Also on the h11 front, note that this would be our first hard external requirement. I'd like to invite any Requests people (particularly @sigmavirus24, @Lukasa, and @nateprewitt) to weigh in on their experiences as a core library with hard external requirements, and to suggest ways we can integrate this requirement in a minimally disruptive manner for our downstream consumers.

Overall: we need much smaller, self-contained PRs in order to make this succeed. PRs above 500 lines or so become increasingly difficult to review, and problems will be missed. This may mean adding and figuring out test cases for the underpinnings of the async API before actually creating that API, but it's important to me that the integrity of our code review process remain strong, which means that all code must be thoroughly reviewed.

@njsmith
Contributor Author

njsmith commented Feb 21, 2018

@haikuginger Thanks for the reply! I totally understand needing time to think things over for something like this...

This is a long comment so I'm going to use headlines to break it up!

What exactly the technical consequences of this approach would be

Do we expect contributors to regenerate the synchronous API every time they make a change? If so, that leads to a substantially complicated code review process (both because it's an extra step for the contributor and because the maintainer will need to keep an eye out for changes of this kind). Do we expect the maintainer to regenerate it before pushing to PyPI? If so, that's an extra step that could go wrong, and means that someone using urllib3 for the synchronous API can no longer simply clone the repository and use it. And, it introduces a trust issue in that there's an artifact of the release that doesn't exist canonically anywhere else.

The way the prototype works, the synchronous code is auto-generated when running setup.py to install or create a wheel, so there's no extra step during code review (the generated code isn't checked in), and no extra step when pushing to PyPI (just run bdist_wheel, same as before). Also, I'm not sure what trust issue you're thinking of, because the transformation is simple and deterministic (much more so than e.g. running a compiler), and if you don't trust the person uploading the release then they can already introduce arbitrary backdoors. Also, right now the transformation is done directly by code living in the source tree itself, so anyone with a source tree can produce the same release code. I think there's some benefit to splitting this off into an independent library that other projects can use too, but if you want to keep the self-contained source tree then we could always vendor it back.

But you're correct that this would mean giving up on being able to just clone the repository and use it without going through a build step, and this definitely creates some extra friction for development. We can do things to minimize this cost. Right now the branch has a runtests.py script that takes care of the gunk (it rebuilds everything and then calls pytest). I expect we can also get things set up so running tox Just Works. And there are many, many python projects out there that have it worse than this -- in particular, every one that has any kind of extension module. But I agree, this is a real cost that we're asking the urllib3 maintainers/contributors to take on.

Tradeoffs and compromises

I'm -1 on the code generation approach. [...]
I'm +1 on adding an asynchronous API in general [...]
I'm +1 on moving to H11 [...]
Overall: we need much smaller, self-contained PRs [...]

I want all these things too. I don't think it's possible to get all of them, and so we'll have to work out some compromise or another.

sync + async

One challenge is supporting sync + async without code generation. I've had many many conversations about this over the last few years, with folks like @Lukasa and @kennethreitz, and it's just a really hard problem. Realistically, the only options anyone has demonstrated are:

  • Code generation (this approach)
  • Some kind of massive code duplication between the sync and async paths

The code duplication can take a few different forms -- you can decline to support async at all, and then the community ends up supporting multiple independent libraries, splitting maintainer attention (see: aiohttp, treq, asks, ...). It's very unlikely that these can all be maintained to the same high standard that urllib3 has set. Or, you can write everything as confusing callback-based code, restrict yourself to supporting callback-based async libraries (so e.g. not trio), and then maintain a synchronous shim around all of it. Or, you could write async code like in our branch, but instead of doing code generation you could maintain a hand-written synchronous shim around every public API that implements the sync version using the async version (see the sketch below). However, aside from being a lot of extra work, this last approach also requires giving up on Python 2 support, which seems premature for urllib3 given that you're upstream of projects like pip and boto.
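To make that last option concrete, here is a minimal sketch of a hand-maintained synchronous shim; the class and helper names are invented for illustration:

import asyncio

class _AsyncPool:
    async def request(self, method, url):
        await asyncio.sleep(0)  # stand-in for the real async I/O
        return (method, url)

class PoolManager:
    """Synchronous twin of the async API: one wrapper per public
    method, with signatures kept in sync by hand."""

    def __init__(self):
        self._impl = _AsyncPool()

    def request(self, method, url):
        return asyncio.run(self._impl.request(method, url))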

If there's another option we've missed, then I'd love to know. We only ended up working on the code generation approach after ruling out everything else we could think of. And it's true it has some downsides. But.... it works :-).

Do you have any alternative suggestions?

h11

Now that @Lukasa's no longer at HPE, I don't think anyone is volunteering to do this work unless it also gives us async support (lmk if that's wrong). So while logically this could be done as a standalone feature, I don't think it's viable from a project management perspective?

Re: managing the dependency, when an h11-based urllib3 starts shipping I would definitely suggest vendoring h11 (at least at first), because this will be a huge step-function increase in h11's usage, and vendoring gives us more flexibility if we discover we need to make emergency fixes.

My impression is that HTTP/2 support will require some significant rearchitecting, rather than being something that can easily be hidden behind an abstract HTTP connection interface. Which isn't a problem, it just means that I don't think it's worth spending a lot of time now trying to design an abstract HTTP connection interface given we'll have to redo it later anyway. The things we can do now to prepare for HTTP/2 support are things like, hiding the connection object from the public API, and generally keeping it in mind when designing new public APIs. (E.g., it suggests that maybe connection shutdown should be async.)

Splitting everything up into small PRs

I don't really know how to do this. OTOH, I'm more hopeful that we could figure something out here than that we can avoid code generation :-).

First, a question: all this work is based on @Lukasa's v2 branch in the official repo. That's just focused on moving to h11, and it's incomplete, but there's already a ton of work there: 278 commits, 2431 insertions(+), 2193 deletions(-). Is this something you'd want to see broken up into small PRs as well, or does the fact that it was already done in the main urllib3 repo, at least in part via PRs into the v2 branch, give us some kind of dispensation? Because I understand what we've done on top of that much better than I understand everything @Lukasa did :-). (@Lukasa, if you have any thoughts on how this change could be managed in a way that won't break all our heads, I'd love to hear them!)

I can imagine splitting this up into a series of PRs into the v2 branch. In fact a lot of it was already done that way, just in this repo rather than the main repo. One possible way forward would be to merge what we have into the v2 branch (either in one big chunk or piecemeal), and then continue to develop it as a series of PRs to that branch, so that the urllib3 maintainers can keep an eye on what's happening as we go.

I guess maybe it is also worth pointing out that if you want to add h11 as a dependency, that by itself effectively means you're importing ~1000 lines of my code without reviewing it, so maybe it's not so bad to trust my reviews of @pquentin's work :-). Obviously it's not quite the same though...

tl;dr

I guess the main question is just, does anyone have any realistic alternatives to code generation? That's the big question everything else hinges on.

@ncoghlan

A question regarding the code generation approach: I understood @haikuginger's main concern as being with the source tree containing only the async code by default, and generating the sync code from that, even though it's the async version that's new, and the vast majority of library clients will still be relying on the generated synchronous code.

So my question would be whether or not the code generation approach would be amenable to a comment, decorator, or marker function based solution whereby it was the synchronous code that lived in source control, and the code generator injected async and await keywords in the appropriate places to make the async version. For example:

def api_name(arg): # make-async
    var = get_value() # add-await

Which would become:

async def api_name(arg):
    var = await get_value()

You'd need at least those two directives (convert def/for/with to their async counterparts, add an await to the RHS of an assignment statement), and perhaps some others depending on how the code is structured.

You could potentially even use the current code generator to emit the first version of the annotated synchronous code (by having it add the directive comments when it drops an async or await).

@haikuginger
Contributor

I would certainly be more comfortable with an explicit pragma, as suggested by @ncoghlan, which preserves the synchronous API as the first-class citizen. I'd still like to see if we can arrive at a solution that avoids codegen in general, though.

The way I see it is that if the constraints of the Python language syntax make it difficult or impossible to use the same code for both synchronous and asynchronous I/O, then, by default, sync and async versions of a similar function are fundamentally different code, even if textually similar.

If, on the other hand, we believe that sync and async are actually fundamentally similar when the code inside the block is similar, but the Python syntax doesn't allow for the needed modularity, then I would consider this an issue with the syntax itself. IMO, the correct way to fix such an issue (if, in fact, it is one) would not be to work around it in library code, but to file a PEP or create a library that adjusts the Python syntax or provides additional faculties to declare the isomorphism.

I'm going to be spending the next day or so digging into asyncio to make sure I'm able to carry on a competent conversation about it. Thoughts from anyone else who'd be affected by this change are very welcome.

@njsmith
Contributor Author

njsmith commented Feb 21, 2018

@ncoghlan It's not clear to me how your pragmas know where to insert the awaits? It's part of the expression syntax, not the statement syntax, so you can have things like print(await get_value()), or even (a random example from treq's test suite) await treq.content(await treq.get(page)).

It seems like we're worried that not enough people know about Python's async/await syntax, so we're considering inventing a new syntax that's more awkward and that nobody knows?

@haikuginger So here's the basic problem.

  • Code using async/await is structured just like regular synchronous code
  • But every function call that might ultimately block or do I/O has to be written like await foo() instead of foo(). Obviously this affects lots of urllib3's code, in a kind of shallow but pervasive way.
  • Python 2 will raise SyntaxError if it ever sees a .py file that contains an await foo(). That's only supported on 3.5+. (And actually our current code requires 3.6+ to enable async mode, because in 3.5 you can't have a function that's both a generator and also async. But that's fixable if necessary, because generators are syntactic sugar; we could write the iterator explicitly instead.)

So as long as we support Python 2, it seems completely impossible to have a single implementation of urllib3's logic that supports both sync + async without using code generation.

This also makes it hard to fix anything with a PEP, since the only thing a PEP can affect is Python 3.8+.

What if we drop Python 2? That increases our options a little bit, but then we have another problem. In async mode, lots of urllib3's public functions/methods change their signature slightly: they're annotated as async instead of sync. So the extra option we gain is: we can keep everything async internally, but then wrap it in a synchronous shim. But this still effectively means maintaining two copies of our public API, which have to be manually kept in sync (pun not intended).

If it helps, the code generation is about as trivial as code generation can be. It:

  • deletes all async and await keywords (so await foo() becomes foo(), async def foo becomes def foo, async with becomes with, async for becomes for)
  • does some token-wise replacements: __aenter__ -> __enter__, __aexit__ -> __exit__, __aiter__ -> __iter__, __anext__ -> __next__, and StopAsyncIteration -> StopIteration. This allows us to implement async context managers and async iterators, and have them converted into regular synchronous context managers and iterators.
  • that's all

That's why we're inclined to see the sync and async code as being fundamentally similar: literally all of the actual logic and semantics are identical. The only difference is removing this extra bookkeeping that async code requires.
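As a hypothetical before/after (the example code is invented, not taken from the branch), this async source of truth:

async def read_body(self):
    async with self._lock:
        return await self._sock.receive_some(8192)

generates this sync version:

def read_body(self):
    with self._lock:
        return self._sock.receive_some(8192)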

@1st1

1st1 commented Feb 21, 2018

This also makes it hard to fix anything with a PEP, since the only thing a PEP can affect is Python 3.8+.

I think there's almost zero chance that a simple PEP-able solution exists for this problem, so I wouldn't expect anything to happen in 3.8.

If it helps, the code generation is about as trivial as code generation can be. It:

  • deletes all async and await keywords (so await foo() becomes foo(), async def foo becomes def foo, async with becomes with, async for becomes for)

I propose a different approach. Instead of changing everything to use async/await and having a tool to convert that to synchronous code, why don't we design a tool to produce async/await code from synchronous code?

Let's call this new hypothetical tool "mirror". And let's imagine we want to make the following snippet of code sync/async compatible:

def send_request(self, request, read_timeout):
    """
    Given a Request object, performs the logic required to get a response.
    """
    h11_response = _start_http_request(
        request, self._state_machine, self._sock
    )
    return _response_from_h11(h11_response, self)

We can import the mirror package and rewrite the code to:

import mirror

@mirror.coroutine
def send_request(self, request, read_timeout):
    """
    Given a Request object, performs the logic required to get a response.
    """
    h11_response = mirror.wait(_start_http_request(
        request, self._state_machine, self._sock
    ))
    return _response_from_h11(h11_response, self)

Now, for synchronous code, @mirror.coroutine and mirror.wait() are no-ops. Therefore a blocking, backwards-compatible Python 2 and Python 3 version is available out of the box, no conversion needed.
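The synchronous half of this hypothetical mirror package could be as small as:

# mirror.py -- sync mode, where everything is a no-op
def coroutine(fn):
    return fn  # the decorator returns the function unchanged

def wait(value):
    return value  # the wrapped call has already run and returned its result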

For asynchronous code, our hypothetical mirror tool would parse the code and transform functions decorated with @mirror.coroutine into async def functions, and mirror.wait into await expressions:

async def send_request(self, request, read_timeout):
    """
    Given a Request object, performs the logic required to get a response.
    """
    h11_response = await (_start_http_request(
        request, self._state_machine, self._sock
    ))
    return _response_from_h11(h11_response, self)

This approach still requires a pre-compiler, but now the pre-compiler is only required to produce async code and is completely optional. It is similar to what @ncoghlan proposed here earlier, but requires no new sub-syntax in comments, instead relying on simple function calls and decorators.

@haikuginger

but to file a PEP or create a library that adjusts the Python syntax or provides additional faculties to declare the isomorphism.

If we create a library to handle this transition for urllib3, I'll use it in my asyncpg project to make a synchronous version of it.

@ncoghlan

@njsmith My simple example pragmas would require that all await expressions be either standalone expression statements or else the RHS of an assignment statement, so your print example would need to be written as:

value = get_value() # add-await
print(value)

That said, I do agree that having the async version be the "source of truth" is a better option, specifically due to the fact that it avoids the question of "Where should the await go?".

It also occurs to me that @haikuginger's code generation concerns could potentially be mitigated by using an approach similar to the one CPython uses for many of its generated files: check them in to source control, and then have CI complain if running the code generator changes the contents of the generated files.

We do that in CPython so that you don't typically need to bootstrap and run _freeze_importlib or Argument Clinic to get a working Python tree - you can if you want or need to, but you can also just run a more typical compilation toolchain over our included C files instead.

@smurfix

smurfix commented Feb 22, 2018

Hmm. The "where should the await go" question could be solved easily:

class _await(object):
    def __pow__(self, x):
        # "await ** expr" evaluates to expr unchanged, so the marker is
        # a no-op in the synchronous version
        return x
await=_await(); del _await

Thus python 2 compatible code

#@async
def foo(x):
    return 42 + await** bar()

would easily and bidirectionally(!) transform into an async version without further mangling.

Bidirectionality is important. When working on an async bug I don't want to have to replicate my fixes into the sync version. That's tedious and error prone.

@haikuginger
Contributor

haikuginger commented Feb 22, 2018

@smurfix makes an extremely interesting syntax suggestion which could form the basis for a code transformation process I'd be comfortable with. Doing it this way could create an extremely distinctive token for replacement while preserving the synchronous API as a first-class citizen.

Combined with @ncoghlan's suggestion of having CI autofail if codegen modifies anything, and a bidirectional toolchain that detects which version has been modified and propagates the modification to the other version, I think we're getting somewhere.

I do think we wouldn't necessarily want to call our placeholder token await, and we might want to use a more exotic operator than **. Maybe something like one of these:

return 42 + (resolve_potential_coroutine <= bar())
return 42 + (finalize << bar())
return 42 + (complete | bar())

@kennethreitz

So, I'm not sure if Requests' plans influence any decisions here or not, but the tentative plans for Requests 3.0 are:

  • Python 3 only.
  • Support for async/await (via the work being done here)
  • Type annotations.

Just wanted to throw that out here. Potentially not supporting 2 would make this work a lot easier.

@smurfix

smurfix commented Feb 22, 2018

@haikuginger ** has the advantage that (a) it's right associative and (b) it binds (almost) as tightly as "await". 12+await+foo()*10 would not work. The power operator does work even if you do something silly like 2**await**foo()**2, without requiring parentheses.

Using ** we'd be able to do a purely mechanical translation: replace await with await ** and replace <indent>async with <indent>#@async\n<indent>, or vice versa, which IMHO is more attractive than requiring some more complicated tool that would need to correctly emplace parentheses when going the py3>py2 way.

I also agree that we should use something other than "await", for the simple reason that the sync version would also work with Python3. Personally I'd just go with await_**. I'd simply create a vim macro that autoreplaces any await with await_** (ditto for async/#@async) while I'm editing the Py2 version of the code and I won't even have to change my typing habits, or remember what the heck this strange (finalize | foo()) thing is supposed to mean.

@ncoghlan

Using await_** as a no-op prefix operator to mark blocking operations in the synchronous code is a neat idea.

The #@async marker might look a little odd in the with and for statement case, though. Since it's a code transformation directive, one possibility would be to use #=> async to indicate "make this async when converting it", giving:

#=> async
def foo(cm):
    #=> async
    with cm:
        return 42 + await_** bar()

(Whether the transformation marker looks better on the preceding line or as a trailing comment would presumably become clearer once some of the code had been suitably marked up - the trailing comment would require that the code transformer be a bit more clever, but it would still only involve looking ahead to the end of the current line)

@pquentin
Member

I'm glad to see out of the box thinking! I would never have thought about those options. await_** is an interesting hack. But it still has issues:

  • it's also an awkward syntax that nobody knows
  • you need to import await_ before using it
  • tooling is ready for async/await, but not for await_** - two examples:
    • if I forget one * it won't be a syntax error,
    • flake8 won't like #=> async (E265 block comment should start with '# ') and await_** bar() (E225 missing whitespace around operator)
  • you still need to transform the code before you use it, even in sync mode because you don't want the fake await to show up in your tracebacks, so users won't use the canonical source tree either

@smurfix

smurfix commented Feb 23, 2018

@pquentin Well …

  • probably, but you get used to it.
  • true that, but a one-time effort (and you need to set up a bunch of other imports anyway)
    • also cookiecutter et al. can do it for you
  • fixable once we agree to go forward with this idea
    • no, but a runtime error. Also, we can teach tooling to catch it
    • fixable
  • ?? The fake await will not show up in any tracebacks. Why should it?
    It's two additional return keypresses when single stepping out of a function and pdb can be taught to auto-skip these so you won't even notice.
    If you mean, it'll show up in the tracebacks' source code lines, well, that's the source code so that's what shows up – tracebacks are meant for developers, not users.

@pquentin
Member

For the last point, you're right; it looks like I don't fully understand what your await_** proposal involves.

@njsmith
Contributor Author

njsmith commented Feb 24, 2018

@kennethreitz It turns out that not supporting py2 doesn't help as much as you might think. If we only care about py35+, then it's possible for the sync and async versions to share their internal implementation. But, the whole public API still needs to be duplicated, which is a job for a computer rather than a human, and once you've got tools to do that, supporting py2 isn't much harder...

@njsmith
Contributor Author

njsmith commented Feb 24, 2018

Lots of interesting ideas here!

It sounds like there are two orthogonal decisions. One is whether to write async and generate sync or write sync and generate async -- I'm going to call these "a2s" and "s2a" respectively so my fingers don't get tired. The other decision is whether to check the generated code into git.

use async to generate sync (a2s), or vice-versa (s2a)?

I think the s2a proposals above could work, though there'd need to be some more stuff added, like mirror.StopIteration or mirror.next. So which is better?

AFAICT, the main advantage of making the sync code the source of truth (s2a) is that someone could do a git clone, adjust their PYTHONPATH, and immediately import and use the synchronous code without having to do anything with pip or setup.py.

For contributors modifying the code, I don't think this choice affects the workflow. Either way, the test suite will want to test both versions of the code, so the workflow will be (1) modify canonical version, (2) regenerate the other version, (3) run the tests. (And probably in practice there'll be a script that takes care of doing regeneration + testing with a single button press.)

There are also some downsides to s2a though:

  • We have to invent a whole new language to express all the async annotations inside regular synchronous Python code. Inventing languages is always annoying -- notice that we're already starting to bikeshed stuff up above, like the debate about whether the magic async marker should be #=>async or not. In the a2s approach we don't have to do this, because our source language is just Python.

  • We have to build a code translation framework that understands our new input language. AFAICT the languages we're talking about here will require a substantially more complicated generation pipeline, with actual syntactic analysis + comment sensitivity. I think in practice this means we'd have to use lib2to3 and set up some custom syntactic rewrite passes? In the a2s approach we can get away with a very simple token-based processor, that doesn't have to parse Python syntax at all.

  • With the a2s approach, contributors need to understand both sync and async Python (or at least well enough to fake it). With the s2a approach, contributors will need to understand both sync Python and async Python (so they can make sure the code generation process is working correctly and debug mistakes), and they'll also need to learn our ad hoc language for adding async annotations to sync code.

Should generated code be checked in to git?

Again, both options are perfectly doable. Which is better?

Again, the main thing this would get us is that it would allow users who want to do a git clone, adjust their PYTHONPATH, and import the library, without going through pip or setup.py. This is very similar to s2a above, except note that s2a only gives this benefit for people who want to use the sync mode, while checking in the generated code gives this benefit for the async mode as well.

For contributors modifying the code, and reviewers, this adds some extra hassle: first, you have to remember to regenerate before committing. I guess this isn't a huge deal because you have to regenerate before testing anyway, and hopefully you test before committing, right? But it does mess up some workflows, e.g. if you're doing line-by-line staging to split up a change into self-contained commits. It also adds clutter to reviews -- we can check that things were regenerated properly in CI, so the reviewers don't have to actually look at the generated code, but it'll still show up in diffs.

Evaluation

Using both s2a and checking in generated artifacts definitely doesn't make sense, because their advantages are basically the same but their disadvantages are different, so you end up paying twice but only getting one benefit.

If we have to pick between s2a and checking in generated artifacts, checking in generated artifacts seems better to me: it's straightforward and easy to understand.

I am very dubious that the 'git clone and don't build' use case is important enough to justify any of these approaches. Both of these proposals add extra ongoing hassle for contributors, in order to save a small amount of hassle for a small minority of users who are doing something that arguably shouldn't be supported in the first place, and honestly we need contributors more than we need users. But the decision here is ultimately up to the urllib3 maintainers.

One more thing...

There's another proposal above that I didn't talk about yet: storing both versions, and having them both be the source of truth. So it's basically s2a and a2s and checking in the source, all at once. This sounds really unpleasant to me: basically it has all the disadvantages of all of these approaches, plus new ones like: if there's a discrepancy between the two versions in some tree, how does the tooling figure out which side should be regenerated? This feels way too confusing and over-engineered IMO.

Also, if I'm reading right, the reason @smurfix proposed bidirectionality is because he doesn't want to deal with an ad hoc encoding of async annotations into synchronous python, not that he actually wants bidirectionality per se :-).

@smurfix

smurfix commented Feb 24, 2018

Well … partly. I am proposing bidirectionality (i.e.. either version may serve as the "source") because that's the way I work. "Hack, save source, go to other window, run testcase", no matter whether I'm working on the sync or async version. No converter involved. And let's face it, pdb'ing the sync side is easier – at the very least, you don't have to deal with other threads doing interesting things behind your back every time you step across an "await". It's also too easy – if you need to find a bug that only shows up when running async. ;-)

Tooling can mechanically generate the "other" version based on which version has been modified. If the answer is "both", it should check that translating the older file matches the newer one, else complain.

Obviously, this means I'd check in both versions. Usually I don't like that at all, but in this case it'd make enough sense IMHO. Otherwise I'm strongly in favor of keeping the async version as the True Source, if only because removing async annotations is way easier than adding them.

@dstufft
Contributor

dstufft commented Apr 5, 2018

It's not up to PyPI, PyPI just presents a list of files and related information about them and the installer picks which one it installs. For modern versions of pip, it would get the most recent version whose python_requires matched.

Although I feel like this is a bit of a digression, because @njsmith has already stated that maintaining Python 2.x support isn't particularly hard if you have sync support, the hard/sticky part is maintaining sync support.

@njsmith
Contributor Author

njsmith commented Apr 6, 2018

@dstufft

I think being async first, with sync wrappers is the only real sane way of doing this. [...] I suspect it's also a fair bit slower in the sync case, since CPython has a pretty high overhead per function call

The way you wrote this makes me uncertain whether you understood the trick that we're using here :-). There are no sync wrappers; instead, we basically treat the async code as a DSL for generating the sync code. This is really easy, because for everything except the lowest-level I/O functions, the async code is already just the sync code + async/await annotations; stripping out those annotations gives us a sync version of the library. So there's no need to maintain any kind of wrapping layer, and there are no extra function calls.

The nice thing about this strategy is that it's totally generic; once the async→sync script is debugged, you can take any library that's built on top of urllib3 and turn it into a sync+async library. All you have to do is add the async/await annotations at the appropriate places in your code and then add the async→sync plugin to your setup.py.
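A hypothetical sketch of what that setup.py hook might look like, assuming the async→sync translator were factored out into a helper module (codegen_helper and generate_sync_package are invented names, not the branch's actual API):

from setuptools import setup
from setuptools.command.build_py import build_py

from codegen_helper import generate_sync_package  # hypothetical translator

class build_py_with_codegen(build_py):
    def run(self):
        # regenerate mylib/_sync/ from mylib/_async/ before building
        generate_sync_package("mylib/_async", "mylib/_sync")
        build_py.run(self)

setup(
    name="mylib",
    packages=["mylib", "mylib._async", "mylib._sync"],
    cmdclass={"build_py": build_py_with_codegen},
)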

Since you're here and working on boto these days, let me ask: do you think this is something you'd be interested in using in boto? I know @brettcannon is eager to get async support into the Azure SDK...

@dstufft
Contributor

dstufft commented Apr 6, 2018

The way you wrote this makes me uncertain whether you understood the trick that we're using here :-).

Nah I grok'd it, I just worded it poorly since I don't have a word to describe what it is.

Since you're here and working on boto these days, let me ask: do you think this is something you'd be interested in using in boto?

Possibly!

@ncoghlan

ncoghlan commented Apr 6, 2018

While I suspect you're not going to need it (given the way the rest of the discussion is going), I'll note that https://packaging.python.org/guides/dropping-older-python-versions/ exists now to cover the ins-and-outs of using Python-Requires to deliver older maintenance releases to users on older Python versions.

As far as what to call the synthetic synchronous API in Nathaniel's proposal goes, perhaps "synchronous shadow API" would work?

theacodes pushed a commit that referenced this issue Apr 20, 2018
* Implement wait_for_{read,write} directly on top of 'select' module

Our vendored backport of the 'selectors' module is huge, its tests are
flaky, and the only thing we use it for is to... turn around and
implement some trivial select() operations. It lets urllib3 use epoll
or kqueue... but the way urllib3 uses them, they're actually *less*
efficient than just using poll or select.

This commit removes the dependency on 'selectors', by implementing
urllib3's wait_for_{read,write} helpers directly on top of
poll/select. Because I'm sneaky, it does this in terms of a more
generic wait_for_socket(...) operation, which is exactly the primitive
that we need for the bleach-spike branch (see gh-1323), so this should
also help keep the diff down.

* Delete selectors backport

Now that we're not using it anymore (see previous commit), it can be deleted.

* Address comments from review

- Check that poll works before using it
- Add a test for detecting readability after the peer closes the
  socket
- Remove an unnecessary coverage-control comment

* Delay choice of wait_for_socket backend until first call

To work around things like:

    import urllib3
    from gevent import monkey
    monkey.patch_all()

Since this means we no longer make an import-time decision about
wait_for_socket support, I also got rid of the constant
HAS_WAIT_FOR_SOCKET.

* Add test for calling wait_for_socket on a closed socket

* Fix lint errors
@jmehnle

jmehnle commented Jun 11, 2018

Given how hard it is to make substantial changes to urllib3 and given that (a) the current urllib3 and (b) an async-enabled version of it would be targeting two almost completely disjoint audiences, I'd consider any solution that involves DSLs or automatic code generation solely to support old Python versions a path into a decade of pain and suffering. As a complete outsider to urllib3 development (but long-time Python developer) it seems to me the best way forward would be to fork urllib3 to urllib4 and make that support only Python 3.5+.

Assuming this as a given, we do want to support both sync and async use cases. Having done a lot of thinking on async Python lately, to me, conceptually, await simply serves to mark where cooperative context switches are allowed to occur. As such, there ought to be a way to write urllib4 using async / await pervasively, and then provide a synchronous API by transparently wrapping the async library in a trivial event "loop" (in asyncio terms) or "kernel" (in curio terms) to simply execute an async method with no context switching.

@pquentin
Member

@jmehnle The trivial event loop would look like run_secretly_sync_async_fn, right? As discussed a few comments later in that link, doing this for the whole urllib3 API is quite a lot of work, and we already get this for free if we want Python 2 support. But if you drop Python 2 support, then it's certainly an option. As noted by @shazow, it's not clear that we'll be able to finish this before Python 2 EOL anyway.
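(For reference, the trick can be sketched in a few lines: a coroutine whose awaits never actually suspend runs to completion on its very first step.)

def run_secretly_sync_async_fn(async_fn, *args, **kwargs):
    coro = async_fn(*args, **kwargs)
    try:
        coro.send(None)  # step the coroutine; blocking I/O happens inline
    except StopIteration as exc:
        return exc.value  # finished without ever suspending
    else:
        raise RuntimeError("coroutine tried to suspend; I/O is not synchronous")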

@jmehnle

jmehnle commented Jun 11, 2018

Re trivial event loop: Yeah, pretty much. I think the synchronous shim layer could be generated at run-time using meta-programming techniques.

@rspadim

rspadim commented Jun 18, 2020

Why not make everything async in the backend, and provide both async and sync APIs to the end user, where the sync API is a wrapper that gets an event loop and runs it until complete?

@sethmlarson
Member

@rspadim That was tried in HTTPX, and it was determined not to be feasible. I can't recall exactly why, but I'm sure you could dig up the reasoning in their issue tracker.

@pquentin
Member

encode/httpx#508

Dobatymo pushed a commit to Dobatymo/urllib3 that referenced this issue Mar 16, 2022
@ghost

ghost commented Apr 14, 2022

Curious to know if this has been discussed in the last few years? Is this something that will get added to urllib3 or is it likely to wait for a urllib4 or something similar?

@pquentin
Member

pquentin commented Apr 14, 2022

I've been meaning to close this issue for a while now. To be clear, the https://github.com/python-trio/hip effort had made great progress:

  • all sync tests were passing
  • we could make async requests
  • we had figured out how to run each test in sync and async (trio, anyio, asyncio)

But then making each test pass in async was going to be yet another slog, and after that I would have had to tackle requests. It was becoming lonely, and urllib3 was continuing to move fast. So hip has stalled, and I'm now focusing my limited time on urllib3 itself. The goal is to (very) slowly move towards h11 support in urllib3, and maybe eventually async support. We've learned various lessons with hip, and will apply them to urllib3 in time. The first step is to decouple urllib3 from the Python standard library's HTTP support in urllib3 2.0: #1985.

Closing the issue as there's nothing actionable left here. Thanks everyone for the support!

edit: if you want an async client today, try httpx (it supports trio and asyncio) or aiohttp (asyncio only)

@idan3

idan3 commented Apr 17, 2024

Today, can urllib3 support async?
