v2.0 Wishlist #776

Open
shazow opened this issue Jan 5, 2016 · 26 comments

@shazow
Member

shazow commented Jan 5, 2016

Not sure we'll do this, but if we ever get a v2.0 release, some major changes that would be nice to see:

  • Deprecate/remove outdated APIs
    • pool.urlopen(...) (in favour of consuming a pre-configured request object; the rest of the RequestMethod helpers will probably remain backwards-compatible)
    • response.data (in favour of .read())
    • Default response.read(preload_content=False) (see "Make preload_content=False the new default?", #436)
    • (What else am I forgetting?)
  • More context managers as a default API! (To allow block-based closing/release, and possibly async-ness.) See "Support with-statement on response objects to improve protection against connection leaking", #809.
  • Request history
    • We'll need a pre-configurable request object (instead of using a few dozen keyword params).
  • Autoconfig for AppEngine/Proxy settings/SSL (certifi)/others?
    • We'll need a Session-like class in front of PoolManager that is more implicit. Ideally any new feature we add which can get auto-configured based on env should happen magically here.
  • More configurable security?
    • I forget what we were doing for this, I remember we were working on some SSLContext abstraction?
  • No exposed dict registries, switch to callables. (Related: Make each PoolManager get a copy of pool_classes_by_scheme #828)
  • Return proper types on .read() (should this be bytes? or header-sensitive encoding?)
  • Unlikely long shots:

TODO: Cross-link existing issue refs.
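
To illustrate how a pre-configured request object and a context-manager response might fit together, here is a minimal sketch. Every name in it (`Request`, `SketchResponse`, `send`) is hypothetical and not an existing urllib3 API, and `send` returns canned bytes instead of doing any network I/O.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Request:
    """Hypothetical pre-configured request, replacing many keyword params."""
    method: str
    url: str
    headers: Dict[str, str] = field(default_factory=dict)
    body: Optional[bytes] = None


class SketchResponse:
    """Hypothetical response: content is only available through .read()."""

    def __init__(self, request: Request, payload: bytes) -> None:
        self.request = request
        self._payload = payload
        self.closed = False

    def read(self) -> bytes:
        # Return whatever is left and mark the body as consumed.
        data, self._payload = self._payload, b""
        return data

    def close(self) -> None:
        # A real implementation would release the connection to its pool.
        self.closed = True

    def __enter__(self) -> "SketchResponse":
        return self

    def __exit__(self, *exc_info) -> None:
        self.close()


def send(request: Request) -> SketchResponse:
    """Stand-in for a pool sending the request; performs no network I/O."""
    return SketchResponse(request, b"hello")


request = Request("GET", "http://example.invalid/")
with send(request) as response:
    body = response.read()
```

The `with` block guarantees that `close()` runs even if reading raises, which is the leak protection the context-manager item is after.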

@shazow shazow added the Someday label Jan 5, 2016
@shazow shazow added this to the v2.0 milestone Jan 5, 2016
@shazow
Member Author

shazow commented Jan 5, 2016

Any other ideas/requests? @Lukasa @sigmavirus24 @dstufft @jonparrott @t-8ch @kevinburke (feel free to cc others for feedback)

Is there any big change that would make HTTP/2 support easier?

@Lukasa
Sponsor Contributor

Lukasa commented Jan 5, 2016

I think for my part I'd like a declarative TLS API: something that expresses what you want in abstract terms and maps it to a real configuration.
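
A minimal sketch of what "declarative" could mean here, assuming a hypothetical `TLSPolicy` object and `build_context` helper (neither exists in urllib3). Only the `ssl` module calls are real standard-library APIs.

```python
import ssl
from dataclasses import dataclass


@dataclass(frozen=True)
class TLSPolicy:
    """Hypothetical abstract description of what the caller wants."""
    verify_peer: bool = True
    minimum_version: ssl.TLSVersion = ssl.TLSVersion.TLSv1_2


def build_context(policy: TLSPolicy) -> ssl.SSLContext:
    """Map the abstract policy onto a concrete SSLContext."""
    context = ssl.create_default_context()
    context.minimum_version = policy.minimum_version
    if not policy.verify_peer:
        # check_hostname must be disabled before verify_mode can be CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context


default_context = build_context(TLSPolicy())
```

The point of the split is that callers state intent ("verify the peer, TLS 1.2 or newer") and a single function owns the mapping to backend-specific settings.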

How about, if we're going really "pie in the sky", I add: asynchronous by default? ;)

@shazow
Member Author

shazow commented Jan 5, 2016

How about, if we're going really "pie in the sky", I add: asynchronous by default? ;)

I'll add to the long-shot list. Any ideas on what would need to change for this to be possible/what it would look like?

@ml31415
Contributor

ml31415 commented Jan 5, 2016

About this httplib replacement: as I had a look into this lately, I'd say replacing it wouldn't be too complicated. In the end, it's only three classes in there. As a starter, it would be enough to copy the imported parts from httplib, make them compatible with six, and then start simplifying and refactoring under the hood, step by step. I saw the header management has some overlap and could be simplified. You'd probably find some other spots. There is also a bunch of weird stuff from ancient Python times in there, like using a dict without setting values instead of a set, etc.

All in all, the httplib replacement sounds more like a WIP thing to me, as I guess it wouldn't break any public API. The question remains: why? In the end, httplib is in the standard library. So even if it's a bit ugly itself, I'm not sure it's worth reinventing the wheel.

@shazow
Member Author

shazow commented Jan 5, 2016

@ml31415 There is tons of code in urllib3 working around httplib's shortcomings, things like unmangling headers and bugs in the state machine and others. Especially differences in the py2 vs py3 versions. I wouldn't want to necessarily fork/rewrite httplib as a vendored lib, but moreso switch to a different implementation of the parser/state machine (maybe even @Lukasa's hyper, when it's ready).

@theacodes
Member

We'll need a Session-like class in front of PoolManager that is more implicit.

I like this but I don't have a strong need. Personally, I would just like a better name for the top-level class. PoolManager seems a bit... weird, especially when some Managers (like ProxyManager and AppEngineManager) don't actually do any managing. But then again, this is a low-level library. 🤷

@shazow
Member Author

shazow commented Jan 5, 2016

Right, this would be a higher-level frontend for the other lower-level components. Maybe feature a kind of registry that external plugins can provide additional services for.

@theacodes
Member

Maybe feature a kind of registry that external plugins can provide additional services for.

Sounds reasonable.

@Lukasa
Sponsor Contributor

Lukasa commented Jan 6, 2016

Any ideas on what would need to change for this to be possible/what it would look like?

Well, we'd definitely need to move away from httplib in the first instance.

I'd need to talk to @hawkowl because I know she has thoughts in this area, but it ought to be possible to write urllib3's protocol layer (the httplib replacement, plus some other bits) as non-IO-doing synchronous code, and then have the outer layer talk to sockets. That would let us have a series of manager classes:

  • PoolManager (implicitly synchronous but thread-safe)
  • TwistedPoolManager
  • AsyncioPoolManager

In this model, I think the only things that would vary for each PoolManager would be:

  • how they do IO (standard would use sockets, Twisted would use Twisted Transports, Asyncio would something something protocols)
  • how they pool connections

Basically everything else would be the same.

I'm kinda interested in this idea, but to pull it off I think I need to write something like hyper-h2 but for HTTP/1.1, unless we can find a good off-the-shelf version.
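
A very rough sketch of what "non-IO-doing synchronous code" could look like for HTTP/1.1. The `ResponseHeadParser` class is hypothetical (it is not the hyper-h2 or urllib3 API) and only handles the status line and headers. The caller owns the socket, Twisted transport, or asyncio protocol and just feeds bytes in.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ResponseHead:
    status: int
    reason: str
    headers: Dict[str, str] = field(default_factory=dict)


class ResponseHeadParser:
    """Hypothetical sans-IO parser: it never reads from or writes to a socket."""

    def __init__(self) -> None:
        self._buffer = b""

    def receive_data(self, data: bytes) -> None:
        # Called by whichever IO layer is in use with the bytes it received.
        self._buffer += data

    def next_event(self) -> Optional[ResponseHead]:
        head, separator, rest = self._buffer.partition(b"\r\n\r\n")
        if not separator:
            return None  # Header block is incomplete; the caller must feed more.
        self._buffer = rest  # Keep any body bytes for a later parsing stage.
        lines = head.decode("iso-8859-1").split("\r\n")
        _version, status, *reason = lines[0].split(" ", 2)
        headers = {}
        for line in lines[1:]:
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        return ResponseHead(int(status), reason[0] if reason else "", headers)


parser = ResponseHeadParser()
parser.receive_data(b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello")
head = parser.next_event()
```

Because the parser has no opinion about where bytes come from, the same class could sit behind a blocking, Twisted, or asyncio pool manager, and only the thin IO wrapper would differ.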

@Lukasa
Sponsor Contributor

Lukasa commented Jan 6, 2016

That said, a very basic hyper-h2-alike for HTTP/1.1 shouldn't be that hard to write if we just want to spike it out and see how it looks.

@shazow
Member Author

shazow commented Jan 12, 2016

That said, a very basic hyper-h2-alike for HTTP/1.1 shouldn't be that hard to write if we just want to spike it out and see how it looks.

+1, I too suspect it might be less work than it seems but more than I'd want to commit to right now. Certainly way less work than doing HTTP/2.0.

@anatasiajp

I know this feature may be a little weird, but can we have HTTP 0.9 support for Python 3+? Python's developers completely removed HTTP 0.9 from Python 3+, but HTTP 0.9 is still useful in some cases. For example, some internet router models need HTTP 0.9 to log in to the admin control panel.

@sigmavirus24
Contributor

I don't think that makes sense for the scope of urllib3 @cattleyavns

@shazow
Member Author

shazow commented Jan 13, 2016

@sigmavirus24 How come? If we're going to switch out to our own http/1.1 parser, adding 1.0 and 0.9 support should be relatively easy in comparison (largely a subset with minor tweaks iirc).

@anatasiajp

I think we should have a new feature combining Retry and Timeout: if the download speed is too low, we should restart the download. As a real-world example, when we try to download a YouTube video from Googlevideo.com, the download speed is sometimes slow, but if we simply stop the download and start it again, we get a better download speed.

@Lukasa
Sponsor Contributor

Lukasa commented Jan 18, 2016

@cattleyavns Specifically I think the problem here is that we have no automated retry for streaming downloads. That's a much harder thing to build.
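
For the detection half only, a minimal sketch assuming a hypothetical `enforce_min_speed` wrapper around any iterable of body chunks (this is not an urllib3 feature). It raises once average throughput falls below a threshold, and deliberately leaves out the hard part, which is resuming or retrying a partially consumed stream.

```python
import time
from typing import Callable, Iterable, Iterator


class TransferTooSlow(Exception):
    """Hypothetical error raised when average throughput is too low."""


def enforce_min_speed(
    chunks: Iterable[bytes],
    min_bytes_per_sec: float,
    grace_seconds: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[bytes]:
    start = clock()
    received = 0
    for chunk in chunks:
        received += len(chunk)
        elapsed = clock() - start
        # Skip the check during the grace period so slow starts are tolerated.
        if elapsed > grace_seconds and received / elapsed < min_bytes_per_sec:
            raise TransferTooSlow(
                f"{received} bytes in {elapsed:.1f}s is below {min_bytes_per_sec} B/s"
            )
        yield chunk


# In-memory chunks arrive instantly, so the check never triggers here.
body = b"".join(enforce_min_speed([b"abc", b"def"], min_bytes_per_sec=1))
```

A caller could catch `TransferTooSlow` and reissue the request, but anything already yielded to the consumer cannot be taken back, which is why automated retry for streaming downloads is so much harder.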

@ml31415
Contributor

ml31415 commented Feb 2, 2016

Another wishlist thing. Not sure how you see this, but imho unittest is a major pain in the ass. Clunky syntax, no way to properly parametrize tests, inflexible, unpythonic, ... pytest is so much more pleasant to work with and write tests, I'd love to see that in urllib3, too. The conversion could also be done step by step, as pytest supports unittest syntax, which works quite well.

@shazow
Member Author

shazow commented Feb 2, 2016

@ml31415 +0 to migrating to pytest, though I don't think we need to wait for 2.0 for this. It's not user-facing so it could be done at any point.

@ml31415
Contributor

ml31415 commented Feb 2, 2016

Agreed, but it's a major change and piece of work. The main question is: Is it desired at all by the core devs?

@shazow
Member Author

shazow commented Feb 2, 2016

@ml31415 It's definitely meaty but, as you said, could be done incrementally.

I'm all for improving/cleaning up our tests, but it's not really blocking anything so it's not on any of the core devs' plates (hence my +0 vote, aka nice-to-have). If it's something that you're interested in attempting, I think I speak for everyone when I say we very much welcome it. :)

@ml31415
Contributor

ml31415 commented Feb 3, 2016

Ok, fair enough. In that case, I may actually give this a shot sometime. I was mainly afraid the whole idea might be rejected due to adding further dependencies or something.

@brettcannon

I commented on Twitter how I would love to see an HTTP response ABC to help standardize libraries. @Lukasa then said people should comment here on the topic, so here I am. 😄

So whatever cleaned-up API is chosen, it would be great if an ABC is available to help make sure various projects follow a common API where it makes sense (I realize there will be sync/async differences when it comes to certain things, but those could actually be specific ABC subclasses to provide those solutions to deal with any Python 2/3 divide).
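
As a rough sketch of the idea, assuming a hypothetical `HTTPResponseABC` with a deliberately tiny surface (the names and the choice of members are illustrative, not a proposed standard):

```python
from abc import ABC, abstractmethod
from typing import Mapping


class HTTPResponseABC(ABC):
    """Hypothetical common interface for synchronous HTTP responses."""

    @property
    @abstractmethod
    def status(self) -> int:
        """Numeric status code, e.g. 200."""

    @property
    @abstractmethod
    def headers(self) -> Mapping[str, str]:
        """Response headers."""

    @abstractmethod
    def read(self, amt: int = -1) -> bytes:
        """Return up to amt bytes of the body, or the rest if amt < 0."""


class InMemoryResponse(HTTPResponseABC):
    """Toy implementation used only to show the ABC being satisfied."""

    def __init__(self, status: int, headers: Mapping[str, str], body: bytes) -> None:
        self._status = status
        self._headers = dict(headers)
        self._body = body

    @property
    def status(self) -> int:
        return self._status

    @property
    def headers(self) -> Mapping[str, str]:
        return self._headers

    def read(self, amt: int = -1) -> bytes:
        if amt < 0:
            amt = len(self._body)
        data, self._body = self._body[:amt], self._body[amt:]
        return data


response = InMemoryResponse(200, {"content-type": "text/plain"}, b"hello world")
first_word = response.read(5)
```

An async variant could be a sibling ABC whose `read` is a coroutine, which is one way to handle the sync/async differences mentioned above.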

@yan12125

Hello, is there any further plan or progress on this topic? I'm looking forward to a urllib3 that does not depend on Python's built-in http.client, so that more features become possible. An example is requests/toolbelt#136.

@haikuginger
Contributor

@yan12125, I think a lot of the people who work on the project have had other commitments, so work has slowed down somewhat. Obviously we all want to get there, but unless I'm mistaken, we don't currently have anyone on the team working on OSS full-time like we've been lucky enough to have in the past.

@bluetech
Member

bluetech commented Apr 15, 2023

In 2016 shazow wrote (emphasis mine):

ml31415 There is tons of code in urllib3 working around httplib's shortcomings, things like unmangling headers and bugs in the state machine and others. Especially differences in the py2 vs py3 versions. I wouldn't want to necessarily fork/rewrite httplib as a vendored lib, but moreso switch to a different implementation of the parser/state machine (maybe even Lukasa's hyper, when it's ready).

I figure http.client is not going to be replaced for 2.0. But maybe it makes sense to reconsider the idea of copying http.client over to urllib3, instead of using the stdlib? Then it can be slowly customized and fixed up in urllib3, and leaves urllib3 2.x with full control over its fate. Is there anything preventing urllib3 from doing this?

@pquentin
Member

@bluetech Thanks, this is an interesting idea! However, I think it goes in the wrong direction, as @shazow explained back in 2016. We don't currently have big complaints about http.client (the new name of httplib). The urllib3 main branch is Python 3.7+, so Python 2 compatibility is not a problem anymore, and I'm not aware of recent bugs that caused us pain. (By contrast, the ssl module has been a constant source of churn over the last few years.)

However, http.client does not support HTTP/2 or HTTP/3 and cannot be used asynchronously, so we'll need a replacement if urllib3 wants to support those features, and we will want to reuse existing libraries like h11 or h2 to do so.
