Support --user flag from pip #2077

Open
potiuk opened this issue Feb 29, 2024 · 21 comments · May be fixed by #2352
Labels
compatibility Compatibility with another interface e.g. `pip` great writeup A wonderful example of a quality contribution 💜

Comments

@potiuk

potiuk commented Feb 29, 2024

Currently (even in the latest uv 0.1.12, which supports the --python flag) it's not possible to use uv when pip's --user flag is used (or when the PIP_USER="true" env variable is set).

With --user flag, packages are installed to a ~/.local folder, which means that the user might not have access to the python system installation as a whole, but can locally install and uninstall packages, without having venv.
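
For reference, the per-user location that --user targets can be inspected from the standard library. This is a minimal sketch, assuming a stock CPython; the exact paths depend on the platform, the Python version, and whether PYTHONUSERBASE is set:

```python
import site

# Base directory for per-user installs (typically ~/.local on Linux).
user_base = site.getuserbase()

# Directory where `pip install --user` places importable packages.
user_site = site.getusersitepackages()

print(f"user base: {user_base}")
print(f"user site: {user_site}")
```

On Linux the second path normally sits under the first, in a `lib/pythonX.Y/site-packages` subdirectory.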

This is extremely useful for cases like CI and building container images, where having a venv is extra overhead and an unnecessary burden.

There is an interesting case for the --user flag (used in Apache Airflow): its absence currently prevents us and our users from using uv in our PROD images, in addition to our CI image. That is currently a blocker for apache/airflow#37785

Why is the --user flag useful, and why do we use it in Airflow?

Using the --user flag is pretty useful when you want to have an optimized image, because the whole .local folder can be copied between different stages of the image (based on the same base image and with the same prod libraries installed) after the packages are installed. This means that the final stage of the image does not have to have build-essentials/compilers installed.

You could do the same with a venv if you keep it in the same folder, but that loses an essential capability: dynamically creating a venv that contains all the packages you have in your .local folder. When you have your --user installed packages and you create a new venv with --system-site-packages, the packages installed in the .local folder are also available in the new venv. This does not work when you create a new venv from another venv, because --system-site-packages only exposes the packages installed in the system.
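
The --system-site-packages behaviour can be reproduced with the standard-library venv module. The sketch below is only an illustration: the helper name `create_env` and the temporary directories are made up, and pip is skipped to keep it fast. It creates one environment of each kind and reads back the `include-system-site-packages` key that venv records in `pyvenv.cfg`:

```python
import os
import tempfile
import venv
from pathlib import Path


def create_env(target: Path, system_site_packages: bool) -> str:
    """Create a venv without pip and return the text of its pyvenv.cfg."""
    builder = venv.EnvBuilder(
        system_site_packages=system_site_packages,
        with_pip=False,
        symlinks=(os.name != "nt"),  # mirror the default of `python -m venv` on POSIX
    )
    builder.create(target)
    return (target / "pyvenv.cfg").read_text()


with tempfile.TemporaryDirectory() as tmp:
    shared_cfg = create_env(Path(tmp) / "shared", system_site_packages=True)
    isolated_cfg = create_env(Path(tmp) / "isolated", system_site_packages=False)

shared_flag = "include-system-site-packages = true" in shared_cfg
isolated_flag = "include-system-site-packages = false" in isolated_cfg
print(shared_flag, isolated_flag)
```

When that key is true, the interpreter's site module keeps both the system site-packages and the per-user site directory importable from inside the venv, which is what makes packages in .local visible there.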

While I can think of some creative workarounds (maybe I will find some), having an equivalent of --user installation in uv pip install would be a great simplification for our case, to support the users who want to use uv (and it seems that there is already a need for that, looking at apache/airflow#37785).

@charliermarsh charliermarsh self-assigned this Feb 29, 2024
@charliermarsh charliermarsh added compatibility Compatibility with another interface e.g. `pip` great writeup A wonderful example of a quality contribution 💜 labels Feb 29, 2024
@charliermarsh
Member

Thanks! I really appreciate the clear write-up and motivation here. (I'm hoping to get to this today, it's been requested a few times.)

@potiuk
Author

potiuk commented Feb 29, 2024

Just to take off the pressure a bit: I figured out how to get rid of the --user flag (I still have a problem to fix) in apache/airflow#37796. I attempted it quite some time ago and failed, but this time I got a brilliant idea: I simply deleted the ~/.local folder, created a new venv in ~/.local, set the VIRTUAL_ENV=~/.local env var and added ~/.local/bin to PATH and .... voila ... It seems to work 100% compatibly with the --user flag - all the cases I had (including python -m venv --system-site-packages) work like a charm.
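
The environment changes described above can be sketched as a small pure function. The name `activation_env` and the example paths are hypothetical, and the `bin` subdirectory assumes a POSIX venv layout:

```python
import os
from pathlib import Path


def activation_env(venv_dir: Path, base_env: dict) -> dict:
    """Return a copy of base_env that points tools at the venv in venv_dir."""
    env = dict(base_env)
    env["VIRTUAL_ENV"] = str(venv_dir)
    bin_dir = str(venv_dir / "bin")  # POSIX layout; Windows venvs use "Scripts"
    existing = env.get("PATH", "")
    # Prepend the venv's bin directory so its executables win the lookup.
    env["PATH"] = bin_dir + (os.pathsep + existing if existing else "")
    return env


example = activation_env(Path("/home/airflow/.local"), {"PATH": "/usr/bin"})
print(example["VIRTUAL_ENV"])
print(example["PATH"])
```

The function leaves the input mapping untouched, so the result can be passed as `env=` to a subprocess call without affecting the current process.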

So ... It seems I do not need it that much any more .... I will make Airflow PROD image also uv friendly now :) (128s instead of 280s is the gain I have :). Not bad - another 55% improvement.

There is a small difference though between python -m venv and uv venv: with uv you do not have the --system-site-packages option yet. That might be the more useful of the two.

After that experience I have a new thought. The --user flag has a few bad side effects (that's why I am glad to finally get rid of it). One of the problems is that you cannot run --user and --editable any more with pip. So maybe - just maybe (?), rather than implementing --user flag - documenting how to get an equivalent by creating ~/.local venv is a better option ?

@charliermarsh
Member

@potiuk - At the very least I can add --system-site-packages, that's really easy. I'm somewhat undecided on --user... Gonna sleep on it.

@potiuk
Author

potiuk commented Feb 29, 2024

@potiuk - At the very least I can add --system-site-packages, that's really easy. I'm somewhat undecided on --user... Gonna sleep on it.

Yep. That would be a good start :). I think pip maintainers would gladly drop that one (--user). So my proposal for the uv team is to think very deeply about whatever you add. You are now way faster at adding things and responding to new requests compared to pip, but mainly because you do not have the whole baggage that pip has accumulated over the years, where every single small change will make some loud part of your users unhappy. I think it should be a very conscious decision to add something that you will have to maintain in the future.

@charliermarsh
Member

Yeah I strongly agree with this. Very good callout. It's nice to add things that unlock user workflows, but there are some areas where we actively want to change user behavior, and we need to hold firm on some of those lines.

@charliermarsh
Member

Can I ask why you use --system-site-packages here?

@potiuk
Author

potiuk commented Feb 29, 2024

Yes. In Airflow we have something called PythonVirtualenvOperator. Airflow has operators (4000+ of them 😱 ) that perform tasks related to some "stuff" to do - some of them are specific (GoogleCloudJobOperator, ApacheSparkJobOperator, CreateEKSClusterOperator) but we also have a few generic ones: BashOperator, PythonOperator, and ExternalPythonOperator, which runs python code in a prepared virtualenv. All of those are unaffected, but PythonVirtualenvOperator is special - because what it does is:

It creates a new virtualenv based on the requirements and python version. One of the options (only valid if your python version matches the system version) is system-site-packages - when True, the new venv will have airflow and all airflow packages pre-installed.

This is super important, because we serialize the arguments that we pass to the operator, and deserialize the return value from it, as the interface between airflow and the method executed in the "new venv". And in order to serialize some objects, the target venv should have "airflow + often other packages" installed. Moreover, the python code to be executed (basically a method) can do some local imports - expecting airflow or other packages to be installed.
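
Why the target venv needs the same packages can be illustrated with the standard-library pickle module. This is a stand-in only: Airflow's actual serialization may work differently, and `Fraction` here merely plays the role of a class that lives in an installed package:

```python
import pickle
from fractions import Fraction

payload = pickle.dumps(Fraction(1, 3))

# The payload names the defining module and class rather than embedding their
# code, so the receiving interpreter must be able to import `fractions` itself.
names_module = b"fractions" in payload
names_class = b"Fraction" in payload

# Loading succeeds here because this interpreter can import the module.
restored = pickle.loads(payload)
print(names_module, names_class, restored)
```

If the module named in the payload is not importable in the receiving environment, `pickle.loads` fails with an import error, which is the situation a shared package base avoids.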

The --system-site-packages in this case is best, because you do not have to explicitly specify which airflow version or which other packages you have to have installed. You can specify extra dependencies to add - but having the base same as the Airflow execution environment is often necessary.

Here is the how-to: https://airflow.apache.org/docs/apache-airflow/stable/howto/operator/python.html#pythonvirtualenvoperator
Here is the Python API: https://airflow.apache.org/docs/apache-airflow/stable/_api/airflow/operators/python/index.html#airflow.operators.python.PythonVirtualenvOperator

@potiuk
Author

potiuk commented Feb 29, 2024

Under the hood, the operator does:

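import sys


def _generate_virtualenv_cmd(tmp_dir: str, python_bin: str, system_site_packages: bool) -> list[str]:
    # Build the `python -m virtualenv <dir>` command line for the new environment.
    cmd = [sys.executable, "-m", "virtualenv", tmp_dir]
    if system_site_packages:
        # Give the new environment access to the system site-packages directory.
        cmd.append("--system-site-packages")
    if python_bin is not None:
        # Optionally select a different interpreter for the new environment.
        cmd.append(f"--python={python_bin}")
    return cmd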

@potiuk
Author

potiuk commented Feb 29, 2024

How it works now (I just built and tested a new image):

With system-site packages:

airflow@a249d829a411:/opt/airflow$ python -m virtualenv --system-site-packages ~/.aaaa
created virtual environment CPython3.10.13.final.0-64 in 422ms
  creator CPython3Posix(dest=/home/airflow/.aaaa, clear=False, no_vcs_ignore=False, global=True)
  seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/home/airflow/.local/share/virtualenv)
    added seed packages: pip==24.0, setuptools==69.1.0, wheel==0.42.0
  activators BashActivator,CShellActivator,FishActivator,NushellActivator,PowerShellActivator,PythonActivator
airflow@a249d829a411:/opt/airflow$ ~/.aaaa/bin/python -m pip freeze
adal==1.2.7
adlfs==2024.2.0
aiobotocore==2.12.0
aiofiles==23.2.1
aiohttp==3.9.3
aioitertools==0.11.0
aiosignal==1.3.1
alembic==1.13.1
amqp==5.2.0
anyio==4.3.0
apache-airflow @ file:///docker-context-files/apache_airflow-2.9.0.dev0-py3-none-any.whl
apache-airflow-providers-amazon @ file:///docker-context-files/apache_airflow_providers_amazon-8.18.0.dev0-py3-none-any.whl
apache-airflow-providers-celery @ file:///docker-context-files/apache_airflow_providers_celery-3.6.0.dev0-py3-none-any.whl
apache-airflow-providers-cncf-kubernetes @ file:///docker-context-files/apache_airflow_providers_cncf_kubernetes-8.0.0.dev0-py3-none-any.whl
apache-airflow-providers-common-io @ file:///docker-context-files/apache_airflow_providers_common_io-1.3.0.dev0-py3-none-any.whl
.....  AND 300+ other packages.

Without:

airflow@a249d829a411:/opt/airflow$ python -m virtualenv  ~/.bbbb
created virtual environment CPython3.10.13.final.0-64 in 113ms
  creator CPython3Posix(dest=/home/airflow/.bbbb, clear=False, no_vcs_ignore=False, global=False)
  seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/home/airflow/.local/share/virtualenv)
    added seed packages: pip==24.0, setuptools==69.1.0, wheel==0.42.0
  activators BashActivator,CShellActivator,FishActivator,NushellActivator,PowerShellActivator,PythonActivator
airflow@a249d829a411:/opt/airflow$ ~/.bbbb/bin/python -m pip freeze
airflow@a249d829a411:/opt/airflow$.

@charliermarsh
Member

The --user flag has a few bad side effects (that's why I am glad to finally get rid of it). One of the problems is that you cannot run --user and --editable any more with pip. So maybe - just maybe (?), rather than implementing --user flag - documenting how to get an equivalent by creating ~/.local venv is a better option ?

@potiuk -- I'm trying to make a decision on whether to support --user. Do you have any references you could share for these side effects in pip? 🙏

@potiuk
Author

potiuk commented Mar 12, 2024

@potiuk -- I'm trying to make a decision on whether to support --user. Do you have any references you could share for these side effects in pip? 🙏

The starting point I have is this:

[image: screenshot of a pip error message]

And I think there were lots of related discussions:

Here is one: pypa/pip#6375
And here is the PR that added the message above: pypa/pip#6370

Looong discussions. But basically the gist of it is that --editable and --user do not play well together in the light of what has been agreed for PEP 517.

Personally I think the --user feature of pip should be deprecated; besides the historical logic and the behaviour of making .local automatically available to any venv, it has very little use, if you consider that you could create .local as a venv. I think it was introduced in https://peps.python.org/pep-0370/ and there were some discussions on deprecating it.

But I am not fully aware of the history behind .local, and even less of its future, so I cannot give really "good" advice on it - take it with a pinch of salt.

We luckily got rid of the --user flag (and the PIP_USER variable) even in the pip version of our container images, and the Airflow 2.9.0 image (in a month or so) will not have it any more.

@imfing

imfing commented Mar 12, 2024

@potiuk Thank you for this issue, and thanks for the inputs you provided

Some of my thoughts regarding the comment above:

The starting point I have is this:

Using --system-site-packages while creating the virtual environment should make this error message go away.
If we take a step back, --user wouldn't be very useful in this case since there's already a virtualenv.

And I think there were lots of related discussions. Here is one: pypa/pip#6375

I skimmed through it, and it looks like the issue was mainly about --editable install breaking when there's pyproject.toml file.
It doesn't seem like --user flag had anything to do with it, since, according to the original issue: both pip install --user -e . and pip install -e . fail.

From what I understand, PEP 517 itself discusses allowing projects to specify different build systems through pyproject.toml. The build process should be independent from where the packages are installed. Not sure how that would impact user site installation.

Personally I think the --user feature of pip should be deprecated ... there were some discussions on deprecating it,

--user was introduced over a decade ago in PEP 370.
I also saw this thread about deprecating it. Some replies were not in favor of deprecation, for example:

Given that we're working towards making user site-packages the default
install location in pip, removing that feature at the interpreter
level would be rather counterproductive :)

Virtual environments are a useful tool if you're a professional
developer, but for a lot of folks just doing ad hoc personal
scripting, they're more complexity than is needed, and the simple "my
packages" vs "the system's package" split is a better option. (It's
also rather useful for bootstrapping tools like "pipsi" - "pip install
--user pipsi", then "pipsi install" the other commands you want access
to).

Cheers,
Nick.

If we do a quick search on GitHub for the --user flag: https://github.com/search?q=%22pip+install+--user%22&type=code&p=1 we'll see it's still widely used

Though I haven't used --user much lately, thanks to venv and Docker containers, it's still a popular choice for many Python users for installing packages without needing admin privileges in many environments other than CI


Putting --user flag in uv would bring some of the complexity accumulated in pip over the years.

PR #2352 was my preliminary attempt to fit it into the uv pip implementation without introducing too much complexity as the script does the heavy lifting here.

I haven't looked into every combination of how --user flag would play with other pip flags, but it would take some effort to make it compatible with the behavior of pip for sure.

Ultimately, it's up to the uv team to decide whether to support it.

@danielhollas

Virtual environments are a useful tool if you're a professional
developer, but for a lot of folks just doing ad hoc personal
scripting, they're more complexity than is needed, and the simple "my
packages" vs "the system's package" split is a better option.

This is an excellent quote, thanks @imfing for noting this. As somebody coming from the scientific Python world, I was meaning to write something along these lines in this thread as well.

Another useful framing of this is the distinction between a python project versus a single-file script. For the latter, having a separate venv for each individual script would not only be annoying but also error prone, as you'd need to be constantly activating / deactivating environments, when often all you need is stdlib and numpy/scipy.

We also have a rather oddball use case for this -- we have a very unholy setup where we provide a system python installation in a container image with a lot of pre-installed packages, but we allow users to install additional packages to ~/.local, which is backed by a docker volume so that it is persisted when the container exits. We can't have separate venvs since we have a lot of packages installed in the system environment (e.g. the whole jupyter stack) and they can't be duplicated.

Small side note: If this feature is accepted, I think providing --user should also automatically enable --strict. That's what pip does as well, and it makes a lot of sense in terms of making sure the user knows if they broke themselves.

(all that said, it's brilliant that uv defaults to requiring a venv, that's already a huge change with respect to pip, and forced me personally to start using them, where previously I've used conda + pip, not a great combo. I never quite got myself to understand the various subtleties - venv x virtualenv etc.)

@zanieb
Member

zanieb commented Mar 13, 2024

the simple "my packages" vs "the system's package" split is a better option

Note this perspective is something that we could solve with new workflows rather than matching the existing --user interface.

It's worth keeping in mind throughout this discussion that we will be building new workflows on top of the fundamental tooling we've built for pip-compatibility. If we can avoid implementing and maintaining something with complex compatibility concerns we can focus on innovation and comprehensive solutions.

@potiuk
Author

potiuk commented Mar 13, 2024

It's worth keeping in mind throughout this discussion that we will be building new workflows on top of the fundamental tooling we've built for pip-compatibility. If we can avoid implementing and maintaining something with complex compatibility concerns we can focus on innovation and comprehensive solutions.

This is also my concern - I think (and the Airflow use case proves it) that implementing something like that and having users rely on it should be a conscious decision, because it will stay with uv once the versioning and compatibility rules are set (which is inevitable), and the more compatibility concerns and use cases you have, the more they slow you down the moment you hit version 1.0.0 (assuming 1.0.0 and some kind of semver-ish approach will be adopted).

And --user is a bit of a can of worms when you open it, to be honest - especially since uv also allows creating venvs and by default effectively requires a venv. Which is ultimately a good thing, even if a few years ago I went into a long (and in hindsight unnecessary and far too heated) debate with pip maintainers over their push for venv being the primary and only valid way of installing python packages (especially in the context of docker containers). It even resulted in this medium article: https://potiuk.com/to-virtualenv-or-not-to-virtualenv-for-docker-this-is-the-question-6f980d753b46

I re-read that article (it's really funny to read such an article a few years later and confront your current views with the views of your few-years-younger self). Surprisingly (or maybe not), I tend to agree with my old self on many of the things I wrote there, but with uv opening not only a new chapter, but a new not-yet-written book, maybe there is a good solution that I can propose here?

I think PEP-370 proposal was good, and the reason described above (some people do not care about venv and they want to do things quickly) is very valid (recognising that non-power users want to just get-on with their install is also an argument in my article actually).

But the discovery I made in apache/airflow#37785 that I can simply create a venv in ~/.local made me think that maybe we can connect venv creation and non-power users approach and both eat cake and have it?

What happens currently when you start uv pip install without a venv? You get an error telling you that no virtualenv is detected and possibly that you can use --python to point to a python installation.

Is it good for non-power users? Not at all. It's confusing for first-time users who have no knowledge about venvs and have no idea where their python is. Do we want to teach them that? I am not sure. We do not want to teach people all the details about venv etc. if all they want is to install some package and use it, and even less if they are running uv pip install in their Dockerfile. What we want instead - we want them to USE venv. Even if they have no idea they are doing it.

But wait - we already have a way to create venvs (fast) in uv that we fully control here.... Why don't we ..... create a venv for such a user? Why don't we ..... create it in ~/.local (or an equivalent place on other systems) automatically if it is not there? We could even check whether ~/.local/bin is on the PATH and suggest that the user add it to make use of installed entrypoints. That's also nicely compatible with PEP-370 (which is more about using the ~/.local installation place rather than creating it).

That would be my proposal in short:

  • forget the --user flag
  • fallback to creating / using ~/.local venv when no virtualenv and no --python flag is used
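
The proposed fallback could look roughly like the sketch below. It is purely hypothetical and not uv's implementation: the function name, the labels, and the precedence of --python over the environment checks are assumptions; only the checked locations (`VIRTUAL_ENV`, `CONDA_PREFIX`, `.venv`) are taken from uv's error message quoted later in this thread:

```python
from pathlib import Path
from typing import Mapping, Optional, Tuple


def resolve_target_env(
    env: Mapping[str, str],
    cwd: Path,
    home: Path,
    python_flag: Optional[str] = None,
) -> Tuple[str, Path]:
    """Hypothetical lookup order for the install target; not uv's actual logic."""
    if python_flag:
        return ("python-flag", Path(python_flag))
    if env.get("VIRTUAL_ENV"):
        return ("virtual-env", Path(env["VIRTUAL_ENV"]))
    if env.get("CONDA_PREFIX"):
        return ("conda", Path(env["CONDA_PREFIX"]))
    if (cwd / ".venv").is_dir():
        return ("dot-venv", cwd / ".venv")
    # Proposed fallback: use (and create on demand) a default venv in ~/.local.
    return ("default-local", home / ".local")


fallback = resolve_target_env({}, Path("/nonexistent-dir"), Path("/home/user"))
explicit = resolve_target_env({"VIRTUAL_ENV": "/opt/venv"}, Path("/nonexistent-dir"), Path("/home/user"))
print(fallback, explicit)
```

Keeping the fallback as the last branch means existing workflows that already activate a venv would behave exactly as before.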

@matthew-brett

Just to say - I too would be very interested in a mechanism to support beginners who do not yet know about virtualenvs. I have taught a lot of students to use Python for data analysis, including data science. In practice these beginners (and there are a lot of them) will not need to (consciously) make virtualenvs for a while - because the standard Python data stack is pretty stable, and doesn't generate many dependency conflicts. I'd estimate these beginners can make six months or more of progress before they need to start thinking about making and switching virtualenvs. I think it will be a significant barrier if they have to learn virtualenvs and their structure before they install their first Python package. So I would love a solution like @potiuk's - where the user can ask for, or even just be given, a default virtualenv, into which they install their packages. Of course the --user install provides something like this, but I agree with @potiuk - a default virtualenv is preferable - and in fact I already found myself making that suggestion over at this homebrew discussion. I really care about this, so I am very happy to help with anything I can to test or even build (when term finishes in a couple of weeks).

@matthew-brett

matthew-brett commented Mar 16, 2024

In case it's helpful - the use-case I have in mind is - I believe - very common - and that is the beginner starting a class or an online tutorial. For example, consider this tutorial:

https://realpython.com/pygame-a-primer/

It has the instructions, right at the top, of:

pip install pygame

Or - for my students, I suggest:

pip install jupyterlab numpy scipy matplotlib

In both cases, this gets the beginner going with early Python work. But - with various installation methods - including uv pip as stands, outside Conda, this gives e.g.:

$ uv pip install pygame
error: Failed to locate a virtualenv or Conda environment (checked: `VIRTUAL_ENV`, `CONDA_PREFIX`, and `.venv`). Run `uv venv` to create a virtualenv.

Of course this isn't going to worry an experienced user, but it will be confusing to a beginner, who does not know what a virtualenv is - and - for a beginner, who doesn't understand Python installation structure, the virtualenv is a relatively advanced subject.

So - it would be very good to have some default set up, such that the beginner would not face this learning curve immediately, but face it later, when their needs and experience have expanded.

I'm posting here because I have, up until now - suggested my students do e.g. pip install --user pygame as a workaround, but I can see the arguments against doing that.

@stefanv

stefanv commented Mar 25, 2024

Maybe the topic, of whether or not to add a --user flag, obfuscates the main concern here:

New users, as @matthew-brett mentions, are often given "pip install x" instructions. This should ideally just work, and definitely not raise obscure errors that confuse new users who know nothing about virtual envs.

Ideally, I'd imagine something like: if in venv, install there, otherwise install into a local default venv. The local default venv has access to system packages, but shadows them, so that the user can upgrade packages.

This would make for a simple first user experience, compatible with all tutorials out there, without impacting sophisticated users who can set up their own venvs.

@potiuk
Author

potiuk commented Mar 25, 2024

This should ideally just work,

Yep. My proposal is that it will just install things in .local (or equivalent on other systems) venv created and used automatically by uv in this case. Bonus point to print warning if .local/bin is not on PATH
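
The PATH check mentioned as a bonus could be sketched as follows. The function names and the warning wording are made up for illustration and do not describe anything uv does today:

```python
import os
from pathlib import Path
from typing import Optional


def bin_dir_on_path(bin_dir: Path, path_value: str) -> bool:
    """Return True if bin_dir is one of the entries of a PATH-style string."""
    wanted = os.path.normpath(bin_dir)
    entries = [entry for entry in path_value.split(os.pathsep) if entry]
    return any(os.path.normpath(entry) == wanted for entry in entries)


def path_warning(bin_dir: Path, path_value: str) -> Optional[str]:
    """Return a warning message if bin_dir is missing from PATH, else None."""
    if bin_dir_on_path(bin_dir, path_value):
        return None
    return f"warning: {bin_dir} is not on PATH; installed entry points will not be found"


local_bin = Path("/home/user/.local/bin")
print(path_warning(local_bin, "/usr/local/bin" + os.pathsep + "/usr/bin"))
```

Normalizing both sides lets entries with a trailing slash still match, but the comparison is otherwise literal: symlinks and unexpanded `~` entries are not resolved.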

@Malix-off

I would definitely love that

When you're already working in a containerized development environment (nix shell / devcontainer …), it makes perfect sense (and even should be the default)

@matthew-brett

Yep. My proposal is that it will just install things in .local (or equivalent on other systems) venv created and used automatically by uv in this case. Bonus point to print warning if .local/bin is not on PATH

I wonder though, whether anyone would worry about previous --user installs either being overwritten, or appearing unexpectedly in the uv environment? And where would the packages go? In .local/site-packages or somewhere else?
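
One way to explore the question of where packages would land is to ask the standard-library sysconfig module for the install directories of both layouts under the same base. This is a sketch under assumptions: the base path is illustrative, and `posix_prefix` is used here as a stand-in for the layout of a virtual environment on POSIX:

```python
import sysconfig

base = "/home/user/.local"  # illustrative location, not read from the system

# Where a `--user` install would put pure-Python packages for this base.
user_purelib = sysconfig.get_path("purelib", scheme="posix_user", vars={"userbase": base})

# Where an environment rooted at the same directory would put them.
venv_purelib = sysconfig.get_path(
    "purelib", scheme="posix_prefix", vars={"base": base, "platbase": base}
)

print(user_purelib)
print(venv_purelib)
```

On a typical CPython on Linux both tend to resolve to a `lib/pythonX.Y/site-packages` directory under the base rather than `.local/site-packages`, which suggests that a venv created in ~/.local would share its package directory with earlier --user installs.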

8 participants