Skip to content

Commit

Permalink
Change importlib to first try to import modules using the standard me…
Browse files Browse the repository at this point in the history
…chanism

As detailed in pytest-dev#11475 (comment), currently with `--import-mode=importlib` pytest will try to import every file by using a unique module name, regardless if that module could be imported using the normal import mechanism without touching `sys.path`.

This has the consequence that non-test modules available in `sys.path` (via other mechanism, such as being installed into a virtualenv, PYTHONPATH, etc) would end up being imported as standalone modules, instead of imported with their expected module names.

To illustrate:

```
.env/
  lib/
    site-packages/
      anndata/
        core.py
```

Given `anndata` is installed into the virtual environment, `python -c "import anndata.core"` works, but pytest with `importlib` mode would import that module as a standalone module named `".env.lib.site-packages.anndata.core"`, because importlib module was designed to import test files which are not reachable from `sys.path`, but now it is clear that normal modules should be imported using the standard mechanisms if possible.

Now `imporlib` mode will first try to import the module normally, without changing `sys.path`, and if that fails it falls back to importing the module as a standalone module.

This supersedes pytest-dev#11931.

Fix pytest-dev#11475
Close pytest-dev#11931
  • Loading branch information
nicoddemus committed Feb 18, 2024
1 parent 5f241f3 commit 5d90d97
Show file tree
Hide file tree
Showing 4 changed files with 250 additions and 24 deletions.
6 changes: 4 additions & 2 deletions doc/en/explanation/goodpractices.rst
Original file line number Diff line number Diff line change
Expand Up @@ -60,8 +60,10 @@ Within Python modules, ``pytest`` also discovers tests using the standard
:ref:`unittest.TestCase <unittest.TestCase>` subclassing technique.


Choosing a test layout / import rules
-------------------------------------
.. _`test layout`:

Choosing a test layout
----------------------

``pytest`` supports two common test layouts:

Expand Down
61 changes: 50 additions & 11 deletions doc/en/explanation/pythonpath.rst
Original file line number Diff line number Diff line change
Expand Up @@ -16,10 +16,13 @@ import process can be controlled through the ``--import-mode`` command-line flag
these values:

* ``prepend`` (default): the directory path containing each module will be inserted into the *beginning*
of :py:data:`sys.path` if not already there, and then imported with the :func:`importlib.import_module <importlib.import_module>` function.
of :py:data:`sys.path` if not already there, and then imported with
the :func:`importlib.import_module <importlib.import_module>` function.

This requires test module names to be unique when the test directory tree is not arranged in
packages, because the modules will put in :py:data:`sys.modules` after importing.
It is highly recommended to arrange your test modules as packages by adding ``__init__.py`` files to your directories
containing tests. If the test directory tree is not arranged as packages, then each test file needs to have a unique name
compared to the other test files, because the modules will put in :py:data:`sys.modules` after importing
therefore they need to be globally unique.

This is the classic mechanism, dating back from the time Python 2 was still supported.

Expand All @@ -38,26 +41,62 @@ these values:
the tests will run against the installed version
of ``pkg_under_test`` when ``--import-mode=append`` is used whereas
with ``prepend`` they would pick up the local version. This kind of confusion is why
we advocate for using :ref:`src <src-layout>` layouts.
we advocate for using :ref:`src-layouts <src-layout>`.

Same as ``prepend``, requires test module names to be unique when the test directory tree is
not arranged in packages, because the modules will put in :py:data:`sys.modules` after importing.

* ``importlib``: new in pytest-6.0, this mode uses more fine control mechanisms provided by :mod:`importlib` to import test modules. This gives full control over the import process, and doesn't require changing :py:data:`sys.path`.
* ``importlib``: this mode uses more fine control mechanisms provided by :mod:`importlib` to import test modules, without changing :py:data:`sys.path`.

For this reason this doesn't require test module names to be unique.
It works like this:

One drawback however is that test modules are non-importable by each other. Also, utility
modules in the tests directories are not automatically importable because the tests directory is no longer
added to :py:data:`sys.path`.
1. Given a certain module path, for example ``tests/core/test_models.py``, derives a canonical name
like ``tests.core.test_models`` and tries to import it.

Initially we intended to make ``importlib`` the default in future releases, however it is clear now that
it has its own set of drawbacks so the default will remain ``prepend`` for the foreseeable future.
For test modules this will usually fail, as test modules are not usually importable, but will work for modules that are installed into a virtualenv.

For non-test modules this also will usually work if they are installed/accessible via ``sys.path``, so
for example ``.env/lib/site-packages/app/core.py`` will be importable as ``app.core``. This is desirable
in general, and happens when plugins import non-test modules for other purposes, for example doctesting.

If this step succeeds, the module is returned.

2. If the previous step fails, we import the module directly using ``importlib`` facilities, which lets us import it without
changing :py:data:`sys.path`.

Because Python requires the module to also be available in ``sys.modules``, pytest derives unique name for it based
on its relative location from the ``rootdir`` (for example ``tests.core.test_models``)
and adds the module to ``sys.modules``.

Advantages of this mode:

* pytest will not change ``sys.path`` at all.
* Test module names do not need to be unique -- pytest will generate a unique name automatically based on the ``rootdir``.

Disadvantages:

* Test modules are non-importable by each other.
* Testing utility modules (for example a ``tests.helpers`` module containing test-related functions/classes)
in the tests directories are not importable because the tests directory is no longer
added to :py:data:`sys.path`.

Note that by "test utility modules" we mean functions/classes which are imported by
other tests directly; this does not include fixtures, which should be placed in ``conftest.py`` files and are
discovered automatically by pytest.

.. versionadded:: 6.0

.. note::

Initially we intended to make ``importlib`` the default in future releases, however it is clear now that
it has its own set of drawbacks so the default will remain ``prepend`` for the foreseeable future.

.. seealso::

The :confval:`pythonpath` configuration variable.

:ref:`test layout`.


``prepend`` and ``append`` import modes scenarios
-------------------------------------------------
Expand Down
70 changes: 59 additions & 11 deletions src/_pytest/pathlib.py
Original file line number Diff line number Diff line change
Expand Up @@ -522,6 +522,26 @@ def import_path(
raise ImportError(path)

if mode is ImportMode.importlib:
# Obtain the canonical name of this path if we can resolve it to a python package,
# and try to import it without changing sys.path.
# If it works, we import it and return the module.
_, module_name = resolve_pkg_root_and_module_name(path)
try:
importlib.import_module(module_name)
except (ImportError, ImportWarning):
# Cannot be imported normally with the current sys.path, so fallback
# to the more complex import-path mechanism.
pass
else:
# Double check that the imported module is the one we
# were given, otherwise it is easy to import modules with common names like "test"
# from another location.
mod = sys.modules[module_name]
if mod.__file__ and Path(mod.__file__) == path:
return mod

# Could not import the module with the current sys.path, so we fall back
# to importing the file as a standalone module, not being a part of a package.
module_name = module_name_from_path(path, root)
with contextlib.suppress(KeyError):
return sys.modules[module_name]
Expand All @@ -541,16 +561,7 @@ def import_path(
insert_missing_modules(sys.modules, module_name)
return mod

pkg_path = resolve_package_path(path)
if pkg_path is not None:
pkg_root = pkg_path.parent
names = list(path.with_suffix("").relative_to(pkg_root).parts)
if names[-1] == "__init__":
names.pop()
module_name = ".".join(names)
else:
pkg_root = path.parent
module_name = path.stem
pkg_root, module_name = resolve_pkg_root_and_module_name(path)

# Change sys.path permanently: restoring it at the end of this function would cause surprising
# problems because of delayed imports: for example, a conftest.py file imported by this function
Expand Down Expand Up @@ -628,7 +639,14 @@ def module_name_from_path(path: Path, root: Path) -> str:
if len(path_parts) >= 2 and path_parts[-1] == "__init__":
path_parts = path_parts[:-1]

return ".".join(path_parts)
module_name = ".".join(path_parts)
# Modules starting with "." are considered relative, but given we
# are returning a made-up path that is intended to be imported as a global package and
# not as a relative module, replace the "." at the start with "_", which should be enough
# for our purposes.
if module_name.startswith("."):
module_name = "_" + module_name[1:]
return module_name


def insert_missing_modules(modules: Dict[str, ModuleType], module_name: str) -> None:
Expand Down Expand Up @@ -689,6 +707,36 @@ def resolve_package_path(path: Path) -> Optional[Path]:
return result


def resolve_pkg_root_and_module_name(path: Path) -> Tuple[Path, str]:
"""
Return the path to the directory of the root package that contains the
given Python file, and its module name:
src/
app/
__init__.py
core/
__init__.py
models.py
Passing the full path to `models.py` will yield Path("src") and "app.core.models".
If the given path does not belong to a package (missing __init__.py files), returns
just the parent path and the name of the file as module name.
"""
pkg_path = resolve_package_path(path)
if pkg_path is not None:
pkg_root = pkg_path.parent
names = list(path.with_suffix("").relative_to(pkg_root).parts)
if names[-1] == "__init__":
names.pop()
module_name = ".".join(names)
else:
pkg_root = path.parent
module_name = path.stem
return pkg_root, module_name


def scandir(
path: Union[str, "os.PathLike[str]"],
sort_key: Callable[["os.DirEntry[str]"], object] = lambda entry: entry.name,
Expand Down
137 changes: 137 additions & 0 deletions testing/test_pathlib.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
import os.path
from pathlib import Path
import pickle
import shutil
import sys
from textwrap import dedent
from types import ModuleType
Expand All @@ -25,6 +26,7 @@
from _pytest.pathlib import maybe_delete_a_numbered_dir
from _pytest.pathlib import module_name_from_path
from _pytest.pathlib import resolve_package_path
from _pytest.pathlib import resolve_pkg_root_and_module_name
from _pytest.pathlib import safe_exists
from _pytest.pathlib import symlink_or_skip
from _pytest.pathlib import visit
Expand Down Expand Up @@ -598,6 +600,29 @@ def test_module_name_from_path(self, tmp_path: Path) -> None:
result = module_name_from_path(tmp_path / "__init__.py", tmp_path)
assert result == "__init__"

# Modules which start with "." are considered relative and will not be imported
# unless part of a package, so we replace it with a "_" when generating the fake module name.
result = module_name_from_path(tmp_path / ".env/tests/test_foo.py", tmp_path)
assert result == "_env.tests.test_foo"

def test_resolve_pkg_root_and_module_name(self, tmp_path: Path) -> None:
# Create a directory structure first without __init__.py files.
(tmp_path / "src/app/core").mkdir(parents=True)
models_py = tmp_path / "src/app/core/models.py"
models_py.touch()
assert resolve_pkg_root_and_module_name(models_py) == (
models_py.parent,
"models",
)

# Create the __init__.py files, it should now resolve to a proper module name.
(tmp_path / "src/app/__init__.py").touch()
(tmp_path / "src/app/core/__init__.py").touch()
assert resolve_pkg_root_and_module_name(models_py) == (
tmp_path / "src",
"app.core.models",
)

def test_insert_missing_modules(
self, monkeypatch: MonkeyPatch, tmp_path: Path
) -> None:
Expand Down Expand Up @@ -669,6 +694,34 @@ def __init__(self) -> None:
mod = import_path(init, root=tmp_path, mode=ImportMode.importlib)
assert len(mod.instance.INSTANCES) == 1

def test_importlib_doctest(self, monkeypatch: MonkeyPatch, tmp_path: Path) -> None:
"""
Importing a package using --importmode=importlib should
import the package using the canonical name
"""
proj_dir = tmp_path / "proj"
proj_dir.mkdir()
pkgs_dir = tmp_path / "pkgs"
pkgs_dir.mkdir()
monkeypatch.chdir(proj_dir)
monkeypatch.syspath_prepend(pkgs_dir)
# this is also there, but shouldn't be imported from
monkeypatch.syspath_prepend(proj_dir)

package_name = "importlib_doctest"
# pkgs_dir is second to set `init`
for directory in [proj_dir / "src", pkgs_dir]:
pkgdir = directory / package_name
pkgdir.mkdir(parents=True)
init = pkgdir / "__init__.py"
init.write_text("", encoding="ascii")

# the PyTest root is `proj_dir`, but the package is imported from `pkgs_dir`
mod = import_path(init, root=proj_dir, mode=ImportMode.importlib)
# assert that it's imported with the canonical name, not “path.to.package.<name>”
mod_names = [n for n, m in sys.modules.items() if m is mod]
assert mod_names == ["importlib_doctest"]

def test_importlib_root_is_package(self, pytester: Pytester) -> None:
"""
Regression for importing a `__init__`.py file that is at the root
Expand All @@ -685,6 +738,90 @@ def test_my_test():
result = pytester.runpytest("--import-mode=importlib")
result.stdout.fnmatch_lines("* 1 passed *")

def create_installed_doctests_and_tests_dir(
self, path: Path, monkeypatch: MonkeyPatch
) -> None:
"""
Create a directory structure where the application code is installed in a virtual environment,
and the tests are in an outside ".tests" directory.
"""
app = path / "src/app"
app.mkdir(parents=True)
(app / "__init__.py").touch()
core_py = app / "core.py"
core_py.write_text(
dedent(
"""
def foo():
'''
>>> 1 + 1
2
'''
"""
),
encoding="ascii",
)

# Install it into a site-packages directory, and add it to sys.path, mimicking what
# happens when installing into a virtualenv.
site_packages = path / ".env/lib/site-packages"
site_packages.mkdir(parents=True)
shutil.copytree(app, site_packages / "app")
assert (site_packages / "app/core.py").is_file()

monkeypatch.syspath_prepend(site_packages)

# Create the tests file, outside 'src' and the virtualenv.
test_path = path / ".tests/test_core.py"
test_path.parent.mkdir(parents=True)
test_path.write_text(
"import app.core\n\ndef test(): pass",
encoding="ascii",
)

def test_import_using_normal_mechanism_first(
self, monkeypatch: MonkeyPatch, pytester: Pytester
) -> None:
"""
Test import_path imports from the canonical location when possible first, only
falling back to its normal flow when the module being imported is not reachable via sys.path (#11475).
"""
self.create_installed_doctests_and_tests_dir(pytester.path, monkeypatch)

# Imported from installed location via sys.path.
core_py = pytester.path / ".env/lib/site-packages/app/core.py"
assert core_py.is_file()
mod = import_path(core_py, mode="importlib", root=pytester.path)
assert mod.__name__ == "app.core"
assert mod.__file__ and Path(mod.__file__) == core_py

# Imported as a standalone module.
# Instead of '.tests.test_core', we import as "_tests.test_core" because
# importlib considers module names starting with '.' to be local imports.
test_path = pytester.path / ".tests/test_core.py"
assert test_path.is_file()
mod = import_path(test_path, mode="importlib", root=pytester.path)
assert mod.__name__ == "_tests.test_core"

def test_import_using_normal_mechanism_first_integration(
self, monkeypatch: MonkeyPatch, pytester: Pytester
) -> None:
"""
Same test as above, but verify the behavior calling pytest.
We should not make this call in the same test as above, as the modules have already
been imported by separate import_path() calls.
"""
self.create_installed_doctests_and_tests_dir(pytester.path, monkeypatch)
result = pytester.runpytest(
"--import-mode=importlib",
"--doctest-modules",
"--pyargs",
"app",
"./.tests",
)
assert result.parseoutcomes() == {"passed": 2}


def test_safe_exists(tmp_path: Path) -> None:
d = tmp_path.joinpath("some_dir")
Expand Down

0 comments on commit 5d90d97

Please sign in to comment.