Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ModuleNotFoundError when using --doctest-modules, a src/ layout, and --import-mode=importlib (and no editable mode install) #11475

Closed
flying-sheep opened this issue Oct 2, 2023 · 29 comments · Fixed by #11997
Labels
topic: collection related to the collection phase

Comments

@flying-sheep
Copy link
Contributor

flying-sheep commented Oct 2, 2023

reproducer

We’d like to run our tests on an installed package instead of the development version, as we delete some files in our build process. We therefore don’t use editable installs.

I think the problem is that there seems to be no way to configure the import root used with the importlib import mode. The project rootdir gets passed here instead of something user configurable:

module = import_path(
self.path,
root=self.config.rootpath,
mode=self.config.getoption("importmode"),
)

Which means that the following code will run module_name = module_name_from_path('<rootdir>/src/anndata/_types.py', '<rootdir>'), i.e. module_name = 'src.anndata._types' when it should be 'anndata._types'

if mode is ImportMode.importlib:
module_name = module_name_from_path(path, root)

Without src/ layout, this works accidentally, as the module_name happens to match the real module name, and the module only gets imported once.

Pytest 7.4.2

@richardxia
Copy link

I just wanted to add that we are hitting this issue as well, but we are using editable installs, so it looks like this issue affects both the case with editable and non-editable installs of the project source files.

Just to be clear, our project is:

  • an editable installation
  • has a src/ layout
  • puts non-doctest tests outside the application code, in a tests/ directory that is a sibling to the src/ directory
  • uses --import-mode=importlib
  • (attempts to) use --doctest-modules

We more or less exactly follow the suggested layout and settings in https://docs.pytest.org/en/7.1.x/explanation/goodpractices.html#tests-outside-application-code, and everything was working fine until attempting to run doctests within the application code using --doctest-modules.

@flying-sheep
Copy link
Contributor Author

Yeah, could this please be added to the 8.0 milestone?

This might require a new setting or some breaking change, so this is the ideal moment

@bluetech
Copy link
Member

Does it work better if you use pytest --doctest-modules --import-mode=importlib --pyargs my.package?

@flying-sheep
Copy link
Contributor Author

flying-sheep commented Jan 12, 2024

Actually yes! It’s getting really complicated though:

[tool.pytest.ini_options]
addopts = [
    "--strict-markers",
    "--doctest-modules",
    "--pyargs",
]
testpaths = [
    "anndata", # doctests in docstrings (module name due to --pyargs)
    "./src/anndata/tests", # unit tests
    "./docs/concatenation.rst", # further doctests
]

and then in src/anndata/tests/conftest.py a collect_ignore = ["helpers.py"]. But it works.

@bluetech
Copy link
Member

In the issue subject you write --import-mode=importlib but you are not using that right?

If you give me a bit of instructions how to reproduce this locally, I can take a look.

@flying-sheep
Copy link
Contributor Author

flying-sheep commented Jan 12, 2024

Sorry, I muddled my project (scverse/anndata#1151) too much with this issue. I’m trying to be clear now.

Adding --import-mode=importlib to the above results in this:

E   ModuleNotFoundError: No module named
  'home.phil..local.share.hatch.[…].site-packages.anndata._core';
  'home.phil..local.share.hatch.[…].site-packages.anndata' is not a package

What needs to work for this issue to be considered fixed:

+ src/my_module/__init__.py  # containing a doctest
+ tests/test_basic.py
[build-system]
build-backend = "hatchling.build"
requires = ["hatchling", "hatch-vcs"]

[project]
name = "my_module"
version = "0.1.0"

[tool.pytest.ini_options]
addopts = [
    "--import-mode=importlib",
    "--strict-markers",
    "--doctest-modules",
    "--pyargs",
]
testpaths = [
    "my_module",  # or without `--pyargs`: "./src/my_module"
    "./tests",
]
What would be nice if it worked for migrating our project:
+ src/my_module/
  + __init__.py  # containing a doctest
  + tests/
    + conftest.py  # containing `collect_ignore = ["helpers.py"]`
    + helpers.py
    + test_basic.py
...

[tool.pytest.ini_options]
addopts = [
    "--import-mode=importlib",
    "--strict-markers",
    "--doctest-modules",
    "--pyargs",
]
testpaths = [
    "my_module",
    "./src/my_module/tests",
]

@flying-sheep
Copy link
Contributor Author

flying-sheep commented Jan 12, 2024

Very weird, I just can’t find a good reproducer. In our project, it fails: https://dev.azure.com/scverse/anndata/_build/results?buildId=5213&view=logs&jobId=5ea502cf-d418-510c-3b5f-c4ba606ae534&j=5ea502cf-d418-510c-3b5f-c4ba606ae534&t=5091afe0-1787-5ef7-c6a1-ce6dc06b30a7

platform linux -- Python 3.11.7, pytest-7.4.4, pluggy-1.3.0
rootdir: /home/vsts/work/1/s
configfile: pyproject.toml
testpaths: anndata, ./src/anndata/tests, ./docs/concatenation.rst
plugins: xdist-3.5.0, memray-1.5.0, anyio-4.2.0, cov-4.1.0
collected 3028 items / 20 errors

==================================== ERRORS ====================================
__________________________ ERROR collecting utils.py ___________________________
/opt/hostedtoolcache/Python/3.11.7/x64/lib/python3.11/site-packages/anndata/utils.py:13: in <module>
    from ._core.sparse_dataset import BaseCompressedSparseDataset
E   ModuleNotFoundError: No module named 'opt.hostedtoolcache.Python.3.11.7.x64.lib.python3.11.site-packages.anndata._core'; 'opt.hostedtoolcache.Python.3.11.7.x64.lib.python3.11.site-packages.anndata' is not a package
__________________ ERROR collecting _core/aligned_mapping.py ___________________
/opt/hostedtoolcache/Python/3.11.7/x64/lib/python3.11/site-packages/anndata/_core/aligned_mapping.py:23: in <module>
    from ..utils import deprecated, dim_len, ensure_df_homogeneous, warn_once
E   ImportError: cannot import name 'deprecated' from 'opt.hostedtoolcache.Python.3.11.7.x64.lib.python3.11.site-packages.anndata.utils' (/opt/hostedtoolcache/Python/3.11.7/x64/lib/python3.11/site-packages/anndata/utils.py)
...

but using the “reproducer” I so confidently outlined above, everything works.

@flying-sheep
Copy link
Contributor Author

flying-sheep commented Jan 12, 2024

OK, I minified the repo somewhat. I assume the problem is that even with --pyargs, pytest tries to import the module with weird names while running doctest.

https://github.com/flying-sheep/pytest-doctest-import-mismatch

I see errors like this:

____ ERROR collecting _core/merge.py ____
/home/phil/.local/share/hatch/env/virtual/anndata/kX3YdB0h/anndata/lib/python3.11/site-packages/anndata/_core/merge.py:28: in <module>
   from ..compat import _map_cat_to_str
E   ModuleNotFoundError: No module named 'home.phil..local.share.hatch.env.virtual.anndata.kX3YdB0h.anndata.lib.python3.11.site-packages.anndata.compat'; 'home.phil..local.share.hatch.env.virtual.anndata.kX3YdB0h.anndata.lib.python3.11.site-packages.anndata' is not a package
____ ERROR collecting _core/raw.py ____
/home/phil/.local/share/hatch/env/virtual/anndata/kX3YdB0h/anndata/lib/python3.11/site-packages/anndata/_core/raw.py:9: in <module>
   from .aligned_mapping import AxisArrays
E   ImportError: cannot import name 'AxisArrays' from 'home.phil..local.share.hatch.env.virtual.anndata.kX3YdB0h.anndata.lib.python3.11.site-packages.anndata._core.aligned_mapping' (/home/phil/.local/share/hatch/env/virtual/anndata/kX3YdB0h/anndata/lib/python3.11/site-packages/anndata/_core/aligned_mapping.py)

The reproducer is by no means minimal yet, and I can make it smaller for sure, but maybe it’s helpful before i get to that already.

@bluetech
Copy link
Member

(First, consider editing the original post to point to the reproduction; the ImportPathMismatchError error stated there cannot happen in --import-mode=importlib so it's quite confusing).

Thanks for creating a reproduction. I tried it now. I was able to reproduce with pytest 7.4, but with pytest main it works (only if using venv name venv instead of .venv -- see below). I bisected the fix to commit 194a782 which is also included in 8.0.0rc1.

However, there's an issue I noticed: If the path contains dots, e.g. .venv and python3.11 in this case, then --import-mode=importlib fails like this:

Dot problem

__________________________________________________ ERROR collecting .venv/lib/python3.11/site-packages/anndata/__init__.py ___________________________________________________
.venv/lib/python3.11/site-packages/anndata/__init__.py:11: in <module>
    from ._core.anndata import AnnData
E   ValueError: Empty module name
________________________________________________ ERROR collecting .venv/lib/python3.11/site-packages/anndata/_core/access.py _________________________________________________
/usr/lib/python3.11/importlib/__init__.py:121: in import_module
    raise TypeError(msg.format(name))
E   TypeError: the 'package' argument is required to perform a relative import for '.venv.lib.python3.11.site-packages'
____________________________________________ ERROR collecting .venv/lib/python3.11/site-packages/anndata/_core/aligned_mapping.py ____________________________________________
/usr/lib/python3.11/importlib/__init__.py:121: in import_module
    raise TypeError(msg.format(name))
E   TypeError: the 'package' argument is required to perform a relative import for '.venv.lib.python3.11.site-packages'
________________________________________________ ERROR collecting .venv/lib/python3.11/site-packages/anndata/_core/anndata.py ________________________________________________
/usr/lib/python3.11/importlib/__init__.py:121: in import_module
    raise TypeError(msg.format(name))
E   TypeError: the 'package' argument is required to perform a relative import for '.venv.lib.python3.11.site-packages'
_________________________________________________ ERROR collecting .venv/lib/python3.11/site-packages/anndata/_core/merge.py _________________________________________________
/usr/lib/python3.11/importlib/__init__.py:121: in import_module
    raise TypeError(msg.format(name))
E   TypeError: the 'package' argument is required to perform a relative import for '.venv.lib.python3.11.site-packages'
_____________________________________________________ ERROR collecting .venv/lib/python3.11/site-packages/anndata/tests ______________________________________________________
/usr/lib/python3.11/importlib/__init__.py:121: in import_module
    raise TypeError(msg.format(name))
E   TypeError: the 'package' argument is required to perform a relative import for '.venv.lib.python3.11.site-packages.anndata.tests'

Python module names (individual components) cannot contain dots, and the --import-mode=importlib code does stuff like module_name.split(".").

I tried "escaping" the dots by replacing them with _dot_, which allows things to continue, however then it fails like this:

Exec module problem

self = <DoctestModule __init__.py>

    def collect(self) -> Iterable[DoctestItem]:
        import doctest
    
        class MockAwareDocTestFinder(doctest.DocTestFinder):
            """A hackish doctest finder that overrides stdlib internals to fix a stdlib bug.
    
            https://github.com/pytest-dev/pytest/issues/3456
            https://bugs.python.org/issue25532
            """
    
            def _find_lineno(self, obj, source_lines):
                """Doctest code does not take into account `@property`, this
                is a hackish way to fix it. https://bugs.python.org/issue17446
    
                Wrapped Doctests will need to be unwrapped so the correct
                line number is returned. This will be reported upstream. #8796
                """
                if isinstance(obj, property):
                    obj = getattr(obj, "fget", obj)
    
                if hasattr(obj, "__wrapped__"):
                    # Get the main obj in case of it being wrapped
                    obj = inspect.unwrap(obj)
    
                # Type ignored because this is a private function.
                return super()._find_lineno(  # type:ignore[misc]
                    obj,
                    source_lines,
                )
    
            def _find(
                self, tests, obj, name, module, source_lines, globs, seen
            ) -> None:
                if _is_mocked(obj):
                    return
                with _patch_unwrap_mock_aware():
                    # Type ignored because this is a private function.
                    super()._find(  # type:ignore[misc]
                        tests, obj, name, module, source_lines, globs, seen
                    )
    
            if sys.version_info < (3, 13):
    
                def _from_module(self, module, object):
                    """`cached_property` objects are never considered a part
                    of the 'current module'. As such they are skipped by doctest.
                    Here we override `_from_module` to check the underlying
                    function instead. https://github.com/python/cpython/issues/107995
                    """
                    if isinstance(object, functools.cached_property):
                        object = object.func
    
                    # Type ignored because this is a private function.
                    return super()._from_module(module, object)  # type: ignore[misc]
    
            else:  # pragma: no cover
                pass
    
        if self.path.name == "conftest.py":
            module = self.config.pluginmanager._importconftest(
                self.path,
                self.config.getoption("importmode"),
                rootpath=self.config.rootpath,
            )
        else:
            try:
>               module = import_path(
                    self.path,
                    root=self.config.rootpath,
                    mode=self.config.getoption("importmode"),
                )

../../src/pytest/src/_pytest/doctest.py:569: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

p = PosixPath('/home/ran/tmp/pytest-doctest-import-mismatch/.venv/lib/python3.11/site-packages/anndata/__init__.py')

    def import_path(
        p: Union[str, "os.PathLike[str]"],
        *,
        mode: Union[str, ImportMode] = ImportMode.prepend,
        root: Path,
    ) -> ModuleType:
        """Import and return a module from the given path, which can be a file (a module) or
        a directory (a package).
    
        The import mechanism used is controlled by the `mode` parameter:
    
        * `mode == ImportMode.prepend`: the directory containing the module (or package, taking
          `__init__.py` files into account) will be put at the *start* of `sys.path` before
          being imported with `importlib.import_module`.
    
        * `mode == ImportMode.append`: same as `prepend`, but the directory will be appended
          to the end of `sys.path`, if not already in `sys.path`.
    
        * `mode == ImportMode.importlib`: uses more fine control mechanisms provided by `importlib`
          to import the module, which avoids having to muck with `sys.path` at all. It effectively
          allows having same-named test modules in different places.
    
        :param root:
            Used as an anchor when mode == ImportMode.importlib to obtain
            a unique name for the module being imported so it can safely be stored
            into ``sys.modules``.
    
        :raises ImportPathMismatchError:
            If after importing the given `path` and the module `__file__`
            are different. Only raised in `prepend` and `append` modes.
        """
        mode = ImportMode(mode)
    
        path = Path(p)
    
        if not path.exists():
            raise ImportError(path)
    
        if mode is ImportMode.importlib:
            module_name = module_name_from_path(path, root)
            with contextlib.suppress(KeyError):
                return sys.modules[module_name]
    
            for meta_importer in sys.meta_path:
                spec = meta_importer.find_spec(module_name, [str(path.parent)])
                if spec is not None:
                    break
            else:
                spec = importlib.util.spec_from_file_location(module_name, str(path))
    
            if spec is None:
                raise ImportError(f"Can't find module {module_name} at location {path}")
            mod = importlib.util.module_from_spec(spec)
            sys.modules[module_name] = mod
>           spec.loader.exec_module(mod)  # type: ignore[union-attr]

../../src/pytest/src/_pytest/pathlib.py:540: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <_frozen_importlib_external.SourceFileLoader object at 0x7feac0a280d0>
module = <module '_dot_venv.lib.python3_dot_11.site-packages.anndata' from '/home/ran/tmp/pytest-doctest-import-mismatch/.venv/lib/python3.11/site-packages/anndata/__init__.py'>

>   ???

<frozen importlib._bootstrap_external>:940: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

f = <built-in function exec>
args = (<code object <module> at 0x7feac0c24ff0, file "/home/ran/tmp/pytest-doctest-import-mismatch/.venv/lib/python3.11/site...__file__': '/home/ran/tmp/pytest-doctest-import-mismatch/.venv/lib/python3.11/site-packages/anndata/__init__.py', ...})
kwds = {}

>   ???

<frozen importlib._bootstrap>:241: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

    """Annotated multivariate observation data."""
    from __future__ import annotations
    
    # Allowing notes to be added to exceptions. See: https://github.com/scverse/anndata/issues/868
    import sys
    
    if sys.version_info < (3, 11):
        # Backport package for exception groups
        import exceptiongroup  # noqa: F401
    
>   from ._core.anndata import AnnData
E   ModuleNotFoundError: No module named '_dot_venv'

.venv/lib/python3.11/site-packages/anndata/__init__.py:11: ModuleNotFoundError

This is possibly another problem with --import-mode=importlib; if I swap the order of the exec_module and the insert_missing_modules statements then it does work, but I'm not familiar enough the code to know if it's correct.

@nicoddemus Do you have any idea about this?

@flying-sheep flying-sheep changed the title ImportMismatchError when using pip install without -e, --doctest-modules, a src/ layout, and --import-mode=importlib ModuleNotFoundError when using pip install without -e, --doctest-modules, a src/ layout, and --import-mode=importlib Jan 15, 2024
@flying-sheep
Copy link
Contributor Author

First, consider editing the original post to point to the reproduction; the ImportPathMismatchError error stated there cannot happen in --import-mode=importlib so it's quite confusing

fixed

Thanks for creating a reproduction. I tried it now. I was able to reproduce with pytest 7.4, but with pytest main it works

Well, unless I’m missing something, the same package is still imported twice with this approach, right?

Which might cause problems similar to #11306 when the two versions try to access some shared resource. (e.g. let’s say they both open a database connection or write a file on import time – not super good style, but possible)

@flying-sheep
Copy link
Contributor Author

@nicoddemus any idea?

@flying-sheep flying-sheep changed the title ModuleNotFoundError when using pip install without -e, --doctest-modules, a src/ layout, and --import-mode=importlib ModuleNotFoundError when using pip install with --doctest-modules, a src/ layout, --import-mode=importlib, and without -e Feb 5, 2024
@flying-sheep flying-sheep changed the title ModuleNotFoundError when using pip install with --doctest-modules, a src/ layout, --import-mode=importlib, and without -e ModuleNotFoundError when using --doctest-modules, a src/ layout, and --import-mode=importlib (and no editable mode install) Feb 5, 2024
@flying-sheep
Copy link
Contributor Author

I think no “escaping” hackery is necessary. We run pytest --pyargs <modulename>, so the parent module of any doctest should start with <modulename> and not _dot_venv or opt.hostedtoolcache.Python*site-packages.

So how about we just infer the correct module name of the parent module using sys.path by enforcing that it matches the module name supplied using --pyargs?

@nicoddemus
Copy link
Member

nicoddemus commented Feb 10, 2024

Sorry for the long post, spent the last 3 hours working on this and I'm afraid there's no simple "fix" for this case.

The short answer is that pytest uses --import-mode=importlib for all imports, not only tests and conftests as originally intended.

The introduction of importlib mode was meant to give an alternative to the other modes so that:

  1. pytest does not need change sys.path to import tests and conftest files.
  2. Different test modules can have the same base name, without __init__.py files.

(Full explanation in the docs).

On a non-src layout:

pkg/__init__.py
tests/
  test_foo.py

One of the reasons why changing sys.path is problematic in this case is because it will add . to the root directory to import the tests under tests, which has the side effect of making pkg also importable. This can hide problems when trying to test the installed package in .venv, because now we are actually testing the local sources as they are reachable through sys.path during testing, instead of the installed packages in .venv.

On src layouts:

src/pkg/__init__.py
tests/
  test_foo.py

This is not a problem: adding . to sys.path does not make pkg importable, so import pkg will import from the .venv as desired.

The reproducer repository is using importlib mode, I'm guessing because of reason 2 (duplicated test names), or perhaps because at one point we recommended using importlib intending to turn it into the default, however since then we realized this is not possible, as it brings it has its own drawbacks.

@flying-sheep, by changing your configuration slightly I was able to make your reproduce run without problems in pytest main and in 7.4.4:

  • Exclude conftest.py files from being installed: they should be imported by pytest directly.
  • Drop importlib, as you are using src layout. If the problem is due to duplicated test names, just adding __init__.py on the folders with the duplicated tests solves it.
λ pytest
======================== test session starts ========================
platform win32 -- Python 3.11.5, pytest-7.4.4, pluggy-1.4.0
rootdir: e:\projects\pytest-doctest-import-mismatch
configfile: pyproject.toml
testpaths: anndata, ./src/anndata/tests
collected 6 items

.env311\Lib\site-packages\anndata\utils.py .                   [ 16%]
.env311\Lib\site-packages\anndata\_core\anndata.py ..          [ 50%]
.env311\Lib\site-packages\anndata\_core\merge.py ..            [ 83%]
src\anndata\tests\test_base.py .                               [100%]

========================= warnings summary ==========================

Here's the diff:

diff --git a/pyproject.toml b/pyproject.toml
index 81389c5..d15f40a 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -21,12 +21,12 @@ dependencies = [
 
 [tool.hatch.build]
 exclude = [
+    'src/anndata/tests/conftest.py',
     'src/anndata/tests/test_*.py',
 ]
 
 [tool.pytest.ini_options]
 addopts = [
-    '--import-mode=importlib',
     '--doctest-modules',
     '--pyargs',
 ]

Follows is a long description of the overall problem.


The introduction of importlib mode was meant to give an alternative to the other modes so that:

  1. pytest does not need change sys.path to import tests and conftest files.
  2. Different test modules can have the same base name, without __init__.py files.

(Full explanation in the docs).

One important thing in that description is *test modules and conftests.

Before we introduced the importlib mode, all modules directly imported by pytest used the same function, import_path, which changes sys.path to import a module in case it is not already importable. "All modules" in the majority means tests and conftests (as user application code is imported indirectly through these), but it also includes user application code in some cases, doctests being one case relevant here.

The importlib mode works by assuming it is importing a module that cannot be reached from sys.path through the normal means, a common example being the tests directory layout at the root of the package, where we have a test file tests/foo/bar/test_foo.py without any __init__.py files within. It is not importable because tests is not reachable from sys.path, and we also do not want to change it.

To import that file, we:

  1. Try to come up with a "module name" for it based on its full path and root, so for the file tests/foo/bar/test_foo.py we make up an non-existing module name "tests.foo.bar.test_foo". This module name will be unique, and not conflict with other test modules named test_foo.py in the same suite, as they will be in a different directory and will be assigned a different module name using the same mechanism ("tests.schema.test_foo" for example).

  2. Import the module using its full path using importlib.importmodule, and then place that module in sys.modules with the module_name we determined above.

This works well when used to import just test modules and conftests, but will create divergences when used import application code. In the reproducer repo, we are trying to import anndata using importlib mode, so it ends up being imported as a module named using the logic described above, however it is reachable through sys.path and we probably should import it using the normal mechanisms.

Probably one solution is to revisit all places where we manually import modules, and possibly use "importlib" (when configured) for tests+conftests or fallback to the other mechanisms for normal modules, but this feels extremely risky given the millions of test suites out there.

The importlib mode was intended to solve avoid changing sys.path/"ImportMismatch" errors, but particularly if I knew all the problems/details it involves, I would just recommend other solutions:

  1. pytest changes sys.path and it is testing the local copy? Switch to src layout.
  2. Test module names with the same? Add __init__.py files in your tests turning them into packages.

But here we are.

I'm not sure how to proceed here. Perhaps we can right away add a warning on --doctest-modules that it might be problematic to use together with --import-mode=importlib.

I will spend some more time trying to see if there's a simple and safe solution to this general mess.

@flying-sheep
Copy link
Contributor Author

flying-sheep commented Feb 12, 2024

Maybe I missed it when reading your summary, but isn’t my PR a simple and safe solution?

It gets rid of this iffy strategy:

Try to come up with a "module name" for it based on its full path and root, so for the file tests/foo/bar/test_foo.py we make up an non-existing module name "tests.foo.bar.test_foo".

whenever some module does live below one of the sys.path entries.

@nicoddemus
Copy link
Member

I will double check, but @bluetech's concern is valid: search through sys.path entries can be very costly, specially in environment configurations where modules are reachable by configuring PYTHONPATH -- at work we use this, and our sys.path entry will contain dozens of entries.

It does not invalidate it completely of course, but it is something we should consider.

As I said, I will double check it later. 👍

@flying-sheep
Copy link
Contributor Author

flying-sheep commented Feb 13, 2024

I don‘t think it invalidates it in the slightest before someone has measured the impact 😉

Also keep in mind that I outlined some strategies to make it faster.

I don‘t think we should even consider sacrificing correctness before having tried out all other sensible avenues.
Importing modules twice can lead to very subtle and hard to debug breakage, so I feel like everyone waiting 5 seconds more for the test suite would be an amazing trade-off compared to a handful of people debugging some weird double-import bug for 3 hours.

@nicoddemus
Copy link
Member

I don‘t think we should even consider sacrificing correctness before having tried out all other sensible avenues.

Oh I agree 100%, I didn't mean to imply that we were not applying your fix based solely on any performance problems that we have a gut feeling about -- I am just thinking there might be some other solution.

One such solution that occurred to me, and I plan to test, is that when using importlib, we first try to import the module normally, without altering sys.path. If the module ends up being importable, then we just use it; if not, then we fallback with the current importlib shenaningans. If I'm right, this would probably be the simplest and a good solution still aligned with importlib's goal: not changing sys.path and allow duplicated module names.

@nicoddemus
Copy link
Member

I plan to spend more time this week on this, and solve this once and for all (with your fix or something else).

@flying-sheep
Copy link
Contributor Author

flying-sheep commented Feb 15, 2024

that when using importlib, we first try to import the module normally, without altering sys.path.

I’m confused, I thought importlib’s point is that it doesn’t alter sys.path?

I plan to spend more time this week on this, and solve this once and for all (with your fix or something else).

Another possibility is to just be able to make PyTest aware of the module root(s) the project has, e.g. src/. Then that would be used for imports instead of the project root directory.

@nicoddemus
Copy link
Member

I’m confused, I thought importlib’s point is that it doesn’t alter sys.path?

Yep, what I mean is just try to import the module, without changing sys.path: in case we are trying to import a normal module which is installed in the virtualenv, this would just work.

Another possibility is to just be able to make PyTest aware of the module root(s) the project has, e.g. src/. Then that would be used for imports instead of the project root directory.

That would be another solution too.

nicoddemus added a commit to nicoddemus/pytest that referenced this issue Feb 17, 2024
…chanism

As detailed in pytest-dev#11475 (comment), currently with `--import-mode=importlib` pytest will try to import every file by using a unique module name, regardless if that module could be imported using the normal import mechanism without touching `sys.path`.

This has the consequence that non-test modules available in `sys.path` (via other mechanism, such as being installed into a virtualenv, PYTHONPATH, etc) would end up being imported as standalone modules, instead of imported with their expected module names.

To illustrate:

```
.env/
  lib/
    site-packages/
      anndata/
        core.py
```

Given `anndata` is installed into the virtual environment, `python -c "import anndata.core"` works, but pytest with `importlib` mode would import that module as a standalone module named `".env.lib.site-packages.anndata.core"`, because importlib module was designed to import test files which are not reachable from `sys.path`, but now it is clear that normal modules should be imported using the standard mechanisms if possible.

Now `imporlib` mode will first try to import the module normally, without changing `sys.path`, and if that fails it falls back to importing the module as a standalone module.

This supersedes pytest-dev#11931.

Fix pytest-dev#11475
Close pytest-dev#11931
@nicoddemus
Copy link
Member

Proposing a definite solution (IMHO) in #11997. 👍

nicoddemus added a commit to nicoddemus/pytest that referenced this issue Feb 18, 2024
…chanism

As detailed in pytest-dev#11475 (comment), currently with `--import-mode=importlib` pytest will try to import every file by using a unique module name, regardless if that module could be imported using the normal import mechanism without touching `sys.path`.

This has the consequence that non-test modules available in `sys.path` (via other mechanism, such as being installed into a virtualenv, PYTHONPATH, etc) would end up being imported as standalone modules, instead of imported with their expected module names.

To illustrate:

```
.env/
  lib/
    site-packages/
      anndata/
        core.py
```

Given `anndata` is installed into the virtual environment, `python -c "import anndata.core"` works, but pytest with `importlib` mode would import that module as a standalone module named `".env.lib.site-packages.anndata.core"`, because importlib module was designed to import test files which are not reachable from `sys.path`, but now it is clear that normal modules should be imported using the standard mechanisms if possible.

Now `imporlib` mode will first try to import the module normally, without changing `sys.path`, and if that fails it falls back to importing the module as a standalone module.

This supersedes pytest-dev#11931.

Fix pytest-dev#11475
Close pytest-dev#11931
nicoddemus added a commit to nicoddemus/pytest that referenced this issue Feb 25, 2024
…chanism

As detailed in pytest-dev#11475 (comment), currently with `--import-mode=importlib` pytest will try to import every file by using a unique module name, regardless if that module could be imported using the normal import mechanism without touching `sys.path`.

This has the consequence that non-test modules available in `sys.path` (via other mechanism, such as being installed into a virtualenv, PYTHONPATH, etc) would end up being imported as standalone modules, instead of imported with their expected module names.

To illustrate:

```
.env/
  lib/
    site-packages/
      anndata/
        core.py
```

Given `anndata` is installed into the virtual environment, `python -c "import anndata.core"` works, but pytest with `importlib` mode would import that module as a standalone module named `".env.lib.site-packages.anndata.core"`, because importlib module was designed to import test files which are not reachable from `sys.path`, but now it is clear that normal modules should be imported using the standard mechanisms if possible.

Now `imporlib` mode will first try to import the module normally, without changing `sys.path`, and if that fails it falls back to importing the module as a standalone module.

This supersedes pytest-dev#11931.

Fix pytest-dev#11475
Close pytest-dev#11931
nicoddemus added a commit to nicoddemus/pytest that referenced this issue Feb 25, 2024
…chanism

As detailed in pytest-dev#11475 (comment), currently with `--import-mode=importlib` pytest will try to import every file by using a unique module name, regardless if that module could be imported using the normal import mechanism without touching `sys.path`.

This has the consequence that non-test modules available in `sys.path` (via other mechanism, such as being installed into a virtualenv, PYTHONPATH, etc) would end up being imported as standalone modules, instead of imported with their expected module names.

To illustrate:

```
.env/
  lib/
    site-packages/
      anndata/
        core.py
```

Given `anndata` is installed into the virtual environment, `python -c "import anndata.core"` works, but pytest with `importlib` mode would import that module as a standalone module named `".env.lib.site-packages.anndata.core"`, because importlib module was designed to import test files which are not reachable from `sys.path`, but now it is clear that normal modules should be imported using the standard mechanisms if possible.

Now `imporlib` mode will first try to import the module normally, without changing `sys.path`, and if that fails it falls back to importing the module as a standalone module.

This supersedes pytest-dev#11931.

Fix pytest-dev#11475
Close pytest-dev#11931
nicoddemus added a commit to nicoddemus/pytest that referenced this issue Feb 25, 2024
…chanism

As detailed in pytest-dev#11475 (comment), currently with `--import-mode=importlib` pytest will try to import every file by using a unique module name, regardless if that module could be imported using the normal import mechanism without touching `sys.path`.

This has the consequence that non-test modules available in `sys.path` (via other mechanism, such as being installed into a virtualenv, PYTHONPATH, etc) would end up being imported as standalone modules, instead of imported with their expected module names.

To illustrate:

```
.env/
  lib/
    site-packages/
      anndata/
        core.py
```

Given `anndata` is installed into the virtual environment, `python -c "import anndata.core"` works, but pytest with `importlib` mode would import that module as a standalone module named `".env.lib.site-packages.anndata.core"`, because importlib module was designed to import test files which are not reachable from `sys.path`, but now it is clear that normal modules should be imported using the standard mechanisms if possible.

Now `imporlib` mode will first try to import the module normally, without changing `sys.path`, and if that fails it falls back to importing the module as a standalone module.

This supersedes pytest-dev#11931.

Fix pytest-dev#11475
Close pytest-dev#11931
bluetech pushed a commit to bluetech/pytest that referenced this issue Feb 27, 2024
…chanism

As detailed in
pytest-dev#11475 (comment),
currently with `--import-mode=importlib` pytest will try to import every
file by using a unique module name, regardless if that module could be
imported using the normal import mechanism without touching `sys.path`.

This has the consequence that non-test modules available in `sys.path`
(via other mechanism, such as being installed into a virtualenv,
PYTHONPATH, etc) would end up being imported as standalone modules,
instead of imported with their expected module names.

To illustrate:

```
.env/
  lib/
    site-packages/
      anndata/
        core.py
```

Given `anndata` is installed into the virtual environment, `python -c
"import anndata.core"` works, but pytest with `importlib` mode would
import that module as a standalone module named
`".env.lib.site-packages.anndata.core"`, because importlib module was
designed to import test files which are not reachable from `sys.path`,
but now it is clear that normal modules should be imported using the
standard mechanisms if possible.

Now `imporlib` mode will first try to import the module normally,
without changing `sys.path`, and if that fails it falls back to
importing the module as a standalone module.

This also makes `importlib` respect namespace packages.

This supersedes pytest-dev#11931.

Fix pytest-dev#11475
Close pytest-dev#11931
nicoddemus added a commit to nicoddemus/pytest that referenced this issue Mar 2, 2024
…chanism

As detailed in
pytest-dev#11475 (comment),
currently with `--import-mode=importlib` pytest will try to import every
file by using a unique module name, regardless if that module could be
imported using the normal import mechanism without touching `sys.path`.

This has the consequence that non-test modules available in `sys.path`
(via other mechanism, such as being installed into a virtualenv,
PYTHONPATH, etc) would end up being imported as standalone modules,
instead of imported with their expected module names.

To illustrate:

```
.env/
  lib/
    site-packages/
      anndata/
        core.py
```

Given `anndata` is installed into the virtual environment, `python -c
"import anndata.core"` works, but pytest with `importlib` mode would
import that module as a standalone module named
`".env.lib.site-packages.anndata.core"`, because importlib module was
designed to import test files which are not reachable from `sys.path`,
but now it is clear that normal modules should be imported using the
standard mechanisms if possible.

Now `imporlib` mode will first try to import the module normally,
without changing `sys.path`, and if that fails it falls back to
importing the module as a standalone module.

This also makes `importlib` respect namespace packages.

This supersedes pytest-dev#11931.

Fix pytest-dev#11475
Close pytest-dev#11931
nicoddemus added a commit to nicoddemus/pytest that referenced this issue Mar 2, 2024
nicoddemus added a commit to nicoddemus/pytest that referenced this issue Mar 2, 2024
…chanism

As detailed in
pytest-dev#11475 (comment),
currently with `--import-mode=importlib` pytest will try to import every
file by using a unique module name, regardless if that module could be
imported using the normal import mechanism without touching `sys.path`.

This has the consequence that non-test modules available in `sys.path`
(via other mechanism, such as being installed into a virtualenv,
PYTHONPATH, etc) would end up being imported as standalone modules,
instead of imported with their expected module names.

To illustrate:

```
.env/
  lib/
    site-packages/
      anndata/
        core.py
```

Given `anndata` is installed into the virtual environment, `python -c
"import anndata.core"` works, but pytest with `importlib` mode would
import that module as a standalone module named
`".env.lib.site-packages.anndata.core"`, because importlib module was
designed to import test files which are not reachable from `sys.path`,
but now it is clear that normal modules should be imported using the
standard mechanisms if possible.

Now `imporlib` mode will first try to import the module normally,
without changing `sys.path`, and if that fails it falls back to
importing the module as a standalone module.

This also makes `importlib` respect namespace packages.

This supersedes pytest-dev#11931.

Fix pytest-dev#11475
Close pytest-dev#11931
nicoddemus added a commit to nicoddemus/pytest that referenced this issue Mar 2, 2024
@flying-sheep
Copy link
Contributor Author

flying-sheep commented Mar 4, 2024

Sadly the changes you made to the PR after I checked with my project broke it again. I should have checked again …

Somehow if running with --doctest-modules and --import-mode=importlib, now (some of the?) modules and classes imported both during doctests and unit tests have the same qualified names and import paths, but are not the same object.

So e.g.

  • pickling fails, since the class in the tests isn‘t identical to a pickled object’s class (same name same code, different object id I guess)
  • warning filters fail, since they’re installed for one version of the warning class and not the other,
  • imports fail, because one version of the import doesn’t have a field while another has

you can see some of the issues here: https://dev.azure.com/scverse/anndata/_build/results?buildId=5905&view=logs&j=5ea502cf-d418-510c-3b5f-c4ba606ae534

But yeah, I’m pretty sure that the root cause is that somehow things end up being re-imported that should be in the module cache.

Any idea what could be causing that?

I’m happy to work on a fix myself, but it would be great to have a pointer.

@bluetech
Copy link
Member

bluetech commented Mar 4, 2024

I haven't looked at your failures but IIRC you are using namespace packages, did you set the new consider_namespace_packages = true ini setting?

@nicoddemus
Copy link
Member

nicoddemus commented Mar 4, 2024

Argh, really sorry to hear that @flying-sheep. I did test the fix in your reproducible example before merging, but seems there is still something to adjust.

Looking at the code I suspect there might be a bug there indeed, seems we should be checking if module_name already is in sys.modules before attempting to import it here:

mod = _import_module_using_spec(
module_name, path, pkg_root, insert_modules=False
)

Like it is being done here:

module_name = module_name_from_path(path, root)
with contextlib.suppress(KeyError):
return sys.modules[module_name]

I will check this tonight.

Sorry again for the breakage. 😕

@flying-sheep
Copy link
Contributor Author

flying-sheep commented Mar 4, 2024

I’m super grateful that you’re working on getting this resolved.

I’m certain once this bug is fixed, this will make using pytest a blast in all kinds of (actual and not-so-)corner cases.

I’ll test the fix with our projects, which should be complex-enough real-world examples to prove that it works in a lot of cases.

E.g. one easy reproducer is

git clone https://github.com/scverse/scanpy.git
cd scanpy
git checkout 14555ba48537995acaa381b8b6ad5fc41e612510  # current main
python -m venv .venv
.venv/bin/pip install .[test] pytest==8.1.0
.venv/bin/pytest scanpy/_utils/compute/is_constant.py

nicoddemus added a commit to nicoddemus/pytest that referenced this issue Mar 4, 2024
nicoddemus added a commit that referenced this issue Mar 4, 2024
aaraney added a commit to aaraney/DMOD that referenced this issue Mar 11, 2024
christophertubbs pushed a commit to NOAA-OWP/DMOD that referenced this issue Mar 11, 2024
@twmr
Copy link
Contributor

twmr commented Mar 11, 2024

Note that the documentation of this feature mentions that the default value of consider_namespace_packages is False. However, if import_path is called without consider_namespace_packages, an exception is raised. Shouldn't the documentation be updated or was the it forgotten to specify defaults in the public api?

@nicoddemus
Copy link
Member

The documentation is talking about the consider_namespace_packages option, not the parameter to import_path, which is not part of the public API btw.

flying-sheep pushed a commit to flying-sheep/pytest that referenced this issue Apr 9, 2024
…chanism

As detailed in
pytest-dev#11475 (comment),
currently with `--import-mode=importlib` pytest will try to import every
file by using a unique module name, regardless if that module could be
imported using the normal import mechanism without touching `sys.path`.

This has the consequence that non-test modules available in `sys.path`
(via other mechanism, such as being installed into a virtualenv,
PYTHONPATH, etc) would end up being imported as standalone modules,
instead of imported with their expected module names.

To illustrate:

```
.env/
  lib/
    site-packages/
      anndata/
        core.py
```

Given `anndata` is installed into the virtual environment, `python -c
"import anndata.core"` works, but pytest with `importlib` mode would
import that module as a standalone module named
`".env.lib.site-packages.anndata.core"`, because importlib module was
designed to import test files which are not reachable from `sys.path`,
but now it is clear that normal modules should be imported using the
standard mechanisms if possible.

Now `imporlib` mode will first try to import the module normally,
without changing `sys.path`, and if that fails it falls back to
importing the module as a standalone module.

This also makes `importlib` respect namespace packages.

This supersedes pytest-dev#11931.

Fix pytest-dev#11475
Close pytest-dev#11931
flying-sheep pushed a commit to flying-sheep/pytest that referenced this issue Apr 9, 2024
flying-sheep pushed a commit to flying-sheep/pytest that referenced this issue Apr 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic: collection related to the collection phase
Projects
None yet
6 participants