Test failures in predeps job (pre-release dependencies) #15807

Closed
pllim opened this issue Jan 3, 2024 · 6 comments · Fixed by #15809 or #15818

Comments

@pllim
Member

pllim commented Jan 3, 2024

Example log: https://github.com/astropy/astropy/actions/runs/7392979516/job/20112234044

Things in RC:

  • pandas 2.2.0rc0
  • scipy 1.12.0rc1
  • pytest 8.0.0rc1

Since we test against scipy-dev and pytest-dev (in different jobs, though), the likely culprit is pandas, but we might also have overlooked something that snuck into one of the dev versions during the holidays.

  1. io/fits/tests/test_util.py:27 with Failed: DID NOT WARN.
  2. table/tests/test_mixin.py:170 with AstropyUserWarning: Earth Location "TOPOCENTER" for Time Column "atcb" is incompatible with scale "TCB". (2 similar failures)
  3. timeseries/io/tests/test_kepler.py:107 with UnitsWarning: 'BJD - 2457000, days' did not parse as fits unit
  4. timeseries/tests/test_sampled.py:399 with UnitsWarning: 'BJD - 2457000, days' did not parse as fits unit
  5. utils/tests/test_data.py:1101 with CacheMissingWarning: Remote data cache could not be accessed due to OSError

Some of these also look suspicious. I wonder if there was a remote network failure and these are transient, but I restarted the job and all the failures remain. Whoever wants to investigate will have to install pre-release(s) of the suspected package(s) and see if they can reproduce the errors.

@neutrinoceros
Contributor

neutrinoceros commented Jan 4, 2024

I think most, if not all, of these are caused by an intentional change in pytest 8.0.0rc1. It's listed in https://docs.pytest.org/en/latest/changelog.html#other-breaking-changes, and I'll quote the relevant line for reference:

#9288: warns() now re-emits unmatched warnings when the context closes – previously it would consume all warnings, hiding those that were not matched by the function.

In other words, tests where pytest.warns was used to capture a single warning pass on pytest < 8 as long as the matched warning is emitted, because any other warnings are silently swallowed. Since we treat warnings as errors, pytest 8 re-emitting the previously hidden warnings results in new failures (see the sketch at the end of this comment). What to do about these tests should probably be discussed on a case-by-case basis:

  1. Should the other warnings be matched as well?
  2. Should their emission be considered buggy?

I don't have the required expertise to guess what's preferable in all cases, but I'm going to go with option 1, which is less work, and open a PR to start the discussions.
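
A minimal sketch of the behaviour change (a hypothetical test, not one from the astropy suite, assuming a filterwarnings = error configuration so that re-emitted warnings fail the run):

```python
import warnings

import pytest


def emit_two_warnings():
    warnings.warn("expected", UserWarning)
    warnings.warn("extra", DeprecationWarning)


def test_single_match():
    # Passes on pytest < 8: pytest.warns consumes *every* warning raised
    # inside the block, so the unmatched DeprecationWarning is swallowed.
    # On pytest >= 8 it is re-emitted when the context closes and, under
    # filterwarnings = error, becomes a test failure.
    with pytest.warns(UserWarning, match="expected"):
        emit_two_warnings()


def test_match_everything():
    # Option 1 above: capture all the warnings and assert on the full set.
    with pytest.warns() as record:
        emit_two_warnings()
    assert len(record) == 2
```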

@neutrinoceros
Contributor

Ah, actually, I'm able to reproduce this error by running the exact tox command locally (tox -v -e py311-test-alldeps-predeps -- --remote-data=any --lf), though the test still passes in my default dev env (which has pytest 8.0.0rc1).
I've been scratching my head over it for the better part of an hour, but so far I haven't found the significant difference between my two envs (there are about 100 PyPI packages in each, so differences are legion). Interestingly, the failing test itself only relies on the stdlib and pytest, so it would seem natural to suspect that pytest or some plugin is responsible for the change in behaviour.
Despite their many differences, my two envs have the exact same versions of pytest itself and all pytest plugins (including hypothesis), namely:

hypothesis==6.92.2
pytest==8.0.0rc1
pytest-arraydiff==0.6.1
pytest-astropy==0.11.0
pytest-astropy-header==0.2.2
pytest-cov==4.1.0
pytest-doctestplus==1.1.0
pytest-filter-subpackage==0.1.2
pytest-mock==3.12.0
pytest-remotedata==0.4.1
pytest-xdist==3.5.0

So right now I'm out of ideas. For now I'll let it rest, hoping we (I?) can think of a smart way to systematically look for the relevant difference; otherwise it might take a while.
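
One blunt way to look for the difference systematically (a sketch, not something tried in this thread) would be to dump every installed distribution in each environment and diff the two dumps:

```python
# Hypothetical helper: run once in each environment, redirecting stdout
# to a file, then diff the two files.
from importlib import metadata

for dist in sorted(metadata.distributions(),
                   key=lambda d: d.metadata["Name"].lower()):
    print(f"{dist.metadata['Name']}=={dist.version}")
```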

@pllim
Member Author

pllim commented Jan 5, 2024

I am guessing you are talking about the remaining failure in #15809 in astropy/io/fits/tests/test_util.py w.r.t. pytest.raises(KeyboardInterrupt, test)? I will have a look when I get a chance. Thanks for the update!

@neutrinoceros
Contributor

That's right! Sorry, I meant to comment on the PR but apparently got mixed up.

@saimn
Contributor

saimn commented Jan 8, 2024

About the test_ignore_sigint issue: it happens when the fsspec tests are run first (so only with optional deps and --remote-data), because ignore_sigint captures the interrupt signal only when there is a single thread, and fsspec creates two threads that remain active after its tests have run:

astropy/io/fits/tests/test_util.py::TestUtils::test_ignore_sigint current: <_MainThread(MainThread, started 139810558670656)>
single: False
active: 3
[<_MainThread(MainThread, started 139810558670656)>, <Thread(fsspecIO, started daemon 139810347914944)>, <Thread(asyncio_0, started daemon 139810389862080)>]
FAILED

The easy fix would be to run the test only when there is a single thread (see the sketch below).
We could also consider removing ignore_sigint; it seems odd that it is used only when updating a file and not when writing a new one, but I don't know why it was added initially (a long time ago, spacetelescope/PyFITS@1b674c7).
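
A sketch of what that easy fix could look like (the guard is my assumption about how the test might be skipped; the actual test body is elided):

```python
import threading

import pytest


def test_ignore_sigint():
    # ignore_sigint only installs its SIGINT handler in a single-threaded
    # process, so skip when leftover daemon threads (e.g. fsspec's IO and
    # asyncio threads) are still alive.
    if threading.active_count() > 1:
        pytest.skip("background threads present; ignore_sigint is a no-op")
    ...  # exercise ignore_sigint as before
```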

@pllim
Member Author

pllim commented Jan 8, 2024

Made by @chanley 17 years ago... I wonder if he remembers why.
