
Seemingly random CI failures of tests/test_intl.py::test_gettext_dont_rebuild_mo with Windows #11232

Closed
jfbu opened this issue Mar 10, 2023 · 15 comments · Fixed by #11435

@jfbu
Contributor

jfbu commented Mar 10, 2023

Describe the bug

I am not competent enough to understand these seemingly occasional failures:

FAILED tests/test_intl.py::test_gettext_dont_rebuild_mo - AssertionError: assert 1 == 0
 +  where 1 = <function test_gettext_dont_rebuild_mo.<locals>.get_number_of_update_targets at 0x0000022947B2A8E0>(<SphinxTestApp buildername='dummy'>)
= 1 failed, 1787 passed, 23 skipped, 5 xfailed, 28 xpassed, 6 warnings in 271.50s (0:04:31) =
Error: Process completed with exit code 1.

which I have observed a number of times recently either on PRs or direct pushes to master.

How to Reproduce

See https://github.com/sphinx-doc/sphinx/actions/runs/4383723023/jobs/7674313285 or earlier failed test of #11224 prior to merge to master: https://github.com/sphinx-doc/sphinx/actions/runs/4354037120/jobs/7608843307

As this seems random, I have no recipe for reproducing it.

Environment Information

not relevant

Sphinx extensions

No response

Additional context

No response

@jayaddison
Contributor

Yep, I've noticed this occur a few times too. Most recently here: https://github.com/sphinx-doc/sphinx/actions/runs/5006348520/jobs/8971510673?pr=11426#step:5:1930

(taking a brief look at the relevant tests/code)

@jayaddison
Contributor

So:

  • The test uses the 'dummy' builder, which does not emit any output (great, that helps narrow down the cause)
  • The most recent assertion failure occurred during phase1 of the test, before any files are removed and recreated (also good in terms of narrowing the cause)
  • The assertion is on the number of modified/changed files according to env.get_outdated_files
  • There are six locations within that method that call changed.add(...) -- these are the potential source(s)
    • I think that the first two of those can be ignored because the test uses the dummy builder and doesn't do anything to configure reread_always
    • Another one can be ignored, again because the builder is not writing to the target directory.

The remaining three call sites are two mtime-related modification checks and one OSError handler case.

I think it's unlikely that the OSError case is the cause (although it might be better to handle it by adding the relevant files to a new, fourth output tuple entry -- unknown -- rather than bundling them into changed).

In summary:

My best guess at the moment is that we're running into a filesystem timestamp consistency issue on some of the runners:

# store time of reading, for outdated files detection
# (Some filesystems have coarse timestamp resolution;
# therefore time.time() can be older than filesystem's timestamp.
# For example, FAT32 has 2sec timestamp resolution.)
self.env.all_docs[docname] = max(time.time(),
                                 path.getmtime(self.env.doc2path(docname)))

And in terms of how to fix that, for test stability purposes:

Not sure yet.
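
For reference, the mtime checks being discussed reduce to something like the following simplified sketch (a paraphrase for discussion, not the actual get_outdated_files implementation):

import os.path as path

# Simplified sketch: the two mtime-based checks that can put a document
# into `changed` -- the source-file check and the dependency check.
def is_outdated(docname, srcfile, dependencies, all_docs):
    mtime = all_docs[docname]              # time the document was last read
    try:
        newmtime = path.getmtime(srcfile)
        if newmtime > mtime:               # source changed since last read
            return True
        for dep in dependencies:
            depmtime = path.getmtime(dep)
            if depmtime > mtime:           # a dependency (e.g. a .mo file) changed
                return True
    except OSError:
        return True                        # an unreadable file is treated as changed
    return False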

@jayaddison
Contributor

It's tricky without having a way to replicate the problem reliably, but I'd like to try reverting the ... = max(time.time(), path.getmtime(...)) logic (718993a) and going back to the simpler ... = time.time() logic that preceded it. I'm repeat-CI-testing that in a bundle of changes in jayaddison#3 at the moment.

I've read one or two bugs on the CPython bugtracker that seem to indicate that Windows timestamp granularity had some issues and has been improved since 718993a was introduced, and I also think it's probably relatively rare for people to be building Sphinx documentation on low-timestamp-granularity filesystems (particularly FAT32). So I'm thinking of it as a potential tradeoff between unit test stability and (perhaps) some rare edge cases.

@jayaddison
Contributor

It's tricky without having a way to replicate the problem reliably, but I'd like to try reverting the ... = max(time.time(), path.getmtime(...)) logic (718993a) and going back to the simpler ... = time.time() logic that preceded it. I'm repeat-CI-testing that in a bundle of changes in jayaddison#3 at the moment.

Update: reverting 718993a does not fix the problem. For a situation where that commit is reverted and yet the test still fails (after a few previous successful test results), see here: https://github.com/jayaddison/sphinx/actions/runs/5038386691/jobs/9035841524?pr=3#step:5:1931

@jayaddison
Contributor

I've added some debug printout code, and here is the output from it during a recent unit test run where the failure occurred:

===================
docname:  index
===================
mtime:    1684687126.3444436
newmtime: 1684686899.1642947
changed?  False

===================
docname:  bom
===================
mtime:    1684687126.3444436
newmtime: 1684686899.1642947
changed?  False

-------------------
deppath:  C:\Users\runneradmin\AppData\Local\Temp\pytest-of-runneradmin\pytest-0\builder-gettext-dont-rebuild-mo\.\xx\LC_MESSAGES\bom.mo
-------------------
mtime:    1684687126.3444436
depmtime: 1684687126.3444438
changed?  True
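
Note that mtime and depmtime in the last block differ only in the final digit, roughly one representable double-precision step at this magnitude. A minimal, hypothetical illustration of how that flips the comparison:

# Hypothetical illustration only: values copied from the debug output above.
stored = 1684687126.3444436   # read time stored in env.all_docs
dep    = 1684687126.3444438   # mtime read back for the bom.mo dependency
print(dep > stored)           # True -> the .mo file is counted as changed
print(dep - stored)           # ~2.4e-07 seconds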

jayaddison added a commit to jayaddison/sphinx that referenced this issue May 21, 2023
…s seen for sphinx-doc#11232) and run a broader matrix of Python versions
@jayaddison
Contributor

This is based on a small sample size so far, but switching to use integer nanosecond-precision timestamps instead of the existing floating-point second-precision timestamps may be helping.

If that remains true after a few more rounds of testing, then I'll open a fix PR for this issue, containing the revert of 718993a and the nanosecond-timestamp switchover.
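
For concreteness, the nanosecond approach could look roughly like this (a sketch under the assumption that os.stat(...).st_mtime_ns is used; the actual change may differ):

import os

# st_mtime_ns is an integer number of nanoseconds, so comparisons become exact
# integer arithmetic rather than float comparisons with rounding artifacts.
def mtime_ns(filename: str) -> int:
    return os.stat(filename).st_mtime_ns

# e.g. the dependency check would then become something like:
# changed = mtime_ns(deppath) > stored_read_time_ns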

jayaddison added a commit to jayaddison/sphinx that referenced this issue May 21, 2023
…iness was seen for sphinx-doc#11232) and run a broader matrix of Python versions"

This reverts commit 647df0a.

@AA-Turner AA-Turner reopened this Jul 22, 2023
@jayaddison
Contributor

Because the additional informational logging added for this test failure points to the byte-order-mark (bom) file as a source of the problem, I'm wondering about these unit test code lines that rewrite that file under some circumstances.

Perhaps adding further information to indicate whether that test code logic is evaluated could confirm whether it has a causal relationship with the test failures.

@jayaddison
Contributor

What do you think about adding a debug print between these lines to determine whether that's a potential cause, @AA-Turner? (I've drafted a commit for that at jayaddison@5924a4d )
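
For illustration, the print could be as simple as the following (hypothetical; the path here is a stand-in, and the drafted commit linked above is authoritative):

from pathlib import Path

# Stand-in path for illustration; in the test this would be the catalog
# about to be rewritten.
mo_path = Path('xx/LC_MESSAGES/bom.mo')
print(f'# compiling .mo file {mo_path}')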

@jayaddison
Contributor

Taking another look at this currently; I'm going to repeat-run CI on my fork with 5924a4d in place to see whether that helps to track anything down.

jayaddison added a commit to jayaddison/sphinx that referenced this issue Sep 14, 2023
@jayaddison
Contributor

This build job failed and the additional output was:

# compiling .mo file C:\Users\runneradmin\AppData\Local\Temp\pytest-of-runneradmin\pytest-0\builder-gettext-dont-rebuild-mo\xx\LC_MESSAGES\xx\LC_MESSAGES\bom.mo

I'm not sure yet whether that helps to track this down. I'm trying to refresh my context on how all this works.

@jayaddison
Contributor

Does anyone have any thoughts on whether we could/should mark this test as xfail on non-posix platforms and reinvestigate at a later date?

@AA-Turner
Member

We have strict xfail turned on, so that would introduce stochastic failures in the other direction, unhelpfully.

A

@jayaddison
Contributor

Ok - although that only changes the default, I think - so we could specify strict=False in the marker.

I'll admit this is likely the kind of bug/issue that, once hidden, might not really be fixed later (or not until much, much later at least): subtle, possibly complicated, potentially time-consuming to track down, and seemingly low-impact for production usage (although arguably important for continuous integration reporting).
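
For concreteness, such a marker might look roughly like this (a sketch only; the condition, reason text and function signature are illustrative):

import sys
import pytest

@pytest.mark.xfail(
    sys.platform == 'win32',
    reason='flaky mtime comparison on Windows CI runners (#11232)',
    strict=False,   # a pass on Windows would not fail the run
)
def test_gettext_dont_rebuild_mo(app):   # signature abbreviated
    ...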

@picnixz
Member

picnixz commented Feb 24, 2024

Fixed in #11940

@picnixz picnixz closed this as completed Feb 24, 2024