
flaky unit test: test_build_linkcheck.test_linkcheck_exclude_documents #12188

Closed
jayaddison opened this issue Mar 23, 2024 · 7 comments
Describe the bug

This unit test failed recently during a continuous integration test job.

FAILED tests/test_builders/test_build_linkcheck.py::test_linkcheck_exclude_documents - AssertionError: assert [{'filename': 'br0ken_link.rst', 'lineno': 4, 'status': 'ignored', 'code': 0, 'uri': 'https://www.sphinx-doc.org/this-is-another-broken-link', 'info': 'br0ken_link matched br[0-9]ken_link from linkcheck_exclude_documents'}, {'filename': 'broken_link.rst', 'lineno': 4, 'status': 'ignored', 'code': 0, 'uri': 'https://www.sphinx-doc.org/this-is-a-broken-link', 'info': 'broken_link matched ^broken_link$ from linkcheck_exclude_documents'}] == [{'filename': 'broken_link.rst', 'lineno': 4, 'status': 'ignored', 'code': 0, 'uri': 'https://www.sphinx-doc.org/this-is-a-broken-link', 'info': 'broken_link matched ^broken_link$ from linkcheck_exclude_documents'}, {'filename': 'br0ken_link.rst', 'lineno': 4, 'status': 'ignored', 'code': 0, 'uri': 'https://www.sphinx-doc.org/this-is-another-broken-link', 'info': 'br0ken_link matched br[0-9]ken_link from linkcheck_exclude_documents'}]

The cause appears to be that the test expects the link checker to emit results in a specific order. The linkcheck builder uses multiple threads to perform link checking, so the order of results is not guaranteed.
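One way to make such an assertion robust (a sketch only, not necessarily the fix adopted upstream) is to compare the two result lists order-independently, e.g. by sorting both on a stable key before asserting equality. A minimal sketch, using hypothetical data modelled on the failing assertion above:

```python
# Hypothetical linkcheck results, trimmed to two keys for brevity.
# The two lists hold the same entries in different orders, mimicking
# the nondeterministic completion order of the worker threads.
actual = [
    {"filename": "br0ken_link.rst",
     "uri": "https://www.sphinx-doc.org/this-is-another-broken-link"},
    {"filename": "broken_link.rst",
     "uri": "https://www.sphinx-doc.org/this-is-a-broken-link"},
]
expected = [
    {"filename": "broken_link.rst",
     "uri": "https://www.sphinx-doc.org/this-is-a-broken-link"},
    {"filename": "br0ken_link.rst",
     "uri": "https://www.sphinx-doc.org/this-is-another-broken-link"},
]

def by_uri(results):
    """Sort results on the "uri" key, which is unique per entry here."""
    return sorted(results, key=lambda r: r["uri"])

# The raw lists differ only in order, so the sorted forms compare equal.
assert by_uri(actual) == by_uri(expected)
```

This keeps the assertion sensitive to content changes while ignoring the thread-scheduling order that caused the flake.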

How to Reproduce

To replicate this problem, we need multiple threads to spin up, each consuming a task from the queue, while emitting their results in an unreliable order.

To achieve that, the following test patch introduces a small random delay after each HyperlinkAvailabilityCheckWorker thread picks up a task from the queue:

diff --git a/sphinx/builders/linkcheck.py b/sphinx/builders/linkcheck.py
index 83f45e471..414adc4d8 100644
--- a/sphinx/builders/linkcheck.py
+++ b/sphinx/builders/linkcheck.py
@@ -320,6 +320,9 @@ class HyperlinkAvailabilityCheckWorker(Thread):
     def run(self) -> None:
         while True:
             next_check, hyperlink = self.wqueue.get()
+            import time
+            from random import randint
+            time.sleep(0.001 * (randint(0, 100) / 100))
             if hyperlink is None:
                 # An empty hyperlink is a signal to shutdown the worker; cleanup resources here
                 self._session.close()

Provided that at least two worker threads are available (see config.linkcheck_workers), this causes test_linkcheck_exclude_documents to fail in approximately 50% of trials.
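The same race can be demonstrated outside Sphinx. Below is a minimal self-contained sketch (all names hypothetical) in the style of the patched worker loop: two threads pull tasks from a work queue, sleep a random sub-millisecond interval, and push results to an output queue. The set of results is stable, but their order is not, which is exactly why an order-sensitive assertion flakes:

```python
import queue
import random
import threading
import time

def worker(wqueue, rqueue):
    # Mimics HyperlinkAvailabilityCheckWorker.run: pull a task, then
    # sleep a random sub-millisecond interval before emitting a result.
    while True:
        task = wqueue.get()
        time.sleep(random.uniform(0, 0.001))
        if task is None:
            # An empty task is the shutdown signal for this worker.
            return
        rqueue.put(task)

wqueue, rqueue = queue.Queue(), queue.Queue()
tasks = ["broken_link", "br0ken_link"]
for t in tasks:
    wqueue.put(t)

threads = [threading.Thread(target=worker, args=(wqueue, rqueue))
           for _ in range(2)]
for t in threads:
    t.start()
for _ in threads:
    wqueue.put(None)  # one shutdown sentinel per worker
for t in threads:
    t.join()

results = [rqueue.get() for _ in tasks]
# The contents are deterministic; the ordering is not.
assert sorted(results) == sorted(tasks)
```

Running this repeatedly yields `results` in either order, so only an order-independent comparison (here, sorting both sides) passes reliably.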

Environment Information

Platform:              linux; (Linux-6.6.15-rt-amd64-x86_64-with-glibc2.37)
Python version:        3.11.8 (main, Feb  7 2024, 21:52:08) [GCC 13.2.0]
Python implementation: CPython
Sphinx version:        7.3.0+/d333c26dc
Docutils version:      0.20.1
Jinja2 version:        3.1.3
Pygments version:      2.17.2

Sphinx extensions

N/A

Additional context

No response

@picnixz (Member) commented Mar 23, 2024:

@jayaddison Can you create a single issue for tracking them? It's really hard for me to keep up with the issues and what remains to be done.

@jayaddison (Contributor, Author) commented:

@picnixz what tools do you prefer for keeping track of related issues? I quite like using GitHub issue labels to group related items. If that's suitable, we could try it - or let me know if there's a better way for you to find them.

@picnixz (Member) commented Mar 23, 2024:

For me the best tool is one big issue with lots of checkboxes (same as #11991 [OK, that one doesn't have boxes]) where you classify the issues by file/module or category.

It's much easier to navigate than labels when the issues are all related to the same ... label (I don't mind labels, but I could end up with 'tests/bug/linkcheck builder', and that's not sufficient for filtering IMO).

Labels are usually for "long-standing" issues; for ones that are "easy", they're overkill IMO.

@jayaddison (Contributor, Author) commented:

Brilliant - yep, I like the checkboxes too.

I think there are two closely related categories of problem here: non-parallelizable unit tests and flaky unit tests. Would it be OK to put them under two separate section headings within a single issue? My concern otherwise is that we'd get a lot of confusion/noise between the two categories if they are tracked separately (still some if combined, but less, I expect).

@picnixz (Member) commented Mar 23, 2024:

Would it be OK to put them under two separate section headings within a single issue?

Yes, it's fine. At least we know what needs to be fixed.

@jayaddison (Contributor, Author) commented:

@jayaddison Can you create a single issue for tracking them? It's really hard for me to keep up with the issues and what remains to be done.

Filed as #12191 - although please note that I opted to add an additional category during the writeup, so there are: flaky tests (reliability), parallelizable tests (performance / utilizing available resources), and independence (tests that affect each other are unreliable).

@jayaddison (Contributor, Author) commented:

Resolved by #12189.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 27, 2024