docs: make documentation reproducible #24
Thanks, we'll check internally whether we have any issue with our doc generation. Can you clarify the original motivation? I am a bit puzzled as to why having the docs be deterministic across build systems (essentially, how the ellipsis is rendered) is important to you.
Hi. Having discussed this a bit further, we don't think that making such modifications to the document source goes in the right direction. Forbidding "doesn't" and "isn't" is certainly going too far. Can you give us a bit more context, such as which doc formats are causing reproducibility problems for you? We could, for example, consider turning off "smartquotes" in text formats like txt/html if that helps.
This is not practical as suggested.
A comma before an ellipsis is common in mathematical typography but quite rare in English.
The reStructuredText variant used by Sphinx recommends two consecutive backquotes for raw text. Single backquotes or double quotes sometimes work, but they interact poorly with each other.
Hello. I have updated and split the suggestion.
We would like all build results to be reproducible, for easier comparison between two builds, but also because nowadays docs often contain executable code. The original issue with xmlada is described at sphinx-doc/sphinx#9778. One could argue that the bug is in the smartquote module, but the module is not, and probably cannot be, fully specified. Even its documentation recommends using the proper character directly when possible. Disabling smartquotes in all formats (not only text and html) would ensure reproducibility, but break the typography. Using the proper character, via either UTF-8 sources or RST substitutions, gives both.
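For illustration, the two options mentioned above (disabling smartquotes everywhere, or spelling out the intended character through an RST substitution) might look like this in a Sphinx `conf.py`. This is only a minimal sketch, not this project's actual configuration: `smartquotes` and `rst_prolog` are standard Sphinx settings, but the substitution names `|hellip|` and `|rsquo|` are made up for the example.

```python
# Hypothetical excerpt of a Sphinx conf.py (illustrative only).

# Option 1: never let the smartquotes transform rewrite quotes,
# dashes, or "..." in any output format.
smartquotes = False

# Option 2: define explicit substitutions for the intended characters.
# rst_prolog is prepended to every source file, so the (assumed) names
# |hellip| and |rsquo| become usable anywhere outside literal blocks.
rst_prolog = """
.. |hellip| unicode:: U+2026
.. |rsquo| unicode:: U+2019
"""
```

A source file could then write `and so on |hellip|` instead of relying on `...` being converted. As noted further down in this thread, substitutions are not expanded inside literal blocks or code-blocks.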
While the first two bullet points on the list are uncontroversial, this one ("insert unicode characters when smartquote fails") is really controversial, as it basically requires people to stop writing the docs in natural English and start writing them in some sort of pseudocode. Unfortunately, this is not going to work well for us. Is this really the only practical recourse for any project trying to use Sphinx to generate its docs?!
Unicode substitutions seem impossible in literal blocks or code-blocks.
There seems to be no easy fix for the reproducibility issue.
Thank you for your contribution! Sorry we couldn't find a fully satisfactory solution :(
Can you clarify what is the original motivation? I am a bit puzzled
why is having doc deterministic across build systems (essentially,
how ellipsis is rendered) important to you?
The motivation for bit-for-bit reproducibility given the exact same
source and build system is described here.
https://en.wikipedia.org/wiki/Reproducible_builds
https://reproducible-builds.org/
Yes, but I want to point out that there are three conditions in the "How" part:
"First, the build system needs to be made entirely deterministic: transforming a given source must always create the same result. For example, the current date and time must not be recorded and output always has to be written in the same order. Second, the set of tools used to perform the build and more generally the build environment should either be recorded or pre-defined. Third, users should be given a way to recreate a close enough build environment, perform the build process, and validate that the output matches the original build."
Emphasis on the second point mine. That is, this initiative doesn't expect builds to be reproducible when the build environment is not controlled, whereas your patch attempts to remove the constraint on the build environment by constraining the project itself instead. It therefore does not look to me to be in the spirit of this initiative.
(I assume we all agree that the difference in how the ellipsis is rendered is fully due to the difference in build environments?)
Sorry for the delay, I am not at ease with GitHub workflows and lost track of this discussion. I thought a merged PR could not be discussed afterwards.
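As an aside, the validation step in the third condition (rebuild, then check that the output matches the original build) can be sketched in a few lines of Python. This is an illustrative sketch only, not tooling from this project or from reproducible-builds.org: the function names `tree_digest` and `builds_match` are hypothetical, and the sketch assumes both builds were written to separate output directories.

```python
import hashlib
from pathlib import Path


def tree_digest(root):
    """Return one SHA-256 digest covering every file under root.

    Files are visited in sorted order so that the digest does not depend
    on the order in which the file system lists directory entries.
    """
    root = Path(root)
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        data = path.read_bytes()
        # Hash the relative path, the size, and the content of each file.
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(str(len(data)).encode("ascii"))
        digest.update(b"\0")
        digest.update(data)
    return digest.hexdigest()


def builds_match(first_build, second_build):
    """True if both output trees contain byte-identical files."""
    return tree_digest(first_build) == tree_digest(second_build)
```

Running `builds_match` on the output directories of two builds made in different locales would flag the kind of difference discussed here, since a converted and an unconverted ellipsis produce different bytes.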
Sphinx fixes typographical errors, or does not, depending on the
current locale, preventing reproducible builds.
This suggestion makes the output more deterministic.
It avoids introducing non-ASCII characters, see
#23.