Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove productionlist hard-coding in translators #13326

Merged
merged 12 commits into from
Feb 13, 2025

Conversation

AA-Turner
Copy link
Member

@AA-Turner AA-Turner commented Feb 10, 2025

Purpose

The productionlist directive operates in a line-based context, creating an addnodes.productionlist container of addnodes.production nodes, with one per production in the directive. However, the full state of the abstract document tree is not included in the produced nodes, with each builder/translator implementing a different way of appending the fixed separator ::= and justifying the displayed text.

This should not happen in the writer, and hard-coding such details hampers flexibility when documenting different abstract grammars (e.g. python/cpython#127835). We should move the specific form of the .. productionlist:: directive to the logic in the directive body, and have the writers apply minimal custom logic.

Open questions

The design as in this PR is split up into several different methods (e.g. make_production, production_separator, production_definitions) for overriding downstream to implement different grammars. However, we might want to introduce a more flexible alternate directive (such as @encukou's 'grammar-snippet') instead. Is this extra complexity of several methods worth it?

The LaTeX translator still uses the macros originally introduced in 2007, I am not confident about changing them. @jfbu would you be able to suggest an approach here?

https://github.com/sphinx-doc/sphinx/blob/master/sphinx/texinputs/sphinxlatexobjects.sty#L281-L294

References

A

@AA-Turner AA-Turner added this to the 8.2.0 milestone Feb 10, 2025
@jfbu
Copy link
Contributor

jfbu commented Feb 11, 2025

The LaTeX translator still uses the macros originally introduced in 2007, I am not confident about changing them. @jfbu would you be able to suggest an approach here?

@AA-Turner I will look on how to follow up on LaTeX aspects. Unrelated: I observed in HTML (and also PDF) output that with this change there is only one space after ::= in place of two. Is this intended?

@jfbu
Copy link
Contributor

jfbu commented Feb 11, 2025

The current LaTeX code uses a table, and injects the & cell separators in paricular for the ::=, which is easily done because it is the LaTeX macro which hardcodes the ::= so it can do as it fancies.

With this change the ::= is as text in the node, I am not sure about an easy way for LaTeX writer to isolate it and be able to follow the & methods to isolate it as a cell in a table.

As a try I decided to mimic the HTML which uses <pre>. So this means in LaTeX we toggle to monospace fonts (so that space characters have same width as letters) and tell to obey spaces in input. The LaTeX writer does nothing at all. Here is the patch. I did not try to maintain vertical spacing before and after from former usage of longtable, only \bigskip for time being.

diff --git a/sphinx/texinputs/sphinxlatexobjects.sty b/sphinx/texinputs/sphinxlatexobjects.sty
index e960440ca..c62d414f6 100644
--- a/sphinx/texinputs/sphinxlatexobjects.sty
+++ b/sphinx/texinputs/sphinxlatexobjects.sty
@@ -281,16 +281,8 @@
 % Production lists
 %
 \newenvironment{productionlist}{%
-%  \def\sphinxoptional##1{{\Large[}##1{\Large]}}
-  \def\production##1##2{\\\sphinxcode{\sphinxupquote{##1}}&::=&\sphinxcode{\sphinxupquote{##2}}}%
-  \def\productioncont##1{\\& &\sphinxcode{\sphinxupquote{##1}}}%
-  \parindent=2em
-  \indent
-  \setlength{\LTpre}{0pt}%
-  \setlength{\LTpost}{0pt}%
-  \begin{longtable}[l]{lcl}
-}{%
-  \end{longtable}
+  \bigskip\obeylines\parskip\z@skip\@vobeyspaces\ttfamily
+}{\par
 }
 
 % Definition lists; requested by AMK for HOWTO documents.  Probably useful
diff --git a/sphinx/writers/latex.py b/sphinx/writers/latex.py
index 72c7e9b3b..59816c76c 100644
--- a/sphinx/writers/latex.py
+++ b/sphinx/writers/latex.py
@@ -678,20 +678,10 @@ def depart_productionlist(self, node: Element) -> None:
         self.in_production_list = False
 
     def visit_production(self, node: Element) -> None:
-        if node['tokenname']:
-            tn = node['tokenname']
-            self.body.append(self.hypertarget(f'grammar-token-{tn}'))
-            self.body.append(r'\production{%s}{' % self.encode(tn))
-        else:
-            self.body.append(r'\productioncont{')
-
-        # remove name/padding and seperator child nodes,
-        # these are handled by '\production' and '\productioncont'
-        # TODO: remove special LaTeX handling of production nodes
-        node[:] = node[2:]
+        pass
 
     def depart_production(self, node: Element) -> None:
-        self.body.append('}' + CR)
+        pass
 
     def visit_transition(self, node: Element) -> None:
         self.body.append(self.elements['transition'])

This worked well but there is annoying problem with continuation lines:

Capture d’écran 2025-02-11 à 11 04 56

The original reST being:

.. productionlist:: lineContinuation
	A: B C D \
	   E F G

The problem here is that inside the .tex file we find

\begin{productionlist}
A ::= B C D    E F G
\end{productionlist}

where there are extra spaces before the E.

But right now I discover that the "line continuation problem" exists with HTML even with current Sphinx master, unrelated to this branch. It did not show in PDF output because LaTeX collapses by default contiguous spaces. Here is a snapshot from an HTML produced from above reST using current master:

Capture d’écran 2025-02-11 à 11 18 47

Here it is with this branch:

Capture d’écran 2025-02-11 à 11 19 16

So the problem with line continuation rendering is not a disqualifier for the above proposed patch for LaTeX rendering. (I should probably try to imitate more the former vertical spacing, as per inducing some extra padding at start of lines as was induced by usage a "longtable" this is a bit more annoying and would have to go via a re-definition of the \obeylines thing, not sure if that is important, or we could use a quote environment to induce the shift).

@jfbu
Copy link
Contributor

jfbu commented Feb 11, 2025

as per inducing some extra padding at start of lines as was induced by usage a "longtable" this is a bit more annoying and would have to go via a re-definition of the \obeylines thing,

sorry, it is very easy actually I only have to use \parindent inside the code and this gives same (or very similar) horizontal shift to the right of all lines as per legacy 2007 code using longtable:

(edited)

% Production lists
%
\newenvironment{productionlist}{%
  \bigskip\obeylines\parindent10pt\parskip\z@skip\@vobeyspaces\ttfamily\indent
}{\par
}

What do you think @picnixz about this somewhat basic but working LaTeX way with above environment productionlist and the LaTeX writer doing actually nothing at all?

@picnixz
Copy link
Member

picnixz commented Feb 11, 2025

There will be an issue when having a multi-line production list. Previously, we could write something like:

.. productionlist::
   try1_stmt: "try" ":" `suite`
            : ("except" [`expression` ["as" `identifier`]] ":" `suite`)+
            : ["else" ":" `suite`]
            : ["finally" ":" `suite`]

and this would render as https://docs.python.org/3/reference/compound_stmts.html#the-try-statement with something like

try1_stmt ::=  "try" ":" suite
               ("except" [expression ["as" identifier]] ":" suite) +
               ["else" ":" suite]
               ["finally" ":" suite]

Without a longtable, we won't be able to align the multi-line declarations, unless the LaTeX writer recomputes the length to skip (we would need to compute how wide the prefix + separator are so that we can change the \parindent, but it would conflict with \obeylines, right?)

@jfbu
Copy link
Contributor

jfbu commented Feb 11, 2025

@picnixz It seems to render fine: (PDF with my patch on top of this branch)

Capture d’écran 2025-02-11 à 11 56 31

Here is how it looks currently with Sphinx master, to compare: (PDF with current master)

Capture d’écran 2025-02-11 à 12 01 12

I mentioned in previous comment about the spaces following ::= this is not LaTeX thing it shows in HTML too, it seems formerly two space tokens were there and this branch removes one of them.

But I see now a reduction in spacing around ::= specific to PDF and specific to my proposed LaTeX patch, which is explained by the disappearance of spaces present in tables between columns, and now there is no table hence no such added spaces. However the needed padding is ok to ensure alignment.

(in the above test I removed the ticks from input to avoid complaints about missing references in the build)

Without a longtable, we won't be able to align the multi-line declarations, unless the LaTeX writer recomputes the length to skip (we would need to compute how wide the prefix + separator are so that we can change the \parindent, but it would conflict with \obeylines, right?)

The spaces are there in the node, and via \@vobeyspaces they will be obeyed in output, even at start of lines (the explicit switch to monospace via \ttfamily makes the spaces same width as letters).

@jfbu
Copy link
Contributor

jfbu commented Feb 11, 2025

are so that we can change the \parindent, but it would conflict with \obeylines, right?)

\obeylines only means to convert a CR in the latex source into a \par. So it does not conflict with \parindent to the contrary it obeys it. But I had to set \parskip to 0pt to avoid extra vertical spaces in-between lines.

I tested the code is compatible with page breaks.

@picnixz
Copy link
Member

picnixz commented Feb 11, 2025

Ah yes, we're keeping the whitespaces (and this is also the reason why the continuation line had issues). Ok my bad.

I tested the code is compatible with page breaks.

I had this follow-up question about production lists spanning multiple pages but you answered it :')

jfbu added a commit to jfbu/sphinx that referenced this pull request Feb 11, 2025
@jfbu
Copy link
Contributor

jfbu commented Feb 11, 2025

I made the LaTeX support at latex_prodlist (HEAD commit, which is on top of current HEAD of the present branch; I hesitated to push directly to this PR, if at all possible). Afaict things are ok. Subtleties are documented for maintainers of the 22nd century.

  • we have commented already on continuation lines, the legacy PDF output did not have the problems of HTML output but now it does because it does about the same as the <pre> tag used by HTML output.
  • one thing perhaps to discuss is that now that ::= is rendered using monospace, the two colons look a bit separated. They do match better the other components though.

jfbu added a commit to jfbu/sphinx that referenced this pull request Feb 11, 2025
jfbu added a commit to jfbu/sphinx that referenced this pull request Feb 11, 2025
@AA-Turner
Copy link
Member Author

AA-Turner commented Feb 11, 2025

Thank you for the suggested changes to the LaTeX writer!

I observed in HTML (and also PDF) output that with this change there is only one space after ::= in place of two. Is this intended?

I'm happy to revert to two spaces, we now normalise whitespace which explains the change to ::= instead of ::= . I'm not sure what the right convention is here, though.

A

@jfbu
Copy link
Contributor

jfbu commented Feb 11, 2025

@AA-Turner I am realizing only now something problematic. Even before this PR not all links worked in PDF output (if one uses for example the tests/roots/test-productionlist/index.rst test. With my LaTeX support commit, now all links are gone! I was naively expecting the targets were created elsewhere because cursory look at HTML writer did not show me link target creations but I obviously must have missed something crucial...

@AA-Turner
Copy link
Member Author

The HTML targets (id) are I think created by self.starttag(node, 'strong', '') in visit_literal_strong.

For the texinfo writer, I had to add:

    def visit_literal_strong(self, node: Element) -> None:
        if self.in_production_list:
            for id_ in node['ids']:
                self.add_anchor(id_, node)
            return

@jfbu
Copy link
Contributor

jfbu commented Feb 11, 2025

@AA-Turner brief testing seems to indicate this commit is fine. I modelled it on your texinfo support code.

(edit: CI test failure appears to be unrelated with this PR)

It actually fixes an issue which had gone unnoticed! With the test as configured in tests/roots/test-productionlist/index.rst all hyperlinks now exist in PDF output!

@AA-Turner AA-Turner merged commit c49d925 into sphinx-doc:master Feb 13, 2025
23 checks passed
@AA-Turner AA-Turner deleted the productionlist branch February 13, 2025 00:16
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 14, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants