Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

admonition extension returns **invalid** html code without blank line. #1329

Closed
lmende opened this issue Mar 23, 2023 · 11 comments · Fixed by #1380
Closed

admonition extension returns **invalid** html code without blank line. #1329

lmende opened this issue Mar 23, 2023 · 11 comments · Fixed by #1380
Assignees
Labels
bug Bug report. confirmed Confirmed bug report or approved feature request. extension Related to one or more of the included extensions.

Comments

@lmende
Copy link

lmende commented Mar 23, 2023

In [8]: import markdown
In [9]: some_text = """
   ...: !!! danger "Danger"
   ...:     don't try this at home.
   ...:
   ...: """
In [10]: markdown.markdown(some_text, tab_length=2, extensions=['admonition'])
Out[10]: '<div class="admonition danger">\n<p class="admonition-title">Danger<p>don\'t try this at home.</p>\n</p>\n</div>'
  1. A paragraph inside a paragraph is invalid html and etree parser complains!
  2. the simple solution of turning the title paragraph into a div will work, but it wraps the inner paragraph, resulting in a CSS mess.

Recommended working solution:

     CLASSNAME_TITLE = 'admonition-title'
+    CLASSNAME_BODY = 'admonition-body'
     RE = re.compile(r'(?:^|\n)!!! ?([\w\-]+(?: +[\w\-]+)*)(?: +"(.*?)")? *(?:\n|$)')

@@ -130,9 +134,11 @@ class AdmonitionProcessor(BlockProcessor):
             div = etree.SubElement(parent, 'div')
             div.set('class', '{} {}'.format(self.CLASSNAME, klass))
             if title:
-                p = etree.SubElement(div, 'p')
+                p = etree.SubElement(div, 'div')
                 p.text = title
                 p.set('class', self.CLASSNAME_TITLE)
+                p = etree.SubElement(div, 'div')
+                p.set('class', self.CLASSNAME_BODY)

Note: the parse_content() method of your admonition extension is an unreadable code-mess ;-)

@waylan
Copy link
Member

waylan commented Mar 23, 2023

Your proposed change alters the HTML output. Note that the HTML output is defined by what docutils outputs for admonitions. We are simply matching their output. In other words, as docutils outputs the title in a p tag, so must we. Therefore, any change to the output which changes that is not acceptable.

I'm assuming the issue it related to the the tab length. Presumably somewhere within the extension tab length is assumed to always be 4. We should instead be pulling the value in from the Markdown class instance. A brief look at the code reveals that we are already, but maybe it is being missed in one obscure location. This will take some digging in to find.

This is why we have been working on blocks which don't rely on indentation. You might want to try block based admonitions. They will not suffer from this issue.

@facelessuser
Copy link
Collaborator

The element doesn't need to be changed, the content is indented and should be seen as an indented code block, but somehow it is included in the summary. We need to ensure it doesn't get inserted into the summary. We'll have to take a look for a way to handle this properly, but I don't think your suggestion is the way to go.

As a workaround, if you intend to use an indented code block, put an empty line between admonition headers and the block, and if you did not intend to have an indented code block, ensure the paragraph is not indented as such.

@facelessuser
Copy link
Collaborator

Further, tab_length has nothing to do with this. This behavior is seen with default tab length as well if you indent the first paragraph 8 spaces after the admonition header.

@waylan
Copy link
Member

waylan commented Mar 23, 2023

So if I'm understanding correctly, if there is no blank line between the opening line (!!!) and the body and the first line of the body is indented two levels, then the first block in the body is inserted as a child p of the title. However, if either a blank line is added or the indentation is reduced to a single level, then the error does not occur.

I think I know what the problem is. It is related to the workarounds for the pesky block-parser, which splits everything into blocks by blank lines. That means that the entire body of an admonition is not fed in all together (each line-separated-block is fed in separately). Therefore, the code in parse_content which @lmende didn't understand is trying to determine if the current block is a child of the admonition or not. It does so by checking if the last child of the parent is an admonition. If so and the current block in indented, then it is assumed that the current block is also a part of the admonition. The problem occurs when there is no blank line. Everything after the opening line/title is broken off for future parsing. Apparently we are not handling indention properly here. I suspect we are prematurely dedenting. However, I have no idea why the content is being inserted as a child of the title.

@facelessuser
Copy link
Collaborator

It certainly needs investigation. I'm fairly certain it can be reasonably fixed, but I'd have to dig in and understand the conditions that cause it.

@waylan waylan added bug Bug report. extension Related to one or more of the included extensions. confirmed Confirmed bug report or approved feature request. labels Mar 23, 2023
@waylan waylan changed the title admonition extension returns **invalid** html code if tab_length=2 admonition extension returns **invalid** html code without blank line. Apr 18, 2023
@ferdnyc
Copy link
Contributor

ferdnyc commented Aug 26, 2023

@facelessuser

Further, tab_length has nothing to do with this. This behavior is seen with default tab length as well if you indent the first paragraph 8 spaces after the admonition header.

That doesn't seem to be entirely true — consider the following differences:

>>> text = """
... !!! danger "This is not"
...     one long admonition title
... """
>>> print(markdown.markdown(text, extensions=['admonition']))
<div class="admonition danger">
<p class="admonition-title">This is not</p>
<p>one long admonition title</p>
</div>
>>> print(markdown.markdown(text, tab_length=2, extensions=['admonition']))
<div class="admonition danger">
<p class="admonition-title">This is not<p>one long admonition title</p>
</p>
</div>

It seems to me that tab_length has everything to do with it. The blank line is what actually has nothing to do with it — it screws up even when it's present. Observe:

>>> text = """
... !!! danger "This is not"
... 
...     one long admonition title
... """
>>> print(markdown.markdown(text, extensions=['admonition']))
<div class="admonition danger">
<p class="admonition-title">This is not</p>
<p>one long admonition title</p>
</div>
>>> print(markdown.markdown(text, tab_length=2, extensions=['admonition']))
<div class="admonition danger">
<p class="admonition-title">This is not<p>one long admonition title</p>
</p>
</div>

@facelessuser
Copy link
Collaborator

@ferdnyc, I don't think you understand my statement. Using tab_length 2 isn't the problem, it is indenting the first line more than tab length (whatever that may be). The actual tab length doesn't matter.

If you are using tab length 2, you should only indent your admonition content by 2. Indenting the first line in this case by 4 should trigger a code block, but it doesn't.

If you use the default tab length of 4, you should indent your admonition content by 4. Indenting the first line in this case by 8 should trigger a code block, but it doesn't.

>>> import markdown
>>> md = """
... !!! danger "This is not"
...   one long admonition title
... """
>>> print(markdown.markdown(md, tab_length=2, extensions=['admonition']))
<div class="admonition danger">
<p class="admonition-title">This is not</p>
<p>one long admonition title</p>
</div>
>>> md = """
... !!! danger "This is not"
...     one long admonition title
... """
>>> print(markdown.markdown(md, tab_length=2, extensions=['admonition']))
<div class="admonition danger">
<p class="admonition-title">This is not<p>one long admonition title</p>
</p>
</div>
>>> md = """
... !!! danger "This is not"
...     one long admonition title
... """
>>> print(markdown.markdown(md, tab_length=4, extensions=['admonition']))
<div class="admonition danger">
<p class="admonition-title">This is not</p>
<p>one long admonition title</p>
</div>
>>> md = """
... !!! danger "This is not"
...         one long admonition title
... """
>>> print(markdown.markdown(md, tab_length=4, extensions=['admonition']))
<div class="admonition danger">
<p class="admonition-title">This is not<p>one long admonition title</p>
</p>
</div>

So, my statement is that what you set the tab length to doesn't matter, it is the handling of the first line indent. Nothing I've seen has proven my statement false. With that said, I haven't had a chance to root cause this issue yet.

@ferdnyc
Copy link
Contributor

ferdnyc commented Aug 26, 2023

@facelessuser

Fair enough. Although, based on the tests I showed, this workaround... doesn't?

As a workaround, if you intend to use an indented code block, put an empty line between admonition headers and the block, and if you did not intend to have an indented code block, ensure the paragraph is not indented as such.

@facelessuser
Copy link
Collaborator

It's been long enough that I don't recall if I tried out the workaround, or assumed the workaround should work, but you are correct that a new line does not fix the issue. It seems using fenced code blocks does work, or at least when using SuperFences (Python Markdown's fenced_code does not support being nested under other constructs like this).

@facelessuser
Copy link
Collaborator

I should also mention that the new Pymdown Extension's Admontion does not have this issue. A new line between the header is required if indenting as indenting directly under the header is how you specify YAML options

>>> import markdown
>>> md = """
... /// danger | This is not 
... one long admonition title
... ///
... """
>>> print(markdown.markdown(md, tab_length=2, extensions=['pymdownx.blocks.admonition']))
<div class="admonition danger">
<p class="admonition-title">This is not</p>
<p>one long admonition title</p>
</div>
>>> md = """
... /// danger | This is not 
... 
...     one long admonition title
... ///
... """
>>> print(markdown.markdown(md, tab_length=2, extensions=['pymdownx.blocks.admonition']))
<div class="admonition danger">
<p class="admonition-title">This is not</p>
<pre><code>  one long admonition title
</code></pre>
</div>

With that said, this has been a reminder that I should try and look into this issue here.

@facelessuser
Copy link
Collaborator

I think I have this solved, a fix should be coming shortly.

facelessuser added a commit to facelessuser/markdown that referenced this issue Sep 2, 2023
Fix a corner case in admonitions where if an indented code block was
provided as the first block, the output would be malformed.

Fixes Python-Markdown#1329
@facelessuser facelessuser self-assigned this Sep 2, 2023
waylan pushed a commit that referenced this issue Sep 5, 2023
Fix a corner case in admonitions where if an indented code block was
provided as the first block, the output would be malformed.

Fixes #1329.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Bug report. confirmed Confirmed bug report or approved feature request. extension Related to one or more of the included extensions.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants
@waylan @ferdnyc @facelessuser @lmende and others