Footnote ref number in TOC #660
Comments
The code that sanitizes heading text for use in the TOC is pretty simple: it just pulls the text out of the HTML elements. Excluding footnote refs could make it significantly more complex, and I find it odd that we would need to do this only for a non-standard add-on syntax. Additionally, the fact that this is only being reported now suggests that it is an unusual edge case that few users will encounter. That said, the current behavior is clearly not what one would expect and should probably be fixed. Of course, pull requests are welcome.
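To illustrate why simply pulling text out of the HTML elements picks up the footnote number, here is a minimal sketch. It is not the extension's actual code: `naive_text` and the sample heading are hypothetical, and the markup assumes a footnote reference renders as a `<sup>` wrapping a link.

```python
import re

# Assumed shape of a rendered heading that contains a footnote reference.
heading_html = (
    'Heading<sup id="fnref:1">'
    '<a class="footnote-ref" href="#fn:1">1</a></sup>'
)


def naive_text(fragment: str) -> str:
    """Drop every tag but keep all text between tags (hypothetical helper)."""
    return re.sub(r'<[^>]+>', '', fragment)


# The footnote number survives because it is ordinary text inside the link.
print(naive_text(heading_html))  # Heading1
```

Any approach that only discards tags has this problem, since the reference number is text content rather than markup.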
- All postprocessors are run on heading content (not just `RawHtmlPostprocessor`).
- Footnote references are stripped from heading content. Fixes Python-Markdown#660.
- A more robust `striptags` is provided to convert headings to plain text. Unlike markupsafe's implementation, HTML entities are not unescaped.
- Both the plain text `name` and rich `html` are saved to `toc_tokens`, which means users can now access the full rich text content of the headings directly from the `toc_tokens`.
- `data-toc-label` is sanitized separately from heading content.
- A `html.unescape` call is added to `slugify` and `slugify_unicode`, which ensures `slugify` operates on Unicode characters rather than HTML entities. Because the call is included in the functions themselves, users can override it with their own slugify functions if they desire.

Note that this first commit includes minimal changes to the tests, to show that there is very little change in behavior (mostly the addition of the new `html` attribute to the `toc_tokens`). A refactoring of the tests will be in a separate commit.
* All postprocessors are run on heading content.
* Footnote references are stripped from heading content. Fixes #660.
* A more robust `striptags` is provided to convert headings to plain text. Unlike the `markupsafe` implementation, HTML entities are not unescaped.
* The plain text `name`, rich `html` and unescaped raw `data-toc-label` are saved to `toc_tokens`, allowing users to access the full rich text content of the headings directly from `toc_tokens`.
* `data-toc-label` is sanitized separately from heading content.
* A `html.unescape` call is made just prior to calling `slugify` so that `slugify` only operates on Unicode characters. Note that `html.unescape` is not run on the `name` or `html`.
* The `get_name` and `stashedHTML2text` functions defined in the `toc` extension are both **deprecated**. Instead, use some combination of `run_postprocessors`, `render_inner_html` and `striptags`.

Co-authored-by: Oleh Prypin <oleh@pryp.in>
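The pipeline described in the list above can be sketched roughly as follows. This is a standalone approximation under stated assumptions, not the `toc` extension's implementation: `strip_footnote_refs`, `striptags_sketch`, and `slugify_sketch` are hypothetical names, the footnote markup shape is assumed, and regular expressions stand in for whatever parsing the real code does.

```python
import html
import re
import unicodedata

# Assumption: a footnote reference renders as <sup id="fnref...">...</sup>.
FOOTNOTE_REF_RE = re.compile(
    r'<sup[^>]*\bid="fnref[^"]*"[^>]*>.*?</sup>', re.DOTALL
)
TAG_RE = re.compile(r'<[^>]+>')


def strip_footnote_refs(fragment: str) -> str:
    """Remove footnote reference markup, including the number inside it."""
    return FOOTNOTE_REF_RE.sub('', fragment)


def striptags_sketch(fragment: str) -> str:
    """Drop tags and collapse whitespace; HTML entities are left escaped."""
    return ' '.join(TAG_RE.sub('', fragment).split())


def slugify_sketch(value: str, separator: str = '-') -> str:
    """Simple ASCII slugifier that expects already-unescaped text."""
    value = unicodedata.normalize('NFKD', value)
    value = value.encode('ascii', 'ignore').decode('ascii')
    value = re.sub(r'[^\w\s-]', '', value).strip().lower()
    return re.sub(r'[-\s]+', separator, value)


heading_html = (
    'Tom &amp; <em>Jerry</em><sup id="fnref:1">'
    '<a class="footnote-ref" href="#fn:1">1</a></sup>'
)

rich = strip_footnote_refs(heading_html)    # 'Tom &amp; <em>Jerry</em>'
name = striptags_sketch(rich)               # 'Tom &amp; Jerry'
slug = slugify_sketch(html.unescape(name))  # 'tom-jerry'
```

Unescaping only at the slug step keeps `name` and `rich` safe to embed in HTML. Skipping the `html.unescape` call would turn the entity into slug text: `slugify_sketch(name)` gives `'tom-amp-jerry'`.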