Using emojis with blog plugin causes crash #5555

Closed
4 tasks done
perpil opened this issue May 23, 2023 · 2 comments
Labels
bug (Issue reports a bug), resolved (Issue is resolved, yet unreleased if open)

Comments

@perpil
Sponsor Contributor

perpil commented May 23, 2023

Context

Including emojis in blog content (in my case 💻) causes crashes during serve and build.

Bug description

If you include an emoji character like 💻 anywhere in your blog content, it crashes with:

lxml.etree.XMLSyntaxError: Char 0x0 out of allowed range, line 1, column 2

Using emojis on main site pages works fine. I tried to work around it with the pymdownx.emoji extension, but I need the emoji inside a code fence, and it doesn't replace emojis inside code fences (likely by design).

Full trace:

INFO     -  DeprecationWarning: pkg_resources is deprecated as an API
              File
            "/Users/david/Documents/GitHub/mkdocs-material/material/plugins/info/plugin.py",
            line 33, in <module>
                from pkg_resources import get_distribution, working_set
              File
            "/opt/homebrew/lib/python3.10/site-packages/pkg_resources/__init__.py",
            line 121, in <module>
                warnings.warn("pkg_resources is deprecated as an API",
            DeprecationWarning)
INFO     -  DeprecationWarning: Deprecated call to
            `pkg_resources.declare_namespace('google')`.
            Implementing implicit namespace packages (as specified in PEP 420)
            is preferred to `pkg_resources.declare_namespace`. See
            https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
              File
            "/opt/homebrew/lib/python3.10/site-packages/pkg_resources/__init__.py",
            line 2870, in activate
                declare_namespace(pkg)
              File
            "/opt/homebrew/lib/python3.10/site-packages/pkg_resources/__init__.py",
            line 2338, in declare_namespace
                warnings.warn(msg, DeprecationWarning, stacklevel=2)
INFO     -  DeprecationWarning: Deprecated call to
            `pkg_resources.declare_namespace('google.logging')`.
            Implementing implicit namespace packages (as specified in PEP 420)
            is preferred to `pkg_resources.declare_namespace`. See
            https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
              File
            "/opt/homebrew/lib/python3.10/site-packages/pkg_resources/__init__.py",
            line 2870, in activate
                declare_namespace(pkg)
              File
            "/opt/homebrew/lib/python3.10/site-packages/pkg_resources/__init__.py",
            line 2338, in declare_namespace
                warnings.warn(msg, DeprecationWarning, stacklevel=2)
INFO     -  DeprecationWarning: Deprecated call to
            `pkg_resources.declare_namespace('mpl_toolkits')`.
            Implementing implicit namespace packages (as specified in PEP 420)
            is preferred to `pkg_resources.declare_namespace`. See
            https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
              File
            "/opt/homebrew/lib/python3.10/site-packages/pkg_resources/__init__.py",
            line 2870, in activate
                declare_namespace(pkg)
              File
            "/opt/homebrew/lib/python3.10/site-packages/pkg_resources/__init__.py",
            line 2338, in declare_namespace
                warnings.warn(msg, DeprecationWarning, stacklevel=2)
INFO     -  DeprecationWarning: Deprecated call to
            `pkg_resources.declare_namespace('ruamel')`.
            Implementing implicit namespace packages (as specified in PEP 420)
            is preferred to `pkg_resources.declare_namespace`. See
            https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
              File
            "/opt/homebrew/lib/python3.10/site-packages/pkg_resources/__init__.py",
            line 2870, in activate
                declare_namespace(pkg)
              File
            "/opt/homebrew/lib/python3.10/site-packages/pkg_resources/__init__.py",
            line 2338, in declare_namespace
                warnings.warn(msg, DeprecationWarning, stacklevel=2)
INFO     -  Building documentation...
INFO     -  Cleaning site directory
INFO     -  The following pages exist in the docs directory, but are not
            included in the "nav" configuration:
              - index.md
ERROR    -  Error reading page 'blog/posts/hello-world.md': Document is empty
Traceback (most recent call last):
  File "/opt/homebrew/lib/python3.10/site-packages/pyquery/pyquery.py", line 59, in fromstring
    result = getattr(etree, meth)(context)
  File "src/lxml/etree.pyx", line 3257, in lxml.etree.fromstring
  File "src/lxml/parser.pxi", line 1916, in lxml.etree._parseMemoryDocument
  File "src/lxml/parser.pxi", line 1796, in lxml.etree._parseDoc
  File "src/lxml/parser.pxi", line 1085, in lxml.etree._BaseParser._parseUnicodeDoc
  File "src/lxml/parser.pxi", line 618, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 728, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 657, in lxml.etree._raiseParseError
  File "<string>", line 1
lxml.etree.XMLSyntaxError: Char 0x0 out of allowed range, line 1, column 2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/homebrew/bin/mkdocs", line 8, in <module>
    sys.exit(cli())
  File "/opt/homebrew/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/opt/homebrew/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/opt/homebrew/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/homebrew/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/homebrew/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/opt/homebrew/lib/python3.10/site-packages/mkdocs/__main__.py", line 234, in serve_command
    serve.serve(dev_addr=dev_addr, livereload=livereload, watch=watch, **kwargs)
  File "/opt/homebrew/lib/python3.10/site-packages/mkdocs/commands/serve.py", line 83, in serve
    builder(config)
  File "/opt/homebrew/lib/python3.10/site-packages/mkdocs/commands/serve.py", line 76, in builder
    build(config, live_server=live_server, dirty=dirty)
  File "/opt/homebrew/lib/python3.10/site-packages/mkdocs/commands/build.py", line 308, in build
    _populate_page(file.page, config, files, dirty)
  File "/opt/homebrew/lib/python3.10/site-packages/mkdocs/commands/build.py", line 177, in _populate_page
    page.markdown = config.plugins.run_event(
  File "/opt/homebrew/lib/python3.10/site-packages/mkdocs/plugins.py", line 520, in run_event
    result = method(item, **kwargs)
  File "/Users/david/Documents/GitHub/mkdocs-material/material/plugins/blog/plugin.py", line 357, in on_page_markdown
    read = readtime.of_markdown(markdown, rate)
  File "/opt/homebrew/lib/python3.10/site-packages/readtime/api.py", line 40, in of_markdown
    return utils.read_time(markdown, format='markdown', wpm=wpm)
  File "/opt/homebrew/lib/python3.10/site-packages/readtime/utils.py", line 48, in read_time
    el = pq(html)
  File "/opt/homebrew/lib/python3.10/site-packages/pyquery/pyquery.py", line 212, in __init__
    elements = fromstring(context, self.parser)
  File "/opt/homebrew/lib/python3.10/site-packages/pyquery/pyquery.py", line 63, in fromstring
    result = getattr(lxml.html, meth)(context)
  File "/opt/homebrew/lib/python3.10/site-packages/lxml/html/__init__.py", line 873, in fromstring
    doc = document_fromstring(html, parser=parser, base_url=base_url, **kw)
  File "/opt/homebrew/lib/python3.10/site-packages/lxml/html/__init__.py", line 761, in document_fromstring
    raise etree.ParserError(
lxml.etree.ParserError: Document is empty

Reproduction

example.zip

Steps to reproduce

  1. mkdocs build

Note that if you delete the file docs/blog/posts/hello-world.md and build again, it works. index.md also contains an emoji: 💻

squidfunk added the needs investigation and bug labels, and removed the needs investigation label, on May 24, 2023
@squidfunk
Owner

Thanks for reporting! This is actually a bug in the readtime library, which we use to compute reading time.

2d35d0943 mitigates the problem until this is fixed.
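
The commit itself is not quoted in this thread, so the following is only a minimal sketch of one way such a mitigation could look, not the actual change. It assumes the failure is triggered by the emoji reaching the downstream parser, and replaces non-ASCII characters with ASCII backslash escapes before the text is handed to the reading-time estimator. `escape_non_ascii`, `estimate_minutes`, and `safe_read_minutes` are hypothetical names, the estimator is a simple stand-in for `readtime.of_markdown`, and the default of 265 words per minute is an assumption.

```python
import math
import re


def escape_non_ascii(text: str) -> str:
    """Replace each non-ASCII character with an ASCII backslash escape.

    ASCII characters, including newlines, are left untouched, so the
    number of whitespace-separated words does not change.
    """
    return text.encode("ascii", errors="backslashreplace").decode("ascii")


def estimate_minutes(text: str, wpm: int = 265) -> int:
    """Hypothetical stand-in estimator: word count divided by words per minute."""
    words = len(re.findall(r"\S+", text))
    return max(1, math.ceil(words / wpm))


def safe_read_minutes(markdown: str, estimator=estimate_minutes, wpm: int = 265) -> int:
    """Escape the input first, then delegate to the estimator."""
    return estimator(escape_non_ascii(markdown), wpm)


if __name__ == "__main__":
    post = "Hello world \U0001F4BB\n\nSome code: print('\U0001F4BB')"
    print(escape_non_ascii(post))
    print(safe_read_minutes(post))
```

Escaping rather than deleting the characters keeps each emoji as one token, so the word count, and with it the estimated reading time, stays the same as for the original text.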

squidfunk added the resolved label on May 26, 2023
@squidfunk
Owner

Released as part of 9.1.15+insiders.4.35.2.
