Auto content trimming #353

Closed
eladrotem opened this issue Jul 16, 2023 · 9 comments

@eladrotem

When converting HTML text, the content of an HTML node is automatically trimmed of whitespace, which leads to undesired behavior.

For example, when converting the following:

... example html <i>code </i>block

(which is common since double clicking a word usually includes the following space)

the result is:

... example html codeblock

Is it possible to disable the trimming or make it optional/configurable?
Thanks!

@mysticmind
Owner

Acknowledging that I have seen this issue; I will take a look and keep you posted.

@mysticmind mysticmind added the bug label Oct 7, 2023
@rsuk-mb

rsuk-mb commented Jan 3, 2024

@mysticmind any update on this one? Wondering if you think there is a mitigation / workaround we can use in the meantime?

@mysticmind
Owner

mysticmind commented Jan 3, 2024

@rsuk-mb Hey, thanks for following up on this. Trim is used in a lot of places, and handling this will need a detailed look. If you have any specific use cases where this is causing larger problems for you, please list them and I will try to handle those first.

@rsuk-mb

rsuk-mb commented Jan 3, 2024

I think the scenario here is as @eladrotem described - the original HTML contains significant whitespace inside the <em> or <strong> tag and by trimming it the output is no longer correct - what were previously 2 words are now one.

Unfortunately this is a really common scenario in HTML editors - for example in the TinyMCE editor (https://www.tiny.cloud/docs/tinymce/6/full-featured-premium-demo), double click a word (e.g. "Welcome" in the demo) and it will select the word and the space following. Select bold and you end up with this in the html:

<h2><strong>Welcome </strong>to the TinyMCE Cloud demo!</h2>

Which once converted will yield:

## **Welcome**to the TinyMCE Cloud demo!

The simplest solution seems to be to add an option to disable the default trimming behaviour for these tags to enable a lossless conversion.

@mysticmind I am curious about the rationale for including the trim in the first place for these elements - was there a scenario where whitespace that wasn't significant in rendered HTML was impacting the markdown generation? Would love to understand that to anticipate any edge cases that would arise if the trimming was disabled.

@mysticmind
Owner

@rsuk-mb There is no specific rationale as far as I recall; it is primarily there to trim spaces around block tags. But the implementation applies it in many places, which causes issues. I will have a look at it and try to tweak the implementation accordingly.

@mysticmind
Owner

@rsuk-mb I managed to add fixes to clean up the Trim calls, especially for inline tags. Things look good now, and the use case outlined in the issue now passes.

@mysticmind mysticmind added this to the 4.1.0 milestone Jan 3, 2024
@rsuk-mb

rsuk-mb commented Jan 3, 2024

Thanks for such a quick response. Unfortunately, I think I have now discovered the rationale for trimming whitespace: bold and italic markdown with spaces at the start or end is not recognised as bold - e.g. **this ** is not valid markdown :( Wondering if the solution here would be to move the trailing whitespace so that it appears outside of the markdown markers - e.g. "<strong>this </strong>text" -> "**this** text".
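
The idea suggested above (keep the emphasis markers tight against the text and push any leading or trailing whitespace outside them) can be sketched roughly as follows. This is only an illustration in Python, not the library's actual implementation; `wrap_emphasis` is a hypothetical helper name made up for this example.

```python
def wrap_emphasis(text: str, marker: str = "**") -> str:
    """Wrap text in an emphasis marker, keeping edge whitespace outside it.

    Hypothetical helper for illustration only; not part of any library.
    """
    core = text.strip()
    if not core:
        # Nothing to emphasise: return the whitespace unchanged.
        return text
    # Whitespace that preceded / followed the emphasised text in the HTML.
    lead = text[: len(text) - len(text.lstrip())]
    trail = text[len(text.rstrip()):]
    return f"{lead}{marker}{core}{marker}{trail}"


# Content of <strong>Welcome </strong> from the TinyMCE example in this thread.
heading = "## " + wrap_emphasis("Welcome ") + "to the TinyMCE Cloud demo!"
print(heading)  # ## **Welcome** to the TinyMCE Cloud demo!
```

With this approach the two words stay separated and the markers remain adjacent to non-whitespace characters, which is what Markdown parsers need in order to recognise the emphasis.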

@mysticmind
Owner

mysticmind commented Jan 3, 2024

@rsuk-mb Did you test this use case with 4.1.0, which is released now? The case you are describing is already dealt with.
