Auto content trimming #353
Comments
Acknowledge seeing this issue, will take a look and keep you posted.
@mysticmind any update on this one? Wondering if you think there is a mitigation / workaround we can use in the meantime?
@rsuk-mb Hey, thanks for following up on this,
I think the scenario here is as @eladrotem described: the original HTML contains significant whitespace inside the <em> or <strong> tag, and trimming it makes the output incorrect, because what were previously two words are now one. Unfortunately this is a really common scenario in HTML editors. For example, in the TinyMCE editor (https://www.tiny.cloud/docs/tinymce/6/full-featured-premium-demo), double-click a word (e.g. "Welcome" in the demo) and it will select the word and the space following it. Select bold and you end up with this in the HTML:

<h2><strong>Welcome </strong>to the TinyMCE Cloud demo!</h2>

Which, once converted, will yield:

## **Welcome**to the TinyMCE Cloud demo!

The simplest solution seems to be to add an option to disable the default trimming behaviour for these tags, to enable a lossless conversion.

@mysticmind I am curious about the rationale for including the trim in the first place for these elements. Was there a scenario where whitespace that wasn't significant in rendered HTML was impacting the markdown generation? Would love to understand that, to anticipate any edge cases that would arise if the trimming was disabled.
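To make the failure mode concrete, here is a minimal Python sketch. It is not the library's code: `convert_strong` is a hypothetical toy converter that handles only `<strong>`, and its `trim` flag stands in for the configurable option proposed above.

```python
import re


def convert_strong(html: str, trim: bool) -> str:
    """Hypothetical toy converter: rewrite <strong>...</strong> as **...**.

    `trim` mimics the trimming behaviour discussed in this issue; it is an
    illustrative flag, not an option of any real library.
    """
    def repl(match: re.Match) -> str:
        inner = match.group(1)
        if trim:
            inner = inner.strip()  # drops the significant trailing space
        return f"**{inner}**"

    return re.sub(r"<strong>(.*?)</strong>", repl, html)


html = "<strong>Welcome </strong>to the TinyMCE Cloud demo!"
trimmed = convert_strong(html, trim=True)     # words run together
untrimmed = convert_strong(html, trim=False)  # space kept inside the markers
```

With `trim=True` the two words are joined, as in the output above. With `trim=False` the space survives, but it ends up inside the `**` markers.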
@rsuk-mb There is no specific rationale as I recall; it is primarily there to trim spaces around block tags. But the implementation applies it in many places, which causes issues. I will have a look at it and try to tweak the implementation accordingly.
@rsuk-mb I managed to add fixes to clean up the trims, especially for inline tags. Things look good now, and the use case outlined in the issue is passing.
Thanks for such a quick response. Unfortunately, I think I have now discovered the rationale for trimming whitespace: bold and italic markdown with spaces at the start or end is not recognised as bold or italic, e.g. **this ** is not valid markdown :( Wondering if the solution here would be to move the trailing whitespace so that it appears outside of the markdown markers, e.g. "<strong>this </strong>text" -> "**this** text".
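The idea of moving the whitespace outside the markers could be sketched like this in Python. This is an illustration only, not how the library implements it; `wrap_inline` and its `marker` parameter are hypothetical names.

```python
def wrap_inline(inner: str, marker: str = "**") -> str:
    """Wrap inline content in emphasis markers, hoisting edge whitespace.

    Leading and trailing whitespace is moved outside the markers so the
    emphasis stays valid and the word boundaries are preserved.
    """
    stripped = inner.strip()
    if not stripped:
        # Whitespace-only content: emit it unchanged, without markers.
        return inner
    lead = inner[: len(inner) - len(inner.lstrip())]
    trail = inner[len(inner.rstrip()):]
    return f"{lead}{marker}{stripped}{marker}{trail}"


bold = wrap_inline("this ") + "text"
italic = "some" + wrap_inline(" words ", marker="*") + "here"
```

Here `bold` is `**this** text` and `italic` is `some *words* here`, so the conversion stays lossless while the emphasis markers sit flush against the text.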
@rsuk-mb Did you test this use case with 4.1.0, which is released now? The case you are describing is already dealt with.
When converting HTML text, the content of an HTML node is automatically trimmed of spaces, which leads to undesired behavior.
For example, when converting the following:
(which is common since double clicking a word usually includes the following space)
the result is:
Is it possible to disable the trimming or make it optional/configurable?
Thanks!