Sanitizer does not remove comment but converts it to plain html #470

Open
Sicos1977 opened this issue Sep 27, 2023 · 6 comments

Comments

@Sicos1977

Sicos1977 commented Sep 27, 2023

I use the latest version from NuGet (not a beta version). When sanitizing the attached HTML, the sanitizer does not remove the comment between the script tags; for some reason the comment is converted to plain HTML instead.

image

comment.zip

@mganss
Owner

mganss commented Sep 27, 2023

What is your configuration? HTML comment syntax used inside a script element does not create an HTML comment; it becomes part of the script's text.
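The parsing rule can be illustrated with a minimal sketch. It uses Python's standard-library `html.parser` purely as a stand-in for a generic HTML parser; it is not HtmlSanitizer (a .NET library) and says nothing about how that library is implemented. The names `CommentProbe` and `probe` are hypothetical, chosen only for this illustration.

```python
from html.parser import HTMLParser


class CommentProbe(HTMLParser):
    """Hypothetical helper: records real comments and raw script text."""

    def __init__(self):
        super().__init__()
        self.comments = []      # data passed to handle_comment
        self.script_text = []   # character data seen inside <script>
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_comment(self, data):
        # Called only for genuine HTML comments.
        self.comments.append(data)

    def handle_data(self, data):
        # Script content is delivered as plain character data.
        if self._in_script:
            self.script_text.append(data)


def probe(html):
    """Return (list of comments, concatenated script text) for the input."""
    parser = CommentProbe()
    parser.feed(html)
    parser.close()
    return parser.comments, "".join(parser.script_text)


if __name__ == "__main__":
    sample = "<!-- real --><script><!-- var x = 1; //--></script>"
    comments, script_text = probe(sample)
    print("comments:", comments)
    print("script text:", script_text)
```

In this sketch only the comment outside the script element reaches `handle_comment`; the comment-like run inside the script element is reported as script text. If a script element's content is kept or re-emitted as text, the `<!-- ... -->` characters would travel with it, which is one plausible way such text could end up visible in the output.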

@Sicos1977
Author

What do you mean by configuration? I don't understand the question.

The HTML is coming from an e-mail that is sent to us from a customer. We convert that e-mail to PDF but sanitize it before doing so.

@mganss
Owner

mganss commented Sep 27, 2023

Sorry, I should have been clearer. By configuration I mean how you initialized the HtmlSanitizer object, which elements you have allowed in AllowedTags, etc.

@Sicos1977
Author

Sicos1977 commented Sep 28, 2023

This is the code: https://github.com/Sicos1977/ChromiumHtmlToPdf/blob/master/ChromiumHtmlToPdfLib/Helpers/DocumentHelper.cs. It starts at line 189, and these are the settings.

Sorry for the Dutch comments.

A minus sign means: first remove everything, then add the rows below the sign.
An asterisk ( * ) means: use the default settings, then add the lines after it to those defaults.

image

@mganss
Owner

mganss commented Sep 28, 2023

I can't reproduce this. AFAICT you are using HtmlSanitizer in the default configuration (default allowed tags, attributes, etc.). In that configuration, the script tag is disallowed and should be removed, including its content.
Can you provide a minimal example that shows the issue?

@Sicos1977
Author

Sorry for the late response. I got sidetracked by other things, so I have to look into this again.
