Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Complete multi-character sanitization #416

Closed
wants to merge 3 commits into from

Conversation

ACN-kck
Copy link

@ACN-kck ACN-kck commented Jun 23, 2022

Go to know by CodeQL in another project, that there is a missing multi-character sanitization.

https://github.com/whelk-io/maven-settings-xml-action/pull/144/files#diff-3d2b59189eeedc2d428ddd632e97658fe310f587f7cb63b01f9b98ffc11c0197R909

image

Verified

This commit was signed with the committer’s verified signature.
KinectTheUnknown David-Joseph Xayavong
@ACN-kck ACN-kck changed the title Complete multi-character sanitization fix: Complete multi-character sanitization Jun 23, 2022
@karfau
Copy link
Member

karfau commented Jun 24, 2022

Thank you for the report and for the PR.
I was able to verify the bug:
when parsing the following XML:
<xml><child></child <injected></xml>
xmldom produces the dom equivalent to
<xml><child /><injected /></xml>
xmldom also reports at least one error in this situation
[xmldom error] end tag name: child maybe not complete, so not trusting any xml that reports errors (by adding a custom error handler, see https://github.com/xmldom/xmldom#api-reference ) is the best workaround currently.

Sadly the PR doesn't fix that (potential security) issue, which is not documented publicly.

@ACN-kck
Copy link
Author

ACN-kck commented Jun 24, 2022

Thank you for the report and for the PR. I was able to verify the bug: when parsing the following XML: <xml><child></child <injected></xml> xmldom produces the dom equivalent to <xml><child /><injected /></xml> xmldom also reports at least one error in this situation [xmldom error] end tag name: child maybe not complete, so not trusting any xml that reports errors (by adding a custom error handler, see https://github.com/xmldom/xmldom#api-reference ) is the best workaround currently.

Sadly the PR doesn't fix that (potential security) issue, which is not documented publicly.

I reviewed my solution again which indeed does not fix the issue. I checked again how this could be resolved by creating one additional test based on your message.

With your given example:
<xml><child></child <injected></xml> should be parsed to
<xml><child/></xml>, correct?

errorHandler.error("end tag name: "+tagName+' is not complete:'+config.tagName);
end = tagStart+1+tagName.length;
}else if(tagName.match(/\s</)){
tagName = tagName.replace(/[\s<].*/,'');
tagName = tagName.replace(/[\s<].*/g,'');
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By calculating the length first, the injection could be prevented.
Snapshot test 'unclosed inner' in simple.test.js will start failing, but while looking into it I'm struggling to understand why the test should work without errors. My expectation would be that it fails with an error.

Suggested change
tagName = tagName.replace(/[\s<].*/g,'');
end = tagStart + 2 + tagName.length;
tagName = tagName.replace(/[\s<].*/g,'');

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And we need to shift the index +1 to catch the closing '>'

@karfau
Copy link
Member

karfau commented Jun 24, 2022

Yes, I agree that this support for automatically closing unclosed tags is the culprit.
So the fix will be a breaking change in the semver sense.

ACN-kck added 2 commits June 24, 2022 22:41

Verified

This commit was signed with the committer’s verified signature.
KinectTheUnknown David-Joseph Xayavong

Verified

This commit was signed with the committer’s verified signature.
KinectTheUnknown David-Joseph Xayavong
@ACN-kck
Copy link
Author

ACN-kck commented Jun 24, 2022

Yes, I agree that this support for automatically closing unclosed tags is the culprit. So the fix will be a breaking change in the semver sense.

I'm not having the full view on the lib in regards to this breaking change.
Is there something I should have a closed look?

@karfau
Copy link
Member

karfau commented Oct 8, 2022

Sorry for not replying for such a long time.
I will get back to looking at this PR starting from November.

@karfau karfau self-assigned this Oct 8, 2022
@karfau karfau added needs investigation Information is missing and needs to be researched xml:not well-formed https://www.w3.org/TR/xml11/#dt-wellformed breaking change Some thing that requires a version bump due to breaking changes labels Oct 8, 2022
@karfau karfau added this to the next breaking/minor release milestone Oct 8, 2022
@mgerlach
Copy link

Would be good to have this released soon for whelk-io/maven-settings-xml-action#144

@karfau
Copy link
Member

karfau commented Oct 11, 2022

@mgerlach @ACN-kck what is the reason for not using the latest version in the linked PR?

The reason why I'm asking this is that any fx we can come up with, would also be a breaking change, so you will have to update to at least 0.9.* version.
(And currently the PR doesn't fix the issue, so it would be on me picking up the work.) Sorry, for some reason I didn't notice the commits that have been added after the conversation.

If the effort is not to high it would be awesome to have that CodeQL check as part of the xmldom pipeline.
Are you able to help me setting this up?
This could be done in a separate PR, in which case it would contribute to hacktoberfest, in case that is relevant for anybody.
And I'm going to look into the option of including it into the 0.9.0 release (I recently published a pre-release 0.9.0-beta.1 in preparation of that).

@karfau
Copy link
Member

karfau commented Oct 11, 2022

I have tried to understand the current state of the PR, but I'm not able to relate the code suggestion in the comment to the code that was already commited.

I'm also not able to push updates to your branch, I guess you didn't check the box "allow edits by maintainers"?

So it looks like I have to replicate you changes in a local branch to better understand them.

@mgerlach
Copy link

mgerlach commented Oct 11, 2022

Hi @karfau I only jumped onto this because GitHub Actions is complaining about https://github.com/whelk-io/maven-settings-xml-action still using Node.js 12. whelk-io/maven-settings-xml-action#144 would fix that (among other things). I don't know why the PR has xmldom-0.7.0 rather than 0.8.3... @ACN-kck or @zteater, can you please help @karfau?

karfau added a commit that referenced this pull request Oct 18, 2022

Verified

This commit was signed with the committer’s verified signature.
KinectTheUnknown David-Joseph Xayavong
Previously it was tested as a feature, that tags after incomplete closing tags would be part of the dom.
The code responsible for this triggered a CodeQL error suspecting that it could lead to injection of a `script` tag.

When parsing HTML there no longer an error being reported to align the behavior with the one in the browser.

BREAKING CHANGE: If your code relied on not well formed XML to be parsed and include subsequent tags, this will no longer work.

#416
karfau added a commit that referenced this pull request Oct 18, 2022

Verified

This commit was signed with the committer’s verified signature.
KinectTheUnknown David-Joseph Xayavong
Previously it was tested as a feature, that tags after incomplete
closing tags would be part of the dom. The code responsible for this
triggered a CodeQL error suspecting that it could lead to injection of a
`script` tag.

BREAKING CHANGE: It no longer reports an error when parsing HTML
containing incomplete closing tags, to align the behavior with the one
in the browser.

BREAKING CHANGE: If your code relied on not well formed XML to be parsed
and include subsequent tags, this will no longer work.

#416
@karfau karfau mentioned this pull request Oct 18, 2022
karfau added a commit that referenced this pull request Oct 18, 2022

Verified

This commit was signed with the committer’s verified signature.
KinectTheUnknown David-Joseph Xayavong
@karfau
Copy link
Member

karfau commented Oct 18, 2022

@ACN-kck @mgerlach I fixed this in a more robust way in #445, which will be part of the 0.9.0 release.
I was also able to configure CodeQL checks to the repo as part of #444

Thank you for your contributions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking change Some thing that requires a version bump due to breaking changes needs investigation Information is missing and needs to be researched xml:not well-formed https://www.w3.org/TR/xml11/#dt-wellformed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants