Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PrettyPrinter(minimizeEmpty = true) injects a new line into elements with attributes #231

Closed
hosamaly opened this issue May 29, 2018 · 6 comments

Comments

@hosamaly
Copy link
Contributor

hosamaly commented May 29, 2018

Using scala-xml v1.1.0, pretty-printing a leaf node with attributes when minimizeEmpty is set to true injects a new line into the element:

scala> new scala.xml.PrettyPrinter(width = 4, step = 2, minimizeEmpty = true).format(<a b="c"/>)
res0: String =
"<a
b="c">
</a>
"

scala> scala.xml.XML.loadString(res0).child.map(_.getClass.getName)
res2: scala.collection.immutable.Seq[String] = List(scala.xml.Text)

This behaviour renders the XML invalid if the schema dictates a non-mixed element.

@ashawley
Copy link
Member

Hosam,

Thanks for reporting this with an example.

Indeed, it looks like the method PrettyPrinter.traverse will prioritize the width limit regardless of the empty element setting, and break up the element to try to fit the width.

Perhaps, there could be a check of minimizeMode in the conditional on line 160 of PrettyPrinter.scala?

@hosamaly
Copy link
Contributor Author

hosamaly commented May 29, 2018

Thanks, Aaron. I think that the conditional on line 160 may cause a different side effect. Specifically, I'm worried that an element with a long label and short attributes wouldn't be broken into multiple lines.

To avoid this, I made a different change in PR #233 that should keep the behaviour of breaking the line in these cases.

@ashawley
Copy link
Member

Ok, fair enough. Indeed, enforcing the width on attributes is a requirement.

Did you see the nearbytodo comment? Seems it is trying to do some logic related to breaking spaces? Maybe it could handle attributes? I fear it is probably naive, and not useful to us, though.

Thanks for the PR and including some minimal tests.

@hosamaly
Copy link
Contributor Author

Yes, I saw the todo. It's been there for ~14 years. It isn't clear to me why it's (still) commented out, so I ignored it.

@ashawley
Copy link
Member

Why reviewing the change, it was suggested that PrettyPrinter.format should preserve the XML structure. While that is true for this defect, related to empty elements getting both a closing element and a text newline inserted, but I predict this rule will not scale generally for all outputted XML, like when there are a lot of nested tags.

@hosamaly
Copy link
Contributor Author

By definition, pretty-printing means injecting whitespaces around XML tags, changing the structure of the document.

However, empty tags (i.e. ones without child nodes) are a special case. They should remain empty because the XML schema may not allow them to contain text. As such, the safest way to pretty-print them is to always make them self-closing. They can still be formatted, and the closing slash (/>) can be on a line by its own, meeting both requirements.

ashawley added a commit that referenced this issue Jun 25, 2018
Fix #231: Minimize empty elements with lengthy attributes
@ashawley ashawley mentioned this issue Aug 30, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants