Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Several fixes to WHEEL metadata handling #588

Merged
merged 6 commits into from
Nov 25, 2023
Merged

Several fixes to WHEEL metadata handling #588

merged 6 commits into from
Nov 25, 2023

Conversation

bgilbert
Copy link
Contributor

The PyPA Binary Distribution Format spec says the WHEEL metadata file is "in the same basic key: value format" as the METADATA file. METADATA in turn is governed by the Core Metadata Specifications, which say:

In the absence of a precise definition, the practical standard is set by what the standard library email.parser module can parse using the compat32 policy.

wheel.bdist_wheel accordingly uses email.generator.BytesGenerator to generate WHEEL, but the CLI pack and tags commands opt to hand-implement WHEEL parsing and mutation. Their mutation functions, set_tags() and set_build_number(), append any new headers to the existing WHEEL file. Since WHEEL tends to have a trailing blank line (separating the headers from the nonexistent body), the new headers end up in the "body" and are ignored by any tool that parses WHEEL with email.parser.

In addition, both functions assume that WHEEL uses CRLF line endings, while bdist_wheel (and email.generator with the compat32 policy) specifically always writes LF line endings. This turns out okay for set_tags(), which rewrites the whole file, but set_build_number() ends up appending a CRLF-terminated line to a file with LF-terminated lines.

Fix all this by removing the hand-written parsing/mutation functions in favor of email.parser and email.generator.

In addition, fix removing the build tag with wheel pack --build-number "", and add support for wheel tags --build "" for symmetry. There's an existing test for the wheel pack case, which verifies that the original build tag is removed from the wheel filename and is not removed from the WHEEL metadata. Presumably that isn't what was intended.

Finally, remove dead code for postprocessing WHEEL line endings in bdist_wheel.

The default policies of Message and BytesGenerator always produce LF
line endings, so there's never a CRLF to match.  If there were, the
existing code would have incorrectly replaced CRLF with CR, and then
test_wheelfile_line_endings() would have failed.
Copy link

codecov bot commented Nov 25, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (fe6bb82) 72.66% compared to head (2c0a583) 72.17%.
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #588      +/-   ##
==========================================
- Coverage   72.66%   72.17%   -0.49%     
==========================================
  Files          13       13              
  Lines        1090     1071      -19     
==========================================
- Hits          792      773      -19     
  Misses        298      298              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@agronholm
Copy link
Contributor

Could you please add a note to the changelog (docs/news.rst) too?

There's an existing test for this case, which verifies that the original
build tag is removed from the wheel filename and is *not* removed from
the WHEEL metadata.  Presumably this is not what was intended.

BUILD_NUM_RE assumed that it would only match at end of file, but there's
at least a newline in the way.  Fix it, and the test.
For symmetry with `wheel pack`.  The functionality works already; we just
need to fix the command-line validation.
We're about to change pack() to emit WHEEL metadata with LF-terminated
lines and a trailing blank line.  In some test_pack() parameterizations
we expect to see the original WHEEL file from the test fixture, and in
some, a modified one.  Handle the difference by comparing WHEEL contents
parsed with email.parser, instead of byte strings.
The PyPA Binary Distribution Format spec says the WHEEL metadata file is
"in the same basic key: value format" as the METADATA file.  METADATA in
turn is governed by the Core Metadata Specifications, which say: "In the
absence of a precise definition, the practical standard is set by what the
standard library email.parser module can parse using the compat32 policy."

wheel.bdist_wheel accordingly uses email.generator.BytesGenerator to
generate WHEEL, but the CLI `pack` and `tags` commands opt to
hand-implement WHEEL parsing and mutation.  Their mutation functions,
set_tags() and set_build_number(), append any new headers to the existing
WHEEL file.  Since WHEEL tends to have a trailing blank line (separating
the headers from the nonexistent body), the new headers end up in the
"body" and are ignored by any tool that parses WHEEL with email.parser.

In addition, both functions assume that WHEEL uses CRLF line endings,
while bdist_wheel (and email.generator with the compat32 policy)
specifically always writes LF line endings.  This turns out okay for
set_tags(), which rewrites the whole file, but set_build_number() ends up
appending a CRLF-terminated line to a file with LF-terminated lines.

Fix all this by removing the hand-written parsing/mutation functions in
favor of email.parser and email.generator.
@bgilbert
Copy link
Contributor Author

Sure, done.

@agronholm agronholm merged commit f4b8e48 into pypa:main Nov 25, 2023
20 checks passed
@agronholm
Copy link
Contributor

Thanks!

@bgilbert bgilbert deleted the email branch November 25, 2023 21:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants