Support enriched metadata #686

Merged
merged 55 commits into pypa:main on Aug 18, 2023

brettcannon
Member

Take raw metadata and wrap data as appropriate with more enriched representations (e.g., requirements.Requirement for Requires-Dist).

Part of #570
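
The general idea of wrapping raw string values in richer types can be sketched without the library itself. The class below is a hypothetical stand-in (`EnrichedSketch`, its field names, and its conversions are assumptions for illustration, not the API added by this PR): raw values stay as strings, and typed views are computed on access.

```python
from typing import Dict, List, Tuple


class EnrichedSketch:
    """Hypothetical wrapper: keeps raw strings, exposes typed views of them."""

    def __init__(self, raw: Dict[str, str]) -> None:
        # Copy so later changes to the caller's dict do not leak in.
        self._raw = dict(raw)

    @property
    def metadata_version(self) -> Tuple[int, int]:
        # "2.1" -> (2, 1); raises ValueError for malformed input.
        major, minor = self._raw["metadata_version"].split(".")
        return int(major), int(minor)

    @property
    def keywords(self) -> List[str]:
        # "a, b" -> ["a", "b"]; a missing field yields an empty list.
        raw_value = self._raw.get("keywords", "")
        return [item.strip() for item in raw_value.split(",") if item.strip()]
```

The raw mapping remains available for callers that want the original text, while the properties give callers something they can compare or iterate over directly.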

@brettcannon brettcannon requested a review from dstufft April 4, 2023 01:05
@brettcannon brettcannon marked this pull request as ready for review August 6, 2023 23:43
@brettcannon
Member Author

Docs are in! pypa/packaging.python.org#1283 will let me go back in and add links to the core metadata fields that I couldn't link to.

All of CI is now passing, so does anyone want to give this a once-over?

Member

@pradyunsg pradyunsg left a comment

Did a quick skim, LGTM!

@brettcannon
Member Author

@pradyunsg thanks for the skim!

I'll give @dstufft some time to look it over before I merge.

@brettcannon
Member Author

What do we want to do about core metadata 1.1 and its fields that only existed in that version (i.e., requires, provides, and obsoletes)? Right now the fields are considered a validation failure since there just isn't a concept of those fields in the class. Should I just assign that data to the roughly corresponding core metadata 1.2 fields? Should I just not support core metadata 1.0 and 1.1 (or just 1.1)?

@dstufft
Member

dstufft commented Aug 15, 2023

I don't think we've ever removed a field? The PEP for Metadata 1.2 or 2.1 never said that those fields were removed, just deprecated.

@dstufft
Member

dstufft commented Aug 15, 2023

It looks like pypa/packaging.python.org#386 just missed documenting those fields, because PEP 345 doesn't document them (other than as a footnote to say they're deprecated), and AFAIK most or all existing implementations of metadata reading that attempt to be comprehensive (so not things trying to get a single field or something) support them still.

Member

@dstufft dstufft left a comment

This approach looks good to me, I think it covers all the bases/invariants that I was trying to handle in my original PR, looks great!

@brettcannon
Member Author

brettcannon commented Aug 15, 2023

> I don't think we've ever removed a field? The PEP for Metadata 1.2 or 2.1 never said that those fields were removed, just deprecated.

Ah, that makes sense. I have been following the spec as "the spec", and so figured the fields were gone. I will add the missing ones in as str-only fields and that should take care of things! I'll also open an issue on packaging.python.org about the missing fields.

@brettcannon
Member Author

Actually, Donald reported the doc bug in pypa/packaging.python.org#1107 and there's a PR at pypa/packaging.python.org#1138 waiting on Donald's review. 😅

@dstufft
Member

dstufft commented Aug 15, 2023

> Donald reported the doc bug in pypa/packaging.python.org#1107

Welp.

@brettcannon
Member Author

I wrote a script that tried to pull a METADATA file from every project on PyPI. The issues that I saw, in order of most to least frequent, were:

  1. Using core metadata version 2.0
  2. Improper description content type
  3. Invalid extras name (people definitely want a name to specify development-only requirements)
  4. Multi-line summaries
  5. Repeated fields (in one case because they messed up their Description formatting and made it a blank line and then followed with the whole description in the body)

I need to make the error more descriptive about what triggered it when the name is invalid (since it reuses utils.canonicalize_name(), it has a generic exception message), but otherwise I think that's the only change left!
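
A few of the issues listed above can be detected with the standard library alone. The following is a minimal sketch, not the script used for the PyPI survey: `find_issues` and the `MULTIPLE_USE` set are hypothetical names, and the set is an assumed, partial list of fields that may legitimately repeat.

```python
from email import message_from_string
from typing import Dict, List

# Assumed, partial list of fields that are allowed to appear more than once.
MULTIPLE_USE = {
    "classifier",
    "platform",
    "project-url",
    "provides-extra",
    "requires-dist",
    "requires-external",
}


def find_issues(metadata_text: str) -> List[str]:
    """Report a few common problems in the text of a METADATA file."""
    msg = message_from_string(metadata_text)
    issues: List[str] = []

    # Count how often each header name occurs (header names are case-insensitive).
    counts: Dict[str, int] = {}
    for name, _value in msg.items():
        key = name.lower()
        counts[key] = counts.get(key, 0) + 1
    for key, count in sorted(counts.items()):
        if count > 1 and key not in MULTIPLE_USE:
            issues.append(f"repeated field: {key}")

    # The default parser keeps the line breaks of a folded header in its value.
    summary = msg.get("Summary")
    if summary is not None and "\n" in summary:
        issues.append("multi-line summary")

    if msg.get("Metadata-Version") == "2.0":
        issues.append("core metadata version 2.0")

    return issues
```

Checks for the description content type and extras names are left out because they need the rules from the core metadata specification rather than simple string tests.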

@brettcannon
Member Author

OK, I updated the appropriate code so that all failed field validations raise InvalidMetadata (with appropriate __cause__ values). That way you only have to catch a single exception type instead of having to know exactly what code was used to perform the validation or transformation of the data.

And with that, I think this PR is ready! I'm waiting for CI to finish, so hopefully I can merge this tomorrow.
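
The single-exception pattern described in this comment can be sketched with plain Python exception chaining. Everything here is a stand-in for illustration: `InvalidMetadataSketch`, `validate_field`, and the two validators are hypothetical and are not the library's implementation.

```python
from typing import Callable, Dict


class InvalidMetadataSketch(ValueError):
    """Hypothetical stand-in for one catch-all validation error."""

    def __init__(self, field: str, message: str) -> None:
        self.field = field
        super().__init__(message)


def _check_version(value: str) -> str:
    # Raises ValueError if there are not exactly two numeric parts.
    major, minor = value.split(".")
    int(major), int(minor)
    return value


_CONTENT_TYPES = {"text/plain": True, "text/x-rst": True, "text/markdown": True}


def _check_content_type(value: str) -> str:
    # Raises KeyError for an unknown content type.
    _CONTENT_TYPES[value]
    return value


_VALIDATORS: Dict[str, Callable[[str], str]] = {
    "metadata_version": _check_version,
    "description_content_type": _check_content_type,
}


def validate_field(field: str, value: str) -> str:
    """Run the validator for a field, re-raising any failure as one type."""
    validator = _VALIDATORS[field]  # looked up outside the try on purpose
    try:
        return validator(value)
    except (ValueError, KeyError) as exc:
        # "from exc" stores the original error on __cause__ for debugging.
        message = f"{value!r} is invalid for {field!r}"
        raise InvalidMetadataSketch(field, message) from exc
```

A caller catches only `InvalidMetadataSketch` and can still inspect `__cause__` to see which underlying check failed.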

@brettcannon brettcannon merged commit 61e6efb into pypa:main Aug 18, 2023
31 checks passed
@brettcannon brettcannon deleted the enriched-metadata branch August 18, 2023 21:24
kodiakhq bot pushed a commit to cloudquery/plugin-sdk-python that referenced this pull request Nov 1, 2023
This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [packaging](https://togithub.com/pypa/packaging) | minor | `==23.1` -> `==23.2` |

---

### Release Notes

<details>
<summary>pypa/packaging (packaging)</summary>

### [`v23.2`](https://togithub.com/pypa/packaging/releases/tag/23.2)

[Compare Source](https://togithub.com/pypa/packaging/compare/23.1...23.2)

#### What's Changed

-   parse_marker should consume the entire source string by [@mwerschy](https://togithub.com/mwerschy) in pypa/packaging#687
-   Create a Security Policy file by [@joycebrum](https://togithub.com/joycebrum) in pypa/packaging#695
-   Add python 3.12 to CI by [@mayeut](https://togithub.com/mayeut) in pypa/packaging#689
-   Remove URL validation from requirement parsing by [@uranusjr](https://togithub.com/uranusjr) in pypa/packaging#684
-   Add types for packaging.version.\_Version by [@hauntsaninja](https://togithub.com/hauntsaninja) in pypa/packaging#665
-   Add PyPy 3.10 to CI by [@mayeut](https://togithub.com/mayeut) in pypa/packaging#699
-   Remove unused argument in `_manylinux._is_compatible` by [@mayeut](https://togithub.com/mayeut) in pypa/packaging#700
-   Canonicalize names for requirements comparison by [@astrojuanlu](https://togithub.com/astrojuanlu) in pypa/packaging#696
-   Add platform tag support for LoongArch by [@loongson-zn](https://togithub.com/loongson-zn) in pypa/packaging#693
-   Ability to install `armv7l manylinux/musllinux` wheels on `armv8l` by [@mayeut](https://togithub.com/mayeut) in pypa/packaging#690
-   Include CHANGELOG.rst in sdist by [@astrojuanlu](https://togithub.com/astrojuanlu) in pypa/packaging#704
-   Update pyupgrade to Python 3.7+ by [@fangchenli](https://togithub.com/fangchenli) in pypa/packaging#580
-   Fix version pattern pre-releases by [@deathaxe](https://togithub.com/deathaxe) in pypa/packaging#705
-   Fix typos found by codespell by [@DimitriPapadopoulos](https://togithub.com/DimitriPapadopoulos) in pypa/packaging#706
-   Support enriched metadata by [@brettcannon](https://togithub.com/brettcannon) in pypa/packaging#686
-   Correct rST syntax in CHANGELOG.rst by [@atugushev](https://togithub.com/atugushev) in pypa/packaging#709
-   fix: platform tag for GraalPy by [@mayeut](https://togithub.com/mayeut) in pypa/packaging#711
-   Document that this library uses a calendar-based versioning scheme by [@faph](https://togithub.com/faph) in pypa/packaging#717
-   fix: Update copyright date for docs by [@garrypolley](https://togithub.com/garrypolley) in pypa/packaging#713
-   Bump pip version to avoid known vulnerabilities by [@joycebrum](https://togithub.com/joycebrum) in pypa/packaging#720
-   Typing annotations fixed in version.py by [@jolaf](https://togithub.com/jolaf) in pypa/packaging#723
-   parse\_{sdist,wheel}\_filename: don't raise InvalidVersion by [@SpecLad](https://togithub.com/SpecLad) in pypa/packaging#721
-   Fix code blocks in CHANGELOG.md by [@edmorley](https://togithub.com/edmorley) in pypa/packaging#724

#### New Contributors

-   [@mwerschy](https://togithub.com/mwerschy) made their first contribution in pypa/packaging#687
-   [@joycebrum](https://togithub.com/joycebrum) made their first contribution in pypa/packaging#695
-   [@astrojuanlu](https://togithub.com/astrojuanlu) made their first contribution in pypa/packaging#696
-   [@loongson-zn](https://togithub.com/loongson-zn) made their first contribution in pypa/packaging#693
-   [@fangchenli](https://togithub.com/fangchenli) made their first contribution in pypa/packaging#580
-   [@deathaxe](https://togithub.com/deathaxe) made their first contribution in pypa/packaging#705
-   [@DimitriPapadopoulos](https://togithub.com/DimitriPapadopoulos) made their first contribution in pypa/packaging#706
-   [@atugushev](https://togithub.com/atugushev) made their first contribution in pypa/packaging#709
-   [@faph](https://togithub.com/faph) made their first contribution in pypa/packaging#717
-   [@garrypolley](https://togithub.com/garrypolley) made their first contribution in pypa/packaging#713
-   [@jolaf](https://togithub.com/jolaf) made their first contribution in pypa/packaging#723
-   [@SpecLad](https://togithub.com/SpecLad) made their first contribution in pypa/packaging#721
-   [@edmorley](https://togithub.com/edmorley) made their first contribution in pypa/packaging#724

**Full Changelog**: pypa/packaging@23.1...23.2

</details>

---

### Configuration

📅 **Schedule**: Branch creation - "before 4am on the first day of the month" (UTC), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://togithub.com/renovatebot/renovate).