Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add initial support for license expression (PEP 639) #4706

Merged
merged 7 commits into from
Feb 17, 2025

Conversation

cdce8p
Copy link
Contributor

@cdce8p cdce8p commented Oct 28, 2024

Summary of changes

 [project]
 name = "spam"
 ...
-license = {text = "MIT"}
+license = "MIT"

If present, the normalized license expression is written to the License field until metadata version 2.4 is supported.

Refs: #4629

@cdce8p cdce8p force-pushed the license-expression branch from 4c8ccfb to 1237a7d Compare October 28, 2024 05:17
@cdce8p cdce8p marked this pull request as draft October 28, 2024 05:58
@cdce8p cdce8p force-pushed the license-expression branch from 1237a7d to 93d0c67 Compare October 28, 2024 12:35
@cdce8p cdce8p marked this pull request as ready for review October 28, 2024 12:58
@abravalheri
Copy link
Contributor

Hi @cdce8p thank you very much for having a look on this.

We probably have to organise so that PEP 639 lands after PEP 685 implementation (ideally we want a couple of days/weeks between the two so that users have time to report potential bugs).

@cdce8p cdce8p force-pushed the license-expression branch from 93d0c67 to 390b95f Compare October 31, 2024 05:35
@cdce8p
Copy link
Contributor Author

cdce8p commented Nov 5, 2024

We probably have to organise so that PEP 639 lands after PEP 685 implementation (ideally we want a couple of days/weeks between the two so that users have time to report potential bugs).

Maybe I'm missing something obvious but I'm not quite sure how PEP 685 is related. Would you mind explaining it a bit more?

The way I've designed the PR, it shouldn't depend on anything. It just allows users to specify license = MIT in pyproject.toml which will be used for the License-Expression field. Note, the metadata version is still 2.2 (not 2.4 which is the first to "require" support for PEP 639).

@cdce8p
Copy link
Contributor Author

cdce8p commented Nov 6, 2024

Opened #4734 with just the validate-pyproject update. This will make the changes here easier to review.

@abravalheri
Copy link
Contributor

Maybe I'm missing something obvious but I'm not quite sure how PEP 685 is related. Would you mind explaining it a bit more?

The problem is that the metadata version of the core metadata is incremental:

  • PEP 685 stablishes 2.2
  • PEP 639 stablishes 2.3

So we cannot implement PEP 639 before 685, otherwise metadata validation may fail.

Even if the PR is independent we cannot merge it now because of the nature of the releases of setuptools (we cannot risk PEP 639 support being released before PEP 685 support).

@cdce8p
Copy link
Contributor Author

cdce8p commented Nov 11, 2024

So we cannot implement PEP 639 before 685, otherwise metadata validation may fail.

The validation will only happen once we update the metadata version.
https://github.com/pypa/packaging/blob/24.2/src/packaging/metadata.py#L826-L831

If not, a lot of wheel uploads would probably fail as we currently include License-File metadata that's not up to spec. #3596 (Fix: #4728).

What I'm proposing here is simply to add the basic support to setuptools. It won't get validated by PyPI until we're ready to move to 2.4 but then all projects using the "new" syntax for pyproject.license will already be ready to go.

@abravalheri
Copy link
Contributor

For sure the existence of the License-File in the setuptools-produced metadata with version 2.1 is violating the specs. However, this is a well known violation that the Python ecosystem has grown acquainted to since the early prototype implementation of PEP 639, and for better or worse, learned how to deal with.

By accepting only some aspects of PEP 639, we add other kinds of spec violations that some tools may not be prepared to deal with.

So for the sake of reducing the unknown risks, I propose we pipeline the changes you kindly contributed after the implementation of 685. Unfortunately, it means that people cannot start modifying their pyproject.toml immediately, but it feels safer.

@abravalheri
Copy link
Contributor

abravalheri commented Nov 11, 2024

Sorry, I have been using the wrong PEP number, sorry for the confusion.

PEP 685 is important, but the really big one for us to implement before 639 is PEP 643 (the one with Dynamic).

@cdce8p
Copy link
Contributor Author

cdce8p commented Nov 12, 2024

So for the sake of reducing the unknown risks, I propose we pipeline the changes you kindly contributed after the implementation of 685. Unfortunately, it means that people cannot start modifying their pyproject.toml immediately, but it feels safer.

I understand your reasoning. One last suggestion. The last days I've been looking into how hatchling does it as I noticed it was one of the earliest adopters of License-Expression in the metadata. Turns out the packaging update caused upload failures pypa/hatch#1786. I can't reproduce it with PyPI at the moment. However, packaging==24.2 rejects any License-Expression and License-File keys with metadata <2.4. That would be relevant for us too. After the issue report, they removed the PEP 639 keys from the metadata and now include the project.license as part of the License key.

My suggestion

What do you think?

@cdce8p cdce8p force-pushed the license-expression branch from 390b95f to 97e8f94 Compare November 25, 2024 00:29
@cdce8p cdce8p changed the title Add initial support for License-Expression (PEP 639) Add initial support for license expression (PEP 639) Nov 25, 2024
@cdce8p
Copy link
Contributor Author

cdce8p commented Nov 25, 2024

@abravalheri With the last updates, I removed the License-Expression field and instead back-populate the License field with the validated license expression. That's the approach currently taken by hatchling. Since it doesn't add any new fields, would it be ok to do now?

if "file" in val:
if isinstance(val, str):
_set_config(dist, "license_expression", _static.Str(val))
elif "file" in val:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cdce8p I updated this part of the code to solve the merge conflict. I hope that is OK.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Feel free to ping me, especially if one of my PRs need attention. I'm usually fairly quick to respond.

Small side note. If I review PRs from other people, I only update them with merge commits. The rebase one makes it unnecessarily difficult for the OP to follow and understand what's changed on their own PR. I myself only do force pushes if my PR hasn't been reviewed yet, afterwards only merge commits.

If that's ok with you, I'll force push the rebase once #4796 is done. Just so I'm aware what has changed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that's ok with you, I'll force push the rebase once #4796 is done. Just so I'm aware what has changed.

Thanks @cdce8p. That would be helpful.
I usually try to avoid merging the full main branch back to a PR, because it messes up the graph visualisation (which becomes difficult to follow), that is why I went with a rebase, sorry if that disrupted your workflow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I usually try to avoid merging the full main branch back to a PR, because it messes up the graph visualisation (which becomes difficult to follow), that is why I went with a rebase, sorry if that disrupted your workflow.

Don't worry about it. Guess everyone has their own preferred workflow. Just wrote that as it took me a bit to actually see what you've changed during the rebase.

Thanks @cdce8p. That would be helpful.

👍🏻

pyproject.toml Outdated
@@ -11,10 +11,10 @@ authors = [
]
description = "Easily download, build, install, upgrade, and uninstall Python packages"
readme = "README.rst"
license = "MIT"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit complicated.

Setuptools has many vendored dependencies (which may vary with time), so the correct license expression is not that simple. Additionally, I don't know if it is going to be trivial to derive a correct expression automatically from the code base at each release (e.g. what happens if a maintainer change a vendored dependency? how can we guarantee the license expression is up-to-date?)...

Would it make more sense to simply omit the license expression and rely on license-files or an automatic scan of the relevant directories?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me omit it for now. I mainly used MIT here as that matches the existing classifier License :: OSI Approved :: MIT License but I'm aware of the "vendor" dilemma. This is something which can be discussed at a later point.

IMNAL: During the discussion on the Python packaging docs I made the argument that vendoring, if done correctly, usually doesn't change the project license as long as all vendored dependencies are included and licenses correctly.

pypa/packaging.python.org#1662 (comment)
https://discuss.python.org/t/pep-639-improving-license-clarity-with-better-package-metadata/2154/39

@@ -88,6 +88,7 @@ def read_pkg_file(self, file):
self.url = _read_field_from_msg(msg, 'home-page')
self.download_url = _read_field_from_msg(msg, 'download-url')
self.license = _read_field_unescaped_from_msg(msg, 'license')
self.license_expression = _read_field_unescaped_from_msg(msg, 'license_expression')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this use an _ (underscore) or - (dash) for the message field string?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it should be -. Will update it.

_set_config(dist, "license", _static.Str(value))
dist._referenced_files.add(val["file"])
else:
_set_config(dist, "license", _static.Str(val["text"]))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to flatten this nested else{if, else} into a elif,else as in the suggested change:

image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this as well. However I decided to use to nesting to clearly separate the before / after PEP 639 cases. In a flattened list it just looks like they are equal which isn't the case.

At some point, it might also make sense to add a DeprecationWarning for the legacy case which will be easier with the nested style.

__doc__ = _DESC
_URL = "https://peps.python.org/pep-0735/"


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am supposing the reason for these files to be changed is that this PR is cascaded on top of #4734. So ideally we should merge that other one first and possibly rebase. I have changed the PR to target the feature/pep639 branch so that we can do that without risking an intermediary/incomplete release.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am supposing the reason for these files to be changed is that this PR is cascaded on top of #4734.

Correct. We need abravalheri/validate-pyproject#210 and abravalheri/validate-pyproject#217 to be able to parse and validate project.license correctly. They are only available in v0.23.0.

So ideally we should merge that other one first and possibly rebase.

I created the other PR as I thought it might be easier to review. "Just a simple vendored dependency bump". It's identical to the first commit here. (A rebase will resolve it).

I have changed the PR to target the feature/pep639 branch so that we can do that without risking an intermediary/incomplete release.

Not sure that's necessary. Will leave a comment about it one the issue to keep it in a central place: #4759 (comment)

Comment on lines 420 to 428
raise InvalidConfigError(
"License classifier are deprecated in favor of the license expression. "
f"Remove the '{cl}' classifier."
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it is a deprecation, should we give a warning first before evolving to an error? (we can skip the deprecated classifier in the meantime).

Copy link
Contributor Author

@cdce8p cdce8p Feb 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be open to that as well. My reasoning for the error was that the users are updating the metadata anyway. Might as well error then. Deprecations tend to get unnoticed unfortunately, especially with CD pipelines.

Copy link
Contributor

@abravalheri abravalheri Feb 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do something like:

SetuptoolsDeprecationWarning.emit(
    "License classifier are deprecated in favor of the license expression.",
    f"Please remove the {cl!r} classifier.",
    see_url="https://peps.python.org/pep-0639/",
    due_date=(2027, 2, 17),  # Introduced in 2025/02/17
)

and exclude the classifier from the list.

This way the few uses that set PYTHONWARNINGS for the CI can catch and respond in time. Other users will always be upset about the change, but at least we can tell them that we did everything we could to communicate before starting crashing the builds.

@@ -175,7 +176,7 @@ def write_field(key, value):
if attr_val is not None:
write_field(field, attr_val)

license = self.get_license()
license = self.license_expression or self.get_license()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this mean that we need a follow-up PR for the License-Expression, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. This PR was designed so that it itself didn't require a metadata_version bump. That could / should happen at a later point once all pieces are in place. I.e. setuptools has adopted version 2.3, the license files are stored in the correct folder and license expression parsing is handled correctly.

@abravalheri
Copy link
Contributor

Thank you very much @cdce8p. How do you feel about rebasing this on the feature/pep639 branch?

@cdce8p cdce8p force-pushed the license-expression branch from 2b6ae9f to 0d8f1f2 Compare February 17, 2025 13:12
@cdce8p
Copy link
Contributor Author

cdce8p commented Feb 17, 2025

Thank you very much @cdce8p. How do you feel about rebasing this on the feature/pep639 branch?

Done. As expected it just removed the first commit here.

Before the rebase I pushed 0d8f1f2 to change the error to the SetuptoolsDeprecationWarning and remove the license classifier.

Let me know if there is anything else I should address here.

@abravalheri
Copy link
Contributor

Problem with cygwin in the CI possibly solved by #4832

Comment on lines 417 to 430
license_classifiers_found = False
for cl in self.metadata.get_classifiers():
if not cl.startswith("License :: "):
classifiers.append(cl)
continue
license_classifiers_found = True
SetuptoolsDeprecationWarning.emit(
"License classifier are deprecated in favor of the license expression.",
f"Please remove the '{cl}' classifier.",
see_url="https://peps.python.org/pep-0639/",
due_date=(2027, 2, 17), # Introduced 2025-02-17
)
if license_classifiers_found:
self.metadata.set_classifiers(classifiers)
Copy link
Contributor

@abravalheri abravalheri Feb 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if the user does not use a license expression in pyproject.toml, we are still going to emit license classifiers in the core metadata right? (because the for-loop is nested inside the if license_expr).

Should we *always skip license classifiers (with the warning) even if the project does not explicitly use a license expression (e.g. instead it relies on the license files)?

(It might also be the case your intention is to introduce a more timid version of the skip/warning in this PR, and later un-nest it and make it more prominent in a follow up. Please let me know if that is the case).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tbh I don't think setuptools should modify the classifiers in the first place. We should either emit an error or a warning to the user to do so themselves. Everything else would just be surprising in the end. Pushed a new commit to revert that part.

(It might also be the case your intention is to introduce a more timid version of the skip/warning in this PR, and later un-nest it and make it more prominent in a follow up. Please let me know if that is the case).

At some point it might make sense to add another warning to nudge users to use license expressions. However, I do not plan to add it at this time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants