Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: improve Composer's output reproducibility #11663

Merged
merged 5 commits into from Sep 28, 2023
Merged

feat: improve Composer's output reproducibility #11663

merged 5 commits into from Sep 28, 2023

Conversation

drupol
Copy link
Contributor

@drupol drupol commented Sep 26, 2023

This PR:

Reproduce this at home, inside the composer source directory:

image

@drupol drupol changed the title feat: enable deterministic autoloader files feat: enable deterministic vendor files Sep 27, 2023
@Seldaek
Copy link
Member

Seldaek commented Sep 27, 2023

Sorry but I am really confused as to what this tries to achieve? :) It seems to touch() all vendor files with a 0 timestamp. What's the point? And why would you need this flag vs doing it manually with find vendor -type f -exec touch {} 0 or smth if it's a special need it sounds ok to me?

Also not sure how it relates to #7049 at all.. But then again there are so many posts talking about different things in that thread, maybe I missed your point.

@drupol
Copy link
Contributor Author

drupol commented Sep 27, 2023

Hi @Seldaek,

I apologize for any confusion. This PR is part of an initiative to achieve reproducible builds with composer. The aim is to ensure that composer can create a vendor/ directory that is bit-for-bit identical across multiple runs of the same command.

By zeroing out the timestamps of all vendor/ files, we eliminate one variable that could make builds non-deterministic. Integrating this feature into composer itself would streamline the process and make it more accessible for users focused on reproducibility (debian, NixOS, Arch, etc etc).

This is totally related to #7049 from @lstrojny since the goal of that PR was precisely what I'm doing here. Ensuring reproducible vendor/ directories aligns with this goal.

For more context on why reproducible builds are important, I'd refer you to this page: https://reproducible-builds.org/docs/buy-in/

I've added a screenshot demonstrating the default non-deterministic behavior of Composer. Running composer install twice resulted in two different directory hashes. However, when using the new --deterministic flag, the directory hashes were identical in multiple runs.

I believe this feature would be a significant step in making PHP applications more secure and predictable through Composer. Additionally, this would greatly benefit Linux distributions that aim to package PHP applications in a secure and predictable manner.

@Seldaek
Copy link
Member

Seldaek commented Sep 27, 2023

Yeah ok, I have already fixed a ton of issues with reproducibility so you don't need to convince me, but this one here seems like the nuclear option to me.

We could perhaps by default copy the filemtime of the composer.lock to the files Composer creates in vendor? That would be more sensible IMO than 0, and the other files in vendor should not need modifying as these have filemtime controlled by their own package source, and zeroing these feels really wrong to me.

On the other hand, we also touch the vendor dir itself after dependencies get modified (https://github.com/composer/composer/blob/main/src/Composer/Installer.php#L801), but this probably could be changed to match the lock file's mtime? I think it would still achieve the original goal of #2764

@drupol
Copy link
Contributor Author

drupol commented Sep 27, 2023

Yes, this is also an option, as long as this is predictable, it's fine.

I'll modify the PR accordingly right now and report back.

@drupol
Copy link
Contributor Author

drupol commented Sep 27, 2023

Just made the modifications, this is all good !

image

@Seldaek
Copy link
Member

Seldaek commented Sep 27, 2023

Ok so could we do this without --deterministic flag? I'd really rather have these things just work out of the box, and avoid special behavior where possible.

Also could the touch be done directly in the AutoloadGenerator instead of twice in Installer & DumpAutoloadCommand?

Regarding the autoload suffix, I think we could always use the lock file hash if there is one, that's probably unique enough. If there is no lock file available then just pass null and have it default to generating one or reusing existing one. You anyway won't have anything deterministic without lock file so it shouldn't matter in that case.

@Seldaek
Copy link
Member

Seldaek commented Sep 27, 2023

I'd really rather have these things just work out of the box, and avoid special behavior where possible.

This is assuming you don't need to carpet bomb the whole vendor dir with touch, because this is IMO unacceptable behavior to modify vendor files.

@drupol
Copy link
Contributor Author

drupol commented Sep 27, 2023

I definitely agree that this should be the default behavior.

I reworked the PR from the beginning and this is now the default behavior:

image

I still need to fix the tests and add new tests.

src/Composer/Installer.php Outdated Show resolved Hide resolved
src/Composer/Installer.php Outdated Show resolved Hide resolved
@Seldaek Seldaek added this to the 2.7 milestone Sep 27, 2023
@drupol drupol changed the title feat: enable deterministic vendor files feat: improve Composer's output reproducibility Sep 27, 2023
@drupol drupol marked this pull request as ready for review September 27, 2023 20:06
@drupol drupol marked this pull request as draft September 27, 2023 20:06
@drupol drupol marked this pull request as ready for review September 27, 2023 20:17
doc/01-basic-usage.md Outdated Show resolved Hide resolved
@Seldaek
Copy link
Member

Seldaek commented Sep 28, 2023

LGTM aside from the two minor comments, thanks for the quick turnaround here. Given the scope change I now think it's safe to move this to 2.6 too.

@Seldaek Seldaek modified the milestones: 2.7, 2.6 Sep 28, 2023
@Seldaek Seldaek merged commit b608b8e into composer:main Sep 28, 2023
20 checks passed
@Seldaek
Copy link
Member

Seldaek commented Sep 28, 2023

Thanks

@drupol
Copy link
Contributor Author

drupol commented Oct 5, 2023

@Seldaek Do you think it would be possible to edit the Github changelog and add a line about this, so I can spread the word about it ?

@Seldaek
Copy link
Member

Seldaek commented Oct 6, 2023

Sure, added to https://github.com/composer/composer/releases/tag/2.6.4 sorry :)

@drupol
Copy link
Contributor Author

drupol commented Oct 12, 2023

Published in the September report of Reproducible Builds initiative! https://reproducible-builds.org/reports/2023-09/

drupol added a commit to NixOS/nixpkgs that referenced this pull request Oct 14, 2023
Since composer/composer#11663, the composer output is stable and reproducible, therefore, that prefix is now obsolete.
fabaff pushed a commit to fabaff/nixpkgs that referenced this pull request Oct 17, 2023
Since composer/composer#11663, the composer output is stable and reproducible, therefore, that prefix is now obsolete.
davidrjenni added a commit to davidrjenni/scip-php that referenced this pull request Oct 28, 2023
Since composer 2.6.4 [0], the autoloader suffix reuses the content-hash
from the lock file in order to make the builds more reproducible [1].
This lead to a conflict when analyzing scip-php itself when using the
Docker container.

[0] https://github.com/composer/composer/releases/tag/2.6.4
[1] composer/composer#11663
davidrjenni added a commit to davidrjenni/scip-php that referenced this pull request Oct 28, 2023
Since composer 2.6.4 [0], the autoloader suffix reuses the content-hash
from the lock file in order to make the builds more reproducible [1].
This lead to a conflict when analyzing scip-php itself when using the
Docker container.

[0] https://github.com/composer/composer/releases/tag/2.6.4
[1] composer/composer#11663
tscni added a commit to tscni/psalm that referenced this pull request Nov 21, 2023
The issue was likely caused by Composer 2.6.4 making the autoloader generation (more) reproducible (composer/composer#11663)

We can either try to change the generated autoloader with an autoloader suffix, or just change the Psalm root directory for the smoke test. The latter approach seems easier. :P
jtojnar added a commit to fossar/selfoss that referenced this pull request Jan 6, 2024
Composer 2.6.4+ wants to have reproducible outputs so it preserves `mtime` when copying files:
composer/composer#11663
This becomes an issue with `vendor/composer/ClassLoader.php`,
which is copied from Nix store so it has `mtime` of 0 (1970-01-01),
because that is not supported by ZIP:

    ValueError: ZIP does not support timestamps before 1980

Let’s disable strict timestamps to clamp the out-of-range timestamps to ZIP epoch.
That argument was actually introduced because of reproducible builds:
https://bugs.python.org/issue34097
jtojnar added a commit to jtojnar/entries that referenced this pull request Feb 17, 2024
Composer 2.6.4+ wants to have reproducible outputs so it preserves `mtime` when copying files:
composer/composer#11663
This becomes an issue with `vendor/composer/ClassLoader.php`,
which is copied from Nix store so it has `mtime` of 0 (1970-01-01),
because that is not supported by ZIP:

    ValueError: ZIP does not support timestamps before 1980

Let’s disable strict timestamps to clamp the out-of-range timestamps to ZIP epoch.
That argument was actually introduced because of reproducible builds:
https://bugs.python.org/issue34097
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants