Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(plugin-legacy): improve deterministic polyfills discovery #16566

Merged
merged 3 commits into from
May 29, 2024

Conversation

shankerwangmiao
Copy link
Contributor

Description

The polyfill bundle generated by the legacy plugin is non deterministic, because the order of calling renderChunk() affects the input of the finally generated polyfill bundle.

This PR solves this issue by sorting the polyfills in need discovered by babel. The polyfills added directly from the plugin configuration is kept ordered the first, same as before.

Previously, #13672 (comment) mentioned this issue.

Copy link

stackblitz bot commented Apr 30, 2024

Review PR in StackBlitz Codeflow Run & review this pull request in StackBlitz Codeflow.

Copy link
Member

@bluwy bluwy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we sort (and convert to array for) the polyfills before these two spots instead?

isDebug &&
console.log(
`[@vitejs/plugin-legacy] modern polyfills:`,
modernPolyfills,
)

isDebug &&
console.log(
`[@vitejs/plugin-legacy] legacy polyfills:`,
legacyPolyfills,
)

We can then pass the sorted array to buildPolyfillChunk() below them and it should be deterministic?

@bluwy bluwy added plugin: legacy p2-nice-to-have Not breaking anything but nice to have (priority) labels May 1, 2024
@shankerwangmiao
Copy link
Contributor Author

Could we sort (and convert to array for) the polyfills before these two spots instead?

isDebug &&
console.log(
`[@vitejs/plugin-legacy] modern polyfills:`,
modernPolyfills,
)

isDebug &&
console.log(
`[@vitejs/plugin-legacy] legacy polyfills:`,
legacyPolyfills,
)

We can then pass the sorted array to buildPolyfillChunk() below them and it should be deterministic?

Hi, thanks for your review. It could be simpler to directly sort the array here. However, I think the ordering of polyfill groups should be kept as before. Originally, the corejs polyfills specified in options.{modern,legacy}Polyfills comes the first, followed by those specified in additional{Modern,Legacy}Polyfills, and those discovered by babel. If we directly sort the combined polyfills list, this ordering will be broken, and some projects may rely on this to work properly. This proposed change keeps this ordering, and only sort the polyfills generated by babel.

@shankerwangmiao
Copy link
Contributor Author

Also some issues with the CI. I found that the CI sometimes fails on tests not related with legacy plugin.

See:

They should be running on the same changes, but I have no idea why some of the tests failed

@bluwy
Copy link
Member

bluwy commented May 2, 2024

Yeah I'd ignore the CI fails that aren't related to plugin-legacy. It's flaky lately.

If we directly sort the combined polyfills list, this ordering will be broken, and some projects may rely on this to work properly.

I did some digging and found babel/babel-polyfills#192. So it seems like some order matters and presumably babel-preset-env will respect that when it returns. But in your PR, you're sorting the list too, which is incorrect?

In which case, the current behaviour is safer wrt ordering. Otherwise it's better to sort the entire array based based on core-js-compat/modules.json instead?

@shankerwangmiao
Copy link
Contributor Author

Ah, thanks for pointing it out. I'll see how to make the polyfill list stable and following the correct ordering requirement.

@shankerwangmiao shankerwangmiao force-pushed the fix-legacy-polyfill-deterministic branch from 2b44e8a to 3f53346 Compare May 2, 2024 15:56
@shankerwangmiao
Copy link
Contributor Author

I did some digging and found babel/babel-polyfills#192. So it seems like some order matters and presumably babel-preset-env will respect that when it returns. But in your PR, you're sorting the list too, which is incorrect?

In which case, the current behaviour is safer wrt ordering. Otherwise it's better to sort the entire array based based on core-js-compat/modules.json instead?

Hi, I believe sorting according to core-js-compat/modules.json is definitely correct. However, by doing that, we are skipping the abstract layer provided by babel/preset-env, so I don't think it will be a clean solution.

My idea is to record all the generated polyfills and their ordering in each chunk and generate a combined list of polyfills by doing a topological sort, and generating a lexicographically minimum one to make the result stable.

However, by implementing it, as you can see from the CI result, I discovered that the order of polyfills given by babel is not strictly following the order in core-js-compat/modules.json. For example, in playground/legacy, the polyfills in the legacy chunk assets/chunk-main-legacy.!~{003}~.js is

core-js/modules/es.symbol.js
core-js/modules/es.symbol.description.js
core-js/modules/es.symbol.async-iterator.js
core-js/modules/es.symbol.iterator.js
core-js/modules/es.symbol.to-string-tag.js
core-js/modules/es.error.cause.js
core-js/modules/es.array.concat.js
core-js/modules/es.array.from.js
core-js/modules/es.array.iterator.js
core-js/modules/es.array.join.js
core-js/modules/es.array.map.js
core-js/modules/es.array.push.js
core-js/modules/es.function.name.js
core-js/modules/es.json.stringify.js
core-js/modules/es.json.to-string-tag.js
core-js/modules/es.map.js
core-js/modules/es.math.to-string-tag.js
core-js/modules/es.object.freeze.js
core-js/modules/es.object.get-prototype-of.js
core-js/modules/es.object.keys.js
core-js/modules/es.array.slice.js
core-js/modules/es.object.to-string.js
core-js/modules/es.promise.js
core-js/modules/es.set.js
core-js/modules/es.string.ends-with.js
core-js/modules/es.regexp.exec.js
core-js/modules/es.regexp.test.js
core-js/modules/es.regexp.to-string.js
core-js/modules/es.string.iterator.js
core-js/modules/es.string.raw.js
core-js/modules/esnext.iterator.constructor.js
core-js/modules/esnext.iterator.for-each.js
core-js/modules/esnext.iterator.map.js
core-js/modules/esnext.set.difference.v2.js
core-js/modules/esnext.set.intersection.v2.js
core-js/modules/esnext.set.is-disjoint-from.v2.js
core-js/modules/esnext.set.is-subset-of.v2.js
core-js/modules/esnext.set.is-superset-of.v2.js
core-js/modules/esnext.set.symmetric-difference.v2.js
core-js/modules/esnext.set.union.v2.js
core-js/modules/web.dom-collections.for-each.js
core-js/modules/web.dom-collections.iterator.js
core-js/modules/web.dom-exception.constructor.js
core-js/modules/web.dom-exception.stack.js
core-js/modules/web.dom-exception.to-string-tag.js
core-js/modules/web.structured-clone.js
core-js/modules/web.url.js
core-js/modules/web.url.to-json.js
core-js/modules/web.url-search-params.js
core-js/modules/web.url-search-params.delete.js
core-js/modules/web.url-search-params.has.js
core-js/modules/web.url-search-params.size.js

We can see core-js/modules/es.array.slice.js is ordered after core-js/modules/es.object.keys.js which is opposite to the ordering in core-js-compat/modules.json.

I wonder whether this is a bug in babel or intended.

@shankerwangmiao shankerwangmiao force-pushed the fix-legacy-polyfill-deterministic branch from 3f53346 to 9a98cf4 Compare May 2, 2024 18:45
@shankerwangmiao
Copy link
Contributor Author

This issue is addressed by adding a circle breaking mechanism.

@yoyo837
Copy link
Contributor

yoyo837 commented May 7, 2024

Thanks for the great job. Really looking forward to this fix.

@bluwy
Copy link
Member

bluwy commented May 8, 2024

Thanks for investigating this @shankerwangmiao. Seems like babel-preset-env might have a bug with the ordering but I can't find out why.

From the current implementation, I don't understand why we need a complex sorting algorithm to sort it though. Why not do a single final sort based on core-js-compat/modules.json? The sort function should be something like this.

@shankerwangmiao
Copy link
Contributor Author

From the current implementation, I don't understand why we need a complex sorting algorithm to sort it though. Why not do a single final sort based on core-js-compat/modules.json? The sort function should be something like this.

Because there is no insurance that someday babel won't add other polyfill dependency from other packages, I believe not directly relying on the core-js-compat/modules.json, but relying on solely the output of babel would be a better approach.

@bluwy
Copy link
Member

bluwy commented May 8, 2024

If babel adds another polyfill, wouldn't that still be fine? It shouldn't be having a specific order if it's not core-js, but if it needs a specific order, we can add it later too.

Perhaps what I don't understand too is that, if babel-preset-env outputs imports that aren't in the correct order, how does a sorting algorithm fix into a correct order, and especially would always respect core-js-compat/modules.json without reading it?

I can see that it's in the correct order now locally, but will it always be in the correct order if, let's say, there's only one chunk which babel outputs in an incorrect order?

@shankerwangmiao
Copy link
Contributor Author

If babel adds another polyfill, wouldn't that still be fine? It shouldn't be having a specific order if it's not core-js, but if it needs a specific order, we can add it later too.

Perhaps what I don't understand too is that, if babel-preset-env outputs imports that aren't in the correct order, how does a sorting algorithm fix into a correct order, and especially would always respect core-js-compat/modules.json without reading it?

I can see that it's in the correct order now locally, but will it always be in the correct order if, let's say, there's only one chunk which babel outputs in an incorrect order?

The basic assumption is that the ordering given by babel is (or should be) always right, no matter whether the polyfills are from core-js or other sources. We utilize this information rather than the information from core-js-compat/modules.json to keep all the polyfills in the correct order. The "correct order" means that if a polyfill A comes before B in a chunk, A will come before B in the final output; if a polyfill A and a polyfill B never appear in the same chunk, there will be no ordering constrain between them.

@bluwy
Copy link
Member

bluwy commented May 9, 2024

Even if babel is correctly outputting in the right order today, there's still the problem where additionalModernPolyfills or additionalLegacyPolyfills get merged and there are not in the right order. Having in the incorrect order is very unlikely to be intended, so we should also sort and fix it up.

I get that fixing the ordering issue wasn't the initial goal of this PR, it was to only make it deterministic. But knowing now that the bug exist, and the way to solve this is to sort based on core-js-compat/modules.json, we can do that directly to kill two birds with one stone? Otherwise we have to followup with another PR that does that.

@shankerwangmiao
Copy link
Contributor Author

You are right about this. My original point is to only keep the output deterministic, not to solve the ordering issue.

If we sort according to core-js-compat/modules.json, what if the user choose to include polyfills from other than those provided by core-js in additional*Polyfills? Also, currently, additional*Polyfills comes before the polyfills imported by babel, what if the user relies on the ordering?

@bluwy
Copy link
Member

bluwy commented May 9, 2024

What I had in mind is that, if there's any polyfills that aren't within core-js-compat/modules.json, then we keep the ordering as-is. However, it is more complex I think 🤔 For example:

// es.array.slice must be before es.object.keys

// Case 1:
const additional =  ['core-js/modules/es.array.slice.js', 'something-else.js']
const babel = ['core-js/modules/es.object.keys.js']
const result = ['core-js/modules/es.array.slice.js', 'something-else.js', 'core-js/modules/es.object.keys.js']

// Case 2:
const additional =  ['core-js/modules/es.object.keys.js', 'something-else.js']
const babel = ['core-js/modules/es.array.slice.js']
const result = ['core-js/modules/es.array.slice.js', 'core-js/modules/es.object.keys.js', 'something-else.js']
// Move es.array.slice before es.object.keys

I'm not sure if a .sort algorithm would be enough to do the trick, but I think keeping the basic idea of moving things to the front should make sure non-corejs modules are not simply moved entirely to last. In other words, non-corejs modules if it has another polyfill before it, there shouldn't be something injected between them. I might need to think about this more.

@shankerwangmiao
Copy link
Contributor Author

I guess a simple sort is not enough. Maybe we need topology sorting. However, I have no idea on how to deal with contradictory ordering constrains.

@Huodoo
Copy link

Huodoo commented May 14, 2024

I guess a simple sort is not enough. Maybe we need topology sorting. However, I have no idea on how to deal with contradictory ordering constrains.

Maybe a sorting algorithm is enough. This will just sort the polyfill import statement

@bluwy
Copy link
Member

bluwy commented May 14, 2024

I had a few ideas recently to solve the ordering issue, but all are quite complex. @shankerwangmiao I'm happy to solve the deterministic issue for now, but I think we could simplify the implementation.

The forth argument of renderChunk has meta.chunks. This object is properly ordered. On the first renderChunk call, we can create a map like this:

chunkFileNameToPolyfills = new Map<string, { modern: Set<string>, legacy: Set<string> }>();
for (const fileName in meta.chunks) {
  chunkFileNameToPolyfills.set(fileName,  { modern: new Set(), legacy: new Set() }
}

And then later on the code in renderChunk will always only populate the sets in chunkFileNameToPolyfills. And finally in generateBundle, we can iterate chunkFileNameToPolyfills and combine it all back into the main modernPolyfills and legacyPolyfills. What do you think?

@shankerwangmiao shankerwangmiao force-pushed the fix-legacy-polyfill-deterministic branch from 9a98cf4 to e1ca296 Compare May 14, 2024 10:05
@shankerwangmiao
Copy link
Contributor Author

shankerwangmiao commented May 14, 2024

I followed your suggestion, but with a simpler solution. In renderChunk(), chunkName and polyfills are recorded in an array. In generateBundle(), we simply sort it according the chunkName, and combine the polyfills together.

The change made to the original process is simple: only the order of polyfill groups generated by chunks and pushed to {legacy,modern}Polyfills is changed.

@shankerwangmiao shankerwangmiao force-pushed the fix-legacy-polyfill-deterministic branch from e1ca296 to 9573279 Compare May 14, 2024 10:12
@Huodoo
Copy link

Huodoo commented May 14, 2024

I had a few ideas recently to solve the ordering issue, but all are quite complex. @shankerwangmiao I'm happy to solve the deterministic issue for now, but I think we could simplify the implementation.

The forth argument of renderChunk has meta.chunks. This object is properly ordered. On the first renderChunk call, we can create a map like this:

chunkFileNameToPolyfills = new Map<string, { modern: Set<string>, legacy: Set<string> }>();
for (const fileName in meta.chunks) {
  chunkFileNameToPolyfills.set(fileName,  { modern: new Set(), legacy: new Set() }
}

And then later on the code in renderChunk will always only populate the sets in chunkFileNameToPolyfills. And finally in generateBundle, we can iterate chunkFileNameToPolyfills and combine it all back into the main modernPolyfills and legacyPolyfills. What do you think?

modernPolyfills and legacyPolyfills is just used for generate a file like

import "core-js/modules/es.regexp.exec.js";import "core-js/modules/es.regexp.test.js";import "core-js/modules/es.regexp.to-string.js";import "core-js/modules/es.string.match.js";import "core-js/modules/es.string.replace.js";

#16668

@bluwy
Copy link
Member

bluwy commented May 14, 2024

@Huodoo please read the above discussion, particularly #16566 (comment). The import order matters.

@shankerwangmiao shankerwangmiao force-pushed the fix-legacy-polyfill-deterministic branch from 9573279 to 4eee61f Compare May 14, 2024 15:45
@shankerwangmiao shankerwangmiao force-pushed the fix-legacy-polyfill-deterministic branch from 4eee61f to 37c344f Compare May 14, 2024 16:17
@shankerwangmiao
Copy link
Contributor Author

ping @bluwy

@bluwy
Copy link
Member

bluwy commented May 28, 2024

I'm not sure if I missed something, but I pushed a refactor implementing #16566 (comment) which I felt was simpler and has less memory-copying with the data structures. I also confirmed that the discovered modern and legacy polyfills were still the same order.

@shankerwangmiao
Copy link
Contributor Author

I'm not sure if I missed something, but I pushed a refactor implementing #16566 (comment) which I felt was simpler and has less memory-copying with the data structures. I also confirmed that the discovered modern and legacy polyfills were still the same order.

I also implemented that in 37c344f, before I pinged you. I think your implementation is better. I've tested the refactored version and the output remains the same.

I wonder if we are ready to have it merged.

Copy link
Member

@sapphi-red sapphi-red left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC this implementation is based on the assumption that:

  1. the injection order we need to follow for core-js is determined by a dependency graph rather than a list (i.e. modules.json is a list, but actually it includes some orders that doesn't need to be followed)
  2. babel injects all the ancestor polyfills (polyfills which is required by the polyfills injected for the source code)
  3. babel injects the polyfills in the correct order

I guess all the assumptions are met and I think this would work.

@bluwy
Copy link
Member

bluwy commented May 29, 2024

@sapphi-red I think at the moment we had discovered that Babel doesn't necessarily return the polyfills in the right order, and there's definitely a chance the order is incorrect due to different chunks that can contain different code. But in my comment I figured that it's easier to tackle this deterministic issue first and figure out the ordering later as it can be tricky to sort with non corejs modules (sorry for the long discussion here).

@sapphi-red
Copy link
Member

Ah, I see. Then, I don't think this will break things so I think it's fine.

@bluwy bluwy changed the title fix(legacy): improve deterministic build of the polyfill bundle by sorting the polyfills discovered by babel fix(plugin-legacy): improve deterministic polyfills discovery May 29, 2024
@bluwy bluwy merged commit 48edfcd into vitejs:main May 29, 2024
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
p2-nice-to-have Not breaking anything but nice to have (priority) plugin: legacy
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants