Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Further improvements to performance of hash regex lookups in RealContentHashPlugin #16791

Closed
ryanwilsonperkin opened this issue Mar 8, 2023 · 10 comments

Comments

@ryanwilsonperkin
Copy link
Contributor

Feature request

Continuing idea for improvements carried over from #16759 (comment)

What is the expected behavior?

Calculation of whether or not a hash has matched can be reused in our logic for performing hash replacements
Per @alexander-akait (source):

I think there is a another potential improvement - we can store location of found hashRegExps and then avoid using result.replace in hashReplace, , I think it will be faster, but this need to be checked and we can do it latest and merge this as is, because we alreadyh have a good improvement here, so approved

What is motivation or use case for adding/changing the behavior?
Performance speed of RealContentHashPlugin

How should this be implemented in your opinion?

Are you willing to work on this yourself?
yes

@alexander-akait
Copy link
Member

@ryanwilsonperkin Feel free to experiment with it, v8 optimizes some regular expressions so we really need to make sure we will have better perf 👍

@dmichon-msft
Copy link
Contributor

When using RealContentHashPlugin, there's also a performance opportunity to not have the original contenthash values be hashes at all, but instead be predictably-shaped unique tokens that can be matched as a single pattern (e.g. __WEBPACK_CONTENT_HASH__${id}). This would allow matching to be performed by a much simpler RegExp and to skip original generation of contentHash (since it will get thrown out anyway).

@alexander-akait
Copy link
Member

@dmichon-msft Sounds good idea, but I am afraid it will require to rewrite some plugin outside of webpack ecosystem

@vankop
Copy link
Member

vankop commented Mar 10, 2023

@dmichon-msft thats not possible.. the hole idea is that you can plug-in/plug-out plugins..

@dmichon-msft
Copy link
Contributor

@dmichon-msft thats not possible.. the hole idea is that you can plug-in/plug-out plugins..

With an appropriate hook in content hash generation, RealContentHashPlugin could override the original content hash generation to account for that it is going to get completely replaced. Alternatively, it could become a standard step in the webpack process that content hashes start as a marker and get replaced during a later hook (that other plugins can tap into). I've used the token-then-replace model in a few plugins to-date because it stabilizes output and makes, e.g. minification more cacheable.

Equally relevant to optimizing this scenario is that the full list of referenced hashes in a given asset comes from the code generation process; we ought to be able to annotate the asset info with which hashes were injected into the asset at the moment they are being injected, without having to do a RegExp search through the final code.

@webpack-bot
Copy link
Contributor

This issue had no activity for at least three months.

It's subject to automatic issue closing if there is no activity in the next 15 days.

@alexander-akait
Copy link
Member

bump

@hardfist
Copy link
Contributor

When using RealContentHashPlugin, there's also a performance opportunity to not have the original contenthash values be hashes at all, but instead be predictably-shaped unique tokens that can be matched as a single pattern (e.g. __WEBPACK_CONTENT_HASH__${id}). This would allow matching to be performed by a much simpler RegExp and to skip original generation of contentHash (since it will get thrown out anyway).

can this solve the hash conflict problem in realContentHash, we have internal projects have code contains the hash of chunk id and accidentlly wrongly replaced

@alexander-akait
Copy link
Member

@hardfist We can try, but there is a limitation - if a plugin take asset and try to parse, it can be failed, I still think how better to fix it...

@webpack-bot
Copy link
Contributor

Issue was closed because of inactivity.

If you think this is still a valid issue, please file a new issue with additional information.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants