perf(sourcemap): lazy compute decoded mappings #5075
Conversation
@lukastaegert Let me know if you can take a look at this. I started down this path because I was running OOM during a build on CI, and I'm hoping this change can make a dent in improving vitejs/vite#2433. It didn't make enough of a difference in my particular use-case, but perhaps it lowers the bar for when this issue occurs.
(force-pushed from e3b636b to f4bca4f)
Review has been updated based on feedback.

Adding another datapoint, I tested: https://github.com/Axeldeblen/realworld-big-build-vite-4

The default configuration doesn't build sourcemaps, but we can still see some improvements.

Before:

```
{
  stats: {
    peakMemory: 4_148_762_536,
    decodedSourcemapCalls: 16133,
    decodeCalls: 16133
  }
}
```

After: ✓ built in 1m 59s

```
{
  stats: {
    peakMemory: 3_926_596_880,
    decodedSourcemapCalls: 16133,
    decodeCalls: 0,
    memoizedCalls: 0
  }
}
```

Unfortunately, if I enable sourcemaps in the config, it's a bit more of a mixed bag. For this configuration we only really save one call to `decode`.

Before: ✓ built in 2m 17s - 2m 57s

```
{
  stats: {
    peakMemory: 4_149_991_032,
    decodedSourcemapCalls: 16139,
    decodeCalls: 16139
  }
}
```

After: ✓ built in 2m 20s - 2m 32s

```
{
  stats: {
    peakMemory: 4_150_411_624,
    decodedSourcemapCalls: 16139,
    decodeCalls: 16138,
    memoizedCalls: 14130
  }
}
```
@lukastaegert I have iterated on this idea a little further and have some additional gains. For a complete description, see the second commit: 4275894

This change improves on the peak memory usage. I did some binary searching in both my internal enterprise app and in https://github.com/Axeldeblen/realworld-big-build-vite-4 to identify the lowest

* Internal Vue app:
* realworld-big-build-vite-4 (+ sourcemaps: true):
* realworld-big-build-vite-4 (default config w/ sourcemaps: false):
The only minor issue I have with this PR that could possibly be solved by a breaking change is that the
(force-pushed from 1ca20dd to 4275894)
(force-pushed from 4275894 to 1aa35c8)
Codecov Report
```
@@            Coverage Diff             @@
##           master    #5075      +/-   ##
==========================================
- Coverage   98.97%   98.95%    -0.03%
==========================================
  Files         226      226
  Lines        8362     8390       +28
  Branches     2294     2302        +8
==========================================
+ Hits         8276     8302       +26
  Misses         31       31
- Partials       55       57        +2
```
Looks good to me. If you are fine from your side, I will merge this tomorrow.
Thanks for the review @lukastaegert. I have pushed a commit dropping the setter. This is good to merge / release from my end when you are ready.
By deferring the decoding of the sourcemap mappings field, we can avoid extra computation and garbage generation, both of which appear to save a significant amount of time. In my test application (using vite), I saw a reduction in total build time of about 28.4%, and a reduction in peak heap size of 19.6% (706 MB).

Project Info:
* Node: v18.17.0
* Vite v4.4.7
* Rollup v3.26.3
* Module Count: 5490 modules transformed
* Chunk Count: 54 chunks rendered

Before:
* Average vite build time (5 runs): 30.89 seconds
* Average peak heap size (5 runs): 3594.83 MB
* Time spent in GC: ~13.7 seconds
* Time spent in `decode()`: 3.11 seconds

After:
* Average vite build time (5 runs): 22.13 seconds
* Average peak heap size (5 runs): 2888.2 MB
* Time spent in GC: ~5.9 seconds
* Time spent in `decode()`: 1.70 seconds
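The core idea can be sketched as a memoized getter. This is a simplified illustration, not rollup's actual implementation; `decodeMappings` is a toy stand-in for the VLQ `decode` from a codec such as `@jridgewell/sourcemap-codec`:

```typescript
// Simplified sketch of lazy, memoized decoding (not rollup's actual
// code). `decodeMappings` stands in for a real VLQ decoder.
type DecodedMappings = number[][][];

let decodeCalls = 0;

function decodeMappings(mappings: string): DecodedMappings {
  decodeCalls++;
  // Toy decoder: one (empty) segment list per generated line.
  return mappings.split(';').map(() => []);
}

// The expensive decode is deferred until `mappings` is first read,
// then the decoded array is cached for subsequent reads.
function lazyDecodedSourcemap(encodedMappings: string) {
  let decoded: DecodedMappings | undefined;
  return {
    get mappings(): DecodedMappings {
      if (decoded === undefined) {
        decoded = decodeMappings(encodedMappings);
      }
      return decoded;
    }
  };
}

const map = lazyDecodedSourcemap('AAAA;AACA');
console.log(decodeCalls); // 0 – nothing decoded yet
map.mappings;
map.mappings;
console.log(decodeCalls); // 1 – decoded once, then served from cache
```

A sourcemap that is never traced (for example, for a module that is tree-shaken away) never pays the decode cost at all, which is where the bulk of the savings above comes from.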
With the previous commit, we defer decoding the mappings field until we know they are going to be used. This change aims to further reduce peak heap usage by allowing the decoded arrays to be cleaned up sooner instead of being retained until the end of the build.

When we are likely to be done with the `originalSourcemap` and `sourcemapChain`, we can re-encode them, which allows the larger decoded arrays to be garbage collected while retaining the smaller VLQ-encoded strings instead.

This commit also changes the previous commit's caching behavior in that the original encoded mapping is no longer held onto once we have a decoded array (this is closer to the original behavior and reduces memory usage further). This does mean, however, that we need to recalculate the encoded string from the decoded arrays instead of just dropping the decoded array reference. Somewhat counter-intuitively, I found re-encoding to be both faster (2%) and use less memory than the variant that kept the original encoded string around.

There are two places where we currently revert from decoded -> encoded mappings:

1) At `module.setSource`, which is passed sourcemaps from transform or from the rollup cache
2) As part of collapsing/folding the module sourcemaps during `renderChunk`

For the transform case, that operation will frequently be a noop since the decoding is now lazy and would not have been invoked unless a plugin called `getCombinedSourcemap()` explicitly. If that were the case, the combined sourcemap would have its decoded mappings thrown away before the sourcemaps are stored on the module, resulting in them potentially being decoded a second time when the chunk is rendered.

The second case, `renderChunk`, appears to make up the majority of the memory usage reduction for me. It is oftentimes during rendering chunks that I hit OOM errors when building with lower `--max-old-space-size` settings.
Aggressively removing the cached decoded mappings arrays when we are done with them gives more headroom for the build to succeed. This effect is particularly noticeable when working with large sourcemaps.
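The decode-then-release lifecycle described above can be sketched as follows. This is illustrative only; `decode`/`encode` here are trivial stand-ins for the real VLQ codec, and the class shape is an assumption, not rollup's code:

```typescript
// Illustrative sketch (not rollup's implementation) of releasing the
// decoded mappings once a sourcemap is unlikely to be needed again.
// `decode`/`encode` are toy stand-ins for the real VLQ codec.
const decode = (s: string): string[][] => s.split(';').map(l => l.split(','));
const encode = (d: string[][]): string => d.map(l => l.join(',')).join(';');

class LazyMappings {
  private encoded: string | null;
  private decoded: string[][] | null = null;

  constructor(encoded: string) {
    this.encoded = encoded;
  }

  // Decode on demand; once decoded, the encoded string is dropped
  // (matching this commit's caching behavior).
  get mappings(): string[][] {
    if (this.decoded === null) {
      this.decoded = decode(this.encoded!);
      this.encoded = null;
    }
    return this.decoded;
  }

  // When the map is unlikely to be used again, re-encode so the much
  // larger decoded array can be garbage collected.
  release(): string {
    if (this.decoded !== null) {
      this.encoded = encode(this.decoded);
      this.decoded = null;
    }
    return this.encoded!;
  }
}

const map = new LazyMappings('AAAA,CAAC;AACA');
map.mappings;               // forces a decode
console.log(map.release()); // → 'AAAA,CAAC;AACA'
```

The trade-off is exactly the one described above: `release()` spends CPU re-encoding, but in exchange the decoded arrays stop pinning memory for the rest of the build.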
Removing the setter for the mappings field reduces the complexity of the cache behavior somewhat and increases test coverage of the feature. We expect the cases where someone might be modifying mappings on an existing sourcemap to be rare; I think a user would need to hook into the cache and modify it there. For anyone doing that, this would now fail because the property is not writeable, so I have modified the type to reflect this readonly nature. The workaround for any users that encounter problems here is to just replace the entire sourcemap:

```ts
module.originalSourceMap = {
  ...module.originalSourceMap,
  mappings: myNewMappings
};
```
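What the readonly type change means for consumers can be shown with a minimal sketch; the interface and field names here are illustrative, not rollup's exact declarations:

```typescript
// Illustrative sketch: once `mappings` is typed readonly, in-place
// mutation is a compile error and the sourcemap must be replaced whole.
interface RawSourceMapLike {
  readonly mappings: string;
  readonly sources: readonly string[];
}

const original: RawSourceMapLike = { mappings: 'AAAA', sources: ['a.ts'] };

// original.mappings = 'BBBB'; // TS error: Cannot assign to 'mappings'
//                             // because it is a read-only property.

// Workaround: replace the entire sourcemap object instead.
const replaced: RawSourceMapLike = { ...original, mappings: 'BBBB' };
console.log(replaced.mappings); // → 'BBBB'
```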
(force-pushed from d375fce to 96ab5b1)
This PR has been released as part of rollup@3.27.1. You can test it via
Technically this is also covered by a fresh install / lock-file maintenance, but I thought it might be good to call it out with a specific change-log notification. On my own projects, I've seen around 28% faster builds, with a 1.3 GB (36.6%) reduction in peak memory usage, which should hopefully help with vitejs#2433.

Rollup 3.27.1 contains a change (rollup/rollup#5075) that improves sourcemap handling for rollup builds. These improvements seem to have the most impact where there are large amounts of tree-shaking, since sourcemaps are no longer being decoded in memory for files that are tree-shaken from the final bundle. It also releases those decoded mappings arrays when they are no longer needed, which frees up more memory while rendering chunks.

It also reduces memory usage and build times for projects that don't set `build.sourcemap: true` in their vite config. For those projects, any sourcemaps loaded from dependencies in node_modules, etc., were being decoded proactively in the event rollup needed to trace the sourcemap to find the locations for any warnings/errors. Now that process only happens if they are needed, and a fair amount of processing and allocations can be avoided.
This PR has been released as part of rollup@3.27.2. You can test it via
This reverts commit a1be53f.
Adding back the improvements from rollup#5075
This PR contains:
Are tests included?
Breaking Changes?
List any relevant issue numbers:
Description
By deferring the decoding of the sourcemap mappings field, we can avoid extra computation and garbage generation, both of which appear to save a significant amount of time.
In my test application (using vite), I saw a reduction in total vite build time of about 28.4% (8.76s), and a reduction in peak heap size of 19.6% (706 MB). [Update] The latest commit reduces the peak heap size by 36.6% in total (1,300 MB) for projects that deal with large sourcemaps. While the first commit reduces allocations, the second commit technically allocates more, but frees up more data to be garbage collected, which has a net positive effect. The stats below reflect only the first commit; a summary of the two commits combined can be found in this comment: #5075 (comment)
Project Info:
* Node: v18.17.0
* Vite v4.4.7
* Rollup v3.26.3
* Module Count: 5490 modules transformed
* Chunk Count: 54 chunks rendered

Before:
* Average vite build time (5 runs): 30.89 seconds
* Average peak heap size (5 runs): 3594.83 MB
* Time spent in GC: ~13.7 seconds
* Time spent in `decode()`: ~3.11 seconds

After:
* Average vite build time (5 runs): 22.13 seconds
* Average peak heap size (5 runs): 2888.2 MB
* Time spent in GC: ~5.9 seconds
* Time spent in `decode()`: ~1.70 seconds

I'm happy to try to write tests for this if it is needed, but as I'm still getting my feet wet in this project, that might take me a little while to figure out exactly how to go about it.
I'm also a little curious if there is a higher-level optimization to be applied that would make this no longer necessary. For example:

* What are the cases where we are calling `decodedSourcemap` but then abandoning the result without ever actually accessing the mappings value?
* Are we able to avoid the call to `decodedSourcemap` at all, and/or any JSON parsing, until we know we will need it to be decoded?
* Are there cases where we decode the sourcemap only to pass it back into the `SourceMap` constructor, which then re-encodes it to a string?
While I think those are all good questions to ask, the current solution provides enough benefit and seems relatively low risk, so IMO it's worth pursuing on its own.
Allocations Before:
Allocations After: