Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(sourcemap): lazy compute decoded mappings #5075

Merged

Conversation

thebanjomatic
Copy link
Contributor

@thebanjomatic thebanjomatic commented Jul 27, 2023

This PR contains:

  • bugfix
  • feature
  • refactor
  • documentation
  • other

Are tests included?

  • yes (bugfixes and features will not be merged without tests)
  • no

Breaking Changes?

  • yes (breaking changes will not be merged unless absolutely necessary)
  • no

List any relevant issue numbers:

Description

By deferring the decoding of the sourcemap mappings field, we can avoid extra computation and garbage generation, both of which appear to save a significant amount of time.

In my test application (using vite), I saw a reduction in total vite build time of about 28.4% (8.76s), and a reduction in peak heap size of 19.6% (706 MB). [Update] The latest commit reduces the peak heap size by 36.6% in total (1,300 MB) for projects that deal with large sourcemaps. While the first commit reduces allocations, the second commit technically allocates more, but frees up more data to be garbage collected which has a net positive effect. The stats below reflect only the first commit, a summary of the two commits combined can be found in this comment: #5075 (comment)

Project Info:

  • Node: v18.17.0
  • Vite v4.4.7
  • Rollup v3.26.3
  • Module Count: 5490 modules transformed
  • Chunk Count: 54 chunks rendered

Before:

  • Average vite build time (5 runs): 30.89 seconds
  • Average peak heap size (5 runs): 3,594.83 MB
  • Time spent in GC: ~13.7 seconds
  • Time spent in decode(): ~3.11 seconds

After:

  • Average vite build time (5 runs): 22.13 seconds
  • Average peak heap size (5 runs): 2,888.2 MB
  • Time spent in GC: ~5.9 seconds
  • Time spent in decode(): ~1.70 seconds

I'm happy to try to write tests for this if it is needed, but as I'm still getting my feet wet in this project that might take me a little while to figure out exactly how to go about it.

I'm also a little curious if there is a higher-level optimization to be applied that would make this no longer necessary. For example, what are the cases where we are calling decodedSourcemap, but then abandoning the result without ever actually accessing the mappings value?

Are we able to avoid call to decodedSourcemap at all and/or any json parsing until we know we will need it to be decoded?

Are there cases where we decode the sourcemap only to pass it back into the SourceMap constructor which then re-encodes it to a string?

While I think those are all good questions to ask, the current solution provides enough benefit and seems relatively low risk, so IMO its worth pursuing on its own.

Allocations Before:
image

Allocations After:
image

@vercel
Copy link

vercel bot commented Jul 27, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
rollup ✅ Ready (Inspect) Visit Preview 💬 Add feedback Aug 3, 2023 9:26am

@thebanjomatic
Copy link
Contributor Author

thebanjomatic commented Jul 27, 2023

@lukastaegert Let me know if you can take a look at this. I started down this path because I was running OOM during a build on CI and I'm hoping this change can make a dent in improving vitejs/vite#2433. It didn't make enough of a difference in my particular use-case, but perhaps it lowers the bar for when this issue occurs.

src/utils/decodedSourcemap.ts Outdated Show resolved Hide resolved
src/utils/decodedSourcemap.ts Outdated Show resolved Hide resolved
src/utils/decodedSourcemap.ts Outdated Show resolved Hide resolved
src/utils/decodedSourcemap.ts Outdated Show resolved Hide resolved
@thebanjomatic
Copy link
Contributor Author

thebanjomatic commented Jul 28, 2023

Review has been updated based on feedback.

Adding on another datapoint, I tested: https://github.com/Axeldeblen/realworld-big-build-vite-4

The default configuration doesn't build sourcemaps, but we can still see some improvements
Before: ✓ built in 2m 9s

{
  stats: {
    peakMemory: 4_148_762_536,
    decodedSourcemapCalls: 16133,
    decodeCalls: 16133
  }
}

After: ✓ built in 1m 59s

{
  stats: {
    peakMemory: 3_926_596_880,
    decodedSourcemapCalls: 16133,
    decodeCalls: 0,
    memoizedCalls: 0
  }
}

Unfortunately, if I enable sourcemaps in the config, its a bit more of a mixed bag. For this configuration we only really save 1 call to decode(), and otherwise just introduce some overhead to each access of .mapping. That being said, this project isn't really consuming any dependencies with sourcemaps and is mostly javascript+svelte, so its not the best example of where this sort of change would shine, but it is disappointing to see that there isn't much improvement for this particular case (it's hard to tell as there was a lot of variablility between runs).

Before: ✓ built in 2m 17s - 2m 57s

{
  stats: {
    peakMemory: 4_149_991_032,
    decodedSourcemapCalls: 16139,
    decodeCalls: 16139
  }
}

After: ✓ built in 2m 20s - 2m32s

{
  stats: {
    peakMemory: 4_150_411_624,
    decodedSourcemapCalls: 16139,
    decodeCalls: 16138,
    memoizedCalls: 14130,
  }
}

@thebanjomatic
Copy link
Contributor Author

thebanjomatic commented Jul 29, 2023

@lukastaegert I have iterated on this idea a little further and have some additional gains. For a complete description, see the second commit: 4275894

This change improves on the peak memory usage. I did some binary searching in both my internal enterprise app, and also in https://github.com/Axeldeblen/realworld-big-build-vite-4 to identify the lowest --max-old-space-size value (to the nearest 50MB) that still builds successfully.

Internal Vue app:

  • master: 3550 MB (29.22s average)
  • prior commit: 2650 MB (21.17s average)
  • latest commit: 2250 MB (20.87s average)

realworld-big-build-vite-4 (+ sourcemaps: true):

  • master: 4100 MB (~137s average)
  • prior commit: 4100 MB (~136s average)
  • latest commit: 3900 MB (~128s average)

realworld-big-build-vite-4 (default config w/ sourcemaps: false):

  • master: 4000 MB (~133s average)
  • prior commit: 3850 MB (~118s average)
  • latest commit: 3850 MB (~118s average)

@thebanjomatic
Copy link
Contributor Author

thebanjomatic commented Jul 29, 2023

The only minor issue I have with this PR that could possibly be solved by a breaking change is that the RollupCache type contains ExistingDecodedSourceMap and DecodedSourceMapOrMissing[]. I know rollup doesn't currently persist the cache to disk, but this could be problematic because serializing the sourcemaps with JSON.stringify() will re-hydrate the decoded array in the process. This is a little undesirable both from a cache file-size perspective and it seems like it would be more natural to store these as ExistingRawSourceMap in the cache, and then construct a decodedSourcemap from the cached module for things like watch builds, etc. Having sourcemaps stored in encoded form in the cache would force you to call decodedSourcemap to get the appropriate types, so it makes it easier to write the correct code.

src/rollup/types.d.ts Show resolved Hide resolved
src/utils/decodedSourcemap.ts Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Jul 30, 2023

Codecov Report

Merging #5075 (8403257) into master (6787f44) will decrease coverage by 0.03%.
Report is 1 commits behind head on master.
The diff coverage is 94.28%.

@@            Coverage Diff             @@
##           master    #5075      +/-   ##
==========================================
- Coverage   98.97%   98.95%   -0.03%     
==========================================
  Files         226      226              
  Lines        8362     8390      +28     
  Branches     2294     2302       +8     
==========================================
+ Hits         8276     8302      +26     
  Misses         31       31              
- Partials       55       57       +2     
Files Changed Coverage Δ
src/utils/decodedSourcemap.ts 93.54% <92.30%> (+7.83%) ⬆️
src/Module.ts 100.00% <100.00%> (ø)
src/utils/collapseSourcemaps.ts 90.62% <100.00%> (+0.19%) ⬆️
src/utils/getOriginalLocation.ts 93.33% <100.00%> (-6.67%) ⬇️

Copy link
Member

@lukastaegert lukastaegert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. If you are fine from your side, I will merge this tomorrow.

@thebanjomatic
Copy link
Contributor Author

thebanjomatic commented Aug 2, 2023

Thanks for the review @lukastaegert I have pushed a commit dropping the setter. This is good to merge / release from my end when you are ready.

Adam Hines added 3 commits August 2, 2023 08:35
By deferring the decoding of the sourcemap mappings field, we can avoid
extra computation and garbage generation, both of which appear to save
a significant amount of time.

In my test application (using vite), I saw a reduction in total build
time of about 28.4%, and a reduction in peak heap size of 19.6% (706 MB)

Project Info:
* Node: v18.17.0
* Vite v4.4.7
* Rollup v3.26.3
* Module Count: 5490 modules transformed
* Chunk Count: 54 chunks rendered

Before:
* Average vite build time (5 runs): 30.89 seconds
* Average peak heap size (5 runs): 3594.83 MB
* Time spent in GC: ~13.7 seconds
* Time spent in `decode()`: 3.11 seconds

After:
* Average vite build time (5 runs): 22.13 seconds
* Average peak heap size (5 runs): 2888.2 MB
* Time spent in GC: ~5.9 seconds
* Time spent in `decode()`: 1.70 seconds
With the previous commit, we defer decoding the mappings field until we
know they are going to be used. This change aims to further reduce peak
heap usage by allowing for the decoded arrays to be cleaned up sooner
instead of retained until the end of the build.

When we are likely to be done with the `originalSourcemap` and
`sourcemapChain`, we can re-encode them which allows for the larger
decoded array to be garbage collected while retaining the smaller VLQ
encoded strings instead. This commit also makes a change to the previous
commit's caching behavior in that the original encoded mapping is no
longer being held onto once we have a decoded array (this is
closer to the original behavior pr and reduces memory usage further).
This does mean, however, that we need need to recalculate the encoded
string from the decoded arrays instead of just dropping the decoded
array reference. Somewhat counter-intuitively, I found re-encoding to be
both faster (2%) and use less memory than the variant that kept the
original encoded string around.

There are two places where we are currently reverting from decoded ->
encoded mappings:
1) At module.setSource which is passed sourcemaps from transform or from
the rollup cache
2) As part of collapse/folding the module sourcemaps during renderChunk

For the transform case, that operation will frequently be a noop since
the decoding is now lazy and would not have been invoked unless a plugin
called `getCombinedSourcemap()` explicitly. If that were the case, the
combined sourcemap would have its decoded mappings thrown away before
the sourcemaps are stored on the module resulting in them potentially
being decoded a second time when the chunk is rendered.

The second case, renderChunk, appears to make up the majority of the
memory usage reduction for me. It is often times during rendering chunks
that I hit OOM errors when building with lower `--max-old-space-size`
settings. By aggressively removing the cached decoded mappings arrays
when we are done with them, it gives more headroom for the build to
succeed. This effect is particularly noticable when working with large
sourcemaps.
Removing the setter for the mappings field reduces complexity of the
cache behavior somewhat, and increases test coverage of the feature.

We expect the cases where someone might be modifying mappings on an
existing sourcemap to be rare. I think a user would need to hook into
the cache and modify it there. For anyone doing that, this would now
fail because the property is not writeableso I have modified the type to
reflect this readonly nature.
The workaround for any users that encounter problems here is to just
replace the entire sourcemap.

```ts
module.originalSourceMap = {
  ...module.originalSourceMap,
  mappings: myNewMappings
};
```
@lukastaegert lukastaegert merged commit a1be53f into rollup:master Aug 3, 2023
10 of 12 checks passed
@rollup-bot
Copy link
Collaborator

This PR has been released as part of rollup@3.27.1. You can test it via npm install rollup.

thebanjomatic pushed a commit to thebanjomatic/vite that referenced this pull request Aug 3, 2023
Technically this is also covered by a fresh install / lock-file maintenance, but I thought it might be good to call it out with a specific change-log notification.

On my own projects, I've seen around 28% faster builds, with a 1.3 GB (36.6%) reduction in peak memory usage. Which should hopefully help with vitejs#2433

Rollup 3.27.1 contains a change (rollup/rollup#5075) that improves sourcemap handling for rollup builds.

These improvements seem to have the most impact where there is large amounts of treeshaking since sourcemaps are no longer being decoded in memory for files that are tree-shaken from the final bundle, but it also releases those decoded mappings arrays when they are no longer needed which frees up more memory during render chunks.

It also reduces memory usage and build times for projects that don't set `build.sourcemaps: true` in their vite config. For those projects, any sourcemaps loaded from dependencies in node_modules, etc, were being decoded proactively in the event rollup needed to trace the sourcemap to find the locations for any warnings/errors. Now that process only happens if they are needed, and a fair amount of processing and allocations can be avoided.
@thebanjomatic thebanjomatic deleted the perf/lazy-sourcemap-decode branch August 3, 2023 15:04
@rollup-bot
Copy link
Collaborator

This PR has been released as part of rollup@3.27.2. You can test it via npm install rollup.

lukastaegert added a commit that referenced this pull request Aug 4, 2023
thebanjomatic added a commit to thebanjomatic/rollup that referenced this pull request Aug 4, 2023
lukastaegert pushed a commit that referenced this pull request Aug 9, 2023
… mappings (#5087)

* perf(sourcemaps): add back sourcemap lazy decode

Adding back the improvements from #5075

* fix: handle mappings: null or undefined
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants