fix: race conditions in data store save #13372

ascorbic · 2025-03-06T17:03:56Z

Changes

This fixes a few race condition bugs that manifested in sites with lots of very large markdown files.

Adds a method to the data store object that returns a promise that resolves when saving to disk is complete.

Resets the #dirty flag before saving to disk. The main cause of #13310 was that debounced saves that happened while the large data store file was being written to disk would be lost because the dirty flag was cleared once the file write was complete, missing the fact that more more entries need writing.

Fixes #13310

Testing

Tested with the #13310 repro

Docs

changeset-bot · 2025-03-06T17:04:02Z

🦋 Changeset detected

Latest commit: 8739a71

The changes in this PR will be included in the next version bump.

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

codspeed-hq · 2025-03-06T17:16:09Z

CodSpeed Performance Report

Merging #13372 will not alter performance

_{Comparing data-save-race (8739a71) with main (d83f92a)}

Summary

✅ 6 untouched benchmarks

ascorbic · 2025-03-06T17:31:53Z

Looks like this must be causing a slowdown, so there's a timeout. Setting back to draft

justin5267 · 2025-03-07T12:28:22Z

Looks like this must be causing a slowdown, so there's a timeout. Setting back to draft

Thank you for your hard work! The issue of the long synchronization time during the first run had already been reported in issue #13050 . After using caching in subsequent runs, the time became acceptable, so I proactively closed the issue. I truly appreciate your efforts!

justin5267 · 2025-03-10T08:22:57Z

Hello, @ascorbic I tested using https://github.com/justin5267/test and found that this PR @ did not resolve the issue. After running pnpm run dev, the project starts normally, but the data-store.json file is incomplete, with a size of only about 190MB, and some pages are missing.

As a comparison, modifying "SAVE_DEBOUNCE_MS" from 500 to 5000 and running pnpm run dev again resolves the issue. All pages generate correctly, and the size of data-store.json increases to approximately 540MB.

pkg-pr-new · 2025-03-10T08:42:27Z

astro

npm i https://pkg.pr.new/astro@13372

@astrojs/cloudflare

npm i https://pkg.pr.new/@astrojs/cloudflare@13372

@astrojs/netlify

npm i https://pkg.pr.new/@astrojs/netlify@13372

@astrojs/node

npm i https://pkg.pr.new/@astrojs/node@13372

@astrojs/vercel

npm i https://pkg.pr.new/@astrojs/vercel@13372

commit: 8739a71

ematipico · 2025-03-10T08:45:02Z

I just tested the preview release against the reproduction provided by @justin5267, and the project is stuck at syncing content for a while...

justin5267 · 2025-03-10T08:54:03Z

I just tested the preview release against the reproduction provided by @justin5267, and the project is stuck at syncing content for a while...

Well, it's not actually stuck; it just takes a long time to synchronize all the content on the first startup, around 200,000ms. However, after caching, subsequent startups are much faster. I previously submitted a separate issue for this (#13050).

ematipico · 2025-03-10T09:04:55Z

Can we improve the DX in the console? Since the log goes to a newline and nothing gets written, I thought everything was stuck 😓

ascorbic · 2025-03-10T09:29:09Z

@ematipico something like printing "this may take some time", or periodically printing an update? Currently I don't think have the ability to display a progress indicator with our logger, do we?

ascorbic · 2025-03-10T09:30:14Z

@justin5267 hmm. It did fix it for me. Let me look again and see.

ematipico · 2025-03-10T09:48:31Z

@ematipico something like printing "this may take some time", or periodically printing an update? Currently I don't think have the ability to display a progress indicator with our logger, do we?

I think we have the yocto-spinner somewhere, but I don't remember where we use it (create-astro?). We can go with a simpler solution IMHO. When logging, it's possible to pass a boolean argument to not add a newline at the end of the log. We could printing the same phrase with three dots e.g. ., .., ..., ., .., etc.

justin5267 · 2025-03-10T09:54:43Z

@justin5267 hmm. It did fix it for me. Let me look again and see.

I have occasionally encountered a similar issue, where the first time I run pnpm run dev, all pages are generated correctly, but after modifying the content or deleting the .astro folder, running it again results in incomplete generation. Could you try deleting the .astro folder and then rerun pnpm run dev to check if the /docs/w1-1h.html page is generated correctly, or if the data-store.json file is around 540MB+ in size? If both answers are yes, it would indicate that the issue is resolved.

ematipico · 2025-03-10T10:29:10Z

but after modifying the content

I modify the content, and it's correctly updated without issues.

or deleting the .astro folder

That's not a standard user flow, actually. Users shouldn't delete the cache folder because Astro is in charge of it. Not sure if we should account for such cases.

justin5267 · 2025-03-10T10:46:16Z

it's correctly updated without issues.

Could you kindly help check if the http://localhost:54252/docs/w1-1.html page is accessible? When I ran pnpm run dev for testing, the project did not report any errors, but files in the folders starting with "w," "x," "y," and "z" were not generated, resulting in a 404 error when accessed. The same project was able to generate these pages correctly in Astro v4.

justin5267 · 2025-03-10T10:50:12Z

The same project was able to generate these pages correctly in Astro v4

Even in Astro v5, if I modify SAVE_DEBOUNCE_MS to 5000 and run pnpm run dev again, these pages are generated correctly and can be accessed without issues.

ascorbic · 2025-03-10T11:40:29Z

I'd ideally want to avoid needing to change the debounce, because it's working-around something that probably needs a proper fix.

ematipico

I'll be honest; I really don't know what we are fixing, even with the explanation 😅

It would be nice to have some comments that explain the new properties and the new code, if possible.

Otherwise, I tested it, and it works as expected, except for the deletion .astro folder, which I think is something that users shouldn't usually do. They should use --force instead.

justin5267 · 2025-03-11T11:56:02Z

According to @ematipico 's suggestion, I retested using pnpm astro dev --force and found that this PR has fixed the issue. In comparison, without applying this PR, even running pnpm astro dev --force fails to generate the files completely. Therefore, this PR is effective. Thank you for your hard work!

ascorbic · 2025-03-11T12:12:56Z

Thanks for the update. I'm still not totally happy with the situation, but I think fixing it is going to require a bigger refactor, probably to implement a state machine.

fix: race conditions in data store save

Loading
Loading status checks…

668d645

github-actions bot added the pkg: astro label Mar 6, 2025

ascorbic marked this pull request as draft March 6, 2025 17:31

Whitespace change to kick CI

Loading
Loading status checks…

0491e18

ascorbic self-assigned this Mar 7, 2025

ascorbic added 2 commits March 7, 2025 13:11

Flsuh saves to disk

269389f

Remove log

Loading
Loading status checks…

8739a71

ascorbic marked this pull request as ready for review March 7, 2025 14:20

ematipico added the pr preview label Mar 10, 2025

ematipico approved these changes Mar 11, 2025

View reviewed changes

ascorbic merged commit 7783dbf into main Mar 18, 2025
29 of 30 checks passed

ascorbic deleted the data-save-race branch March 18, 2025 09:24

astrobot-houston mentioned this pull request Mar 18, 2025

[ci] release #13427

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GitHub Sponsors

fix: race conditions in data store save #13372

fix: race conditions in data store save #13372

ascorbic commented Mar 6, 2025

changeset-bot bot commented Mar 6, 2025 •

edited

Loading

codspeed-hq bot commented Mar 6, 2025 •

edited

Loading

ascorbic commented Mar 6, 2025

justin5267 commented Mar 7, 2025

justin5267 commented Mar 10, 2025

pkg-pr-new bot commented Mar 10, 2025

ematipico commented Mar 10, 2025

justin5267 commented Mar 10, 2025 •

edited

Loading

ematipico commented Mar 10, 2025

ascorbic commented Mar 10, 2025

ascorbic commented Mar 10, 2025

ematipico commented Mar 10, 2025

justin5267 commented Mar 10, 2025 •

edited

Loading

ematipico commented Mar 10, 2025

justin5267 commented Mar 10, 2025

justin5267 commented Mar 10, 2025

ascorbic commented Mar 10, 2025

ematipico left a comment

justin5267 commented Mar 11, 2025

ascorbic commented Mar 11, 2025

fix: race conditions in data store save #13372

fix: race conditions in data store save #13372

Conversation

ascorbic commented Mar 6, 2025

Changes

Testing

Docs

changeset-bot bot commented Mar 6, 2025 • edited Loading

🦋 Changeset detected

codspeed-hq bot commented Mar 6, 2025 • edited Loading

CodSpeed Performance Report

Merging #13372 will not alter performance

Summary

ascorbic commented Mar 6, 2025

justin5267 commented Mar 7, 2025

justin5267 commented Mar 10, 2025

pkg-pr-new bot commented Mar 10, 2025

ematipico commented Mar 10, 2025

justin5267 commented Mar 10, 2025 • edited Loading

ematipico commented Mar 10, 2025

ascorbic commented Mar 10, 2025

ascorbic commented Mar 10, 2025

ematipico commented Mar 10, 2025

justin5267 commented Mar 10, 2025 • edited Loading

ematipico commented Mar 10, 2025

justin5267 commented Mar 10, 2025

justin5267 commented Mar 10, 2025

ascorbic commented Mar 10, 2025

ematipico left a comment

Choose a reason for hiding this comment

justin5267 commented Mar 11, 2025

ascorbic commented Mar 11, 2025

changeset-bot bot commented Mar 6, 2025 •

edited

Loading

codspeed-hq bot commented Mar 6, 2025 •

edited

Loading

justin5267 commented Mar 10, 2025 •

edited

Loading

justin5267 commented Mar 10, 2025 •

edited

Loading