fix: ensure background work is finished when response has 3xx or 5xx status code #2742
Conversation
Force-pushed from fe45660 to 62ed00c
```typescript
await page.goto(new URL(`always-the-same-body/${slug}`, pageRouter.url).href)

await new Promise((resolve) => setTimeout(resolve, 15_000))

await page.goto(new URL(`always-the-same-body/${slug}`, pageRouter.url).href)

await new Promise((resolve) => setTimeout(resolve, 15_000))

await page.goto(new URL(`always-the-same-body/${slug}`, pageRouter.url).href)

await new Promise((resolve) => setTimeout(resolve, 15_000))

// keep lambda executing to allow for background getStaticProps to finish in case background work execution was suspended
await fetch(new URL(`api/sleep-5`, pageRouter.url).href)
```
---
It's a bit unclear to me why I need to make 3 requests, but this whole thing results in the following logs that I've added to the getStaticProps function being tested with those routes (+ some commentary I added to hopefully explain the logs):
```text
# first request - this will always be the very first request for this path, because it's
# not prerendered - this has `fallback: 'blocking'` so the very first request is
# expected to execute `getStaticProps` in the foreground and not the background
Jan 16, 09:10:57 PM: 4f18552f INFO [timestamp] 1737058256930 getStaticProps start
Jan 16, 09:10:57 PM: 4f18552f INFO [timestamp] 1737058256930 getStaticProps start stored
Jan 16, 09:11:02 PM: 4f18552f INFO [timestamp] 1737058256930 getStaticProps end (duration: 5s)
Jan 16, 09:11:02 PM: 4f18552f INFO [timestamp] 1737058256930 getStaticProps end stored
Jan 16, 09:11:02 PM: 4f18552f Duration: 5136.12 ms Memory Usage: 177 MB

# second request, 15s after the first one finished - this should generate exactly
# the same thing - however the response is 200, not 304 🤷
Jan 16, 09:11:17 PM: 21a21fc8 INFO [timestamp] 1737058256930 getStaticProps start
Jan 16, 09:11:18 PM: 21a21fc8 INFO [timestamp] 1737058256930 getStaticProps start stored
Jan 16, 09:11:22 PM: 21a21fc8 INFO [timestamp] 1737058256930 getStaticProps end (duration: 5s)
Jan 16, 09:11:23 PM: 21a21fc8 INFO [timestamp] 1737058256930 getStaticProps end stored
Jan 16, 09:11:23 PM: 21a21fc8 Duration: 5119.96 ms Memory Usage: 177 MB

# third request, 15s after the second one finished (but only 10 seconds after the
# getStaticProps triggered by the second request finished - that's because
# it was executing in the background) - we can see that we DO start executing
# getStaticProps - but it doesn't finish before lambda execution finishes/suspends
# this one is a 304 response (which is currently the problematic one)
Jan 16, 09:11:33 PM: 13effc77 INFO [timestamp] 1737058256930 getStaticProps start
Jan 16, 09:11:33 PM: 13effc77 Duration: 63.11 ms Memory Usage: 177 MB

# now we were just sleeping for 15 seconds, and after that we hit
# an api endpoint handled by the same next.js server handler,
# which allows resuming the previously suspended execution - you can also
# see it has the same execution context as the previous request (13effc77) despite
# the lambda execution seemingly being finished (the log line with Duration and
# Memory Usage signifies the end of execution) - you can see that the
# duration now logged inside getStaticProps says 15s instead of the expected ~5s
Jan 16, 09:11:48 PM: 13effc77 INFO [timestamp] 1737058256930 getStaticProps end (duration: 15.722s)
Jan 16, 09:11:48 PM: 13effc77 INFO [timestamp] 1737058256930 getStaticProps start stored
Jan 16, 09:11:48 PM: 4974d6c2 INFO [timestamp] 1737058256930 getStaticProps end stored
Jan 16, 09:11:53 PM: 4974d6c2 Duration: 5009.36 ms Memory Usage: 177 MB
```
Force-pushed from 2536087 to 3fd8d77
src/run/handlers/server.ts (outdated)
```typescript
// Temporary workaround for an issue where sending a response with an empty
// body causes an unhandled error. This doesn't catch everything, but redirects are the
// most common case of sending empty bodies. We can't check it directly because these are streams.
// The side effect is that responses which do contain data will not be streamed to the client,
// but that's fine for redirects.
// TODO: Remove once a fix has been rolled out.
if ((response.status > 300 && response.status < 400) || response.status >= 500) {
  const body = await response.text()
  return new Response(body || null, response)
}
```
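The `body || null` coercion in the snippet above is load-bearing: per the Fetch spec, null-body statuses like 304 only accept a `null` body, while `text()` resolves to `''` for them. A minimal sketch of the same buffering pattern, assuming a Fetch-compliant runtime such as Node 18+ (`bufferResponse` is an illustrative name, not from this codebase):

```typescript
// Buffer a response body and rebuild the Response, coercing '' to null so
// null-body statuses (e.g. 304) don't make the Response constructor throw.
async function bufferResponse(response: Response): Promise<Response> {
  // text() resolves to '' when the underlying body is null
  const body = await response.text()
  // '' is falsy, so 304/204-style responses get a null body back
  return new Response(body || null, response)
}
```

Note this buffers the whole body in memory, which is why the original comment accepts losing streaming for these status codes.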
---
We generally should be able to remove this, because a fix was rolled out ( https://github.com/netlify/serverless-functions-api/pull/203 ).
But this doesn't help fix the issue with 304 cases not ensuring background work execution, because in opennextjs-netlify/src/run/handlers/server.ts (line 133 in 3fd8d77):

```typescript
return new Response(response.body?.pipeThrough(keepOpenUntilNextFullyRendered), response)
```

`response.body` is null for 304s (?) so we still don't use `keepOpenUntilNextFullyRendered`.
We also can't force a non-null body for those, because that doesn't seem to be allowed:

```text
> new Response('', { status: 304 })
VM957:1 Uncaught TypeError: Failed to construct 'Response': Response with null body status cannot have body
    at <anonymous>:1:1
```
---
Something like this might work?

```typescript
if ((response.status > 300 && response.status < 400) || response.status >= 500) {
  const body = await response.text()
  // Create an empty stream for null bodies, otherwise create a stream from the body text
  const stream = body
    ? new Blob([body]).stream()
    : new ReadableStream({ start(controller) { controller.close() } })
  return new Response(stream.pipeThrough(keepOpenUntilNextFullyRendered), response)
}
```
---
```text
> new Response(new ReadableStream({ start(controller) { controller.close() } }), { status: 304 })
VM447:1 Uncaught TypeError: Failed to construct 'Response': Response with null body status cannot have body
    at <anonymous>:1:1
```

This is pretty much the same problem - for 304 the `body` has to be null - it can't be an empty stream or string :/
I did try a few alternatives similar to this before, but they all resulted in crashes when a 304 was produced and the body was essentially not null.
---
Ah, I see. I might be misunderstanding this, but could we abstract the background work completion and do it in both places? Something like:

```typescript
const waitForBackgroundWork = async () => {
  await nextHandlerPromise
  res.emit('close')
  await requestContext.backgroundWorkPromise
}

if ((response.status > 300 && response.status < 400) || response.status >= 500) {
  const body = await response.text()
  await waitForBackgroundWork()
  return new Response(body || null, response)
}

const keepOpenUntilNextFullyRendered = new TransformStream({
  async flush() {
    await waitForBackgroundWork()
  },
})

return new Response(response.body?.pipeThrough(keepOpenUntilNextFullyRendered), response)
```

It obviously won't stream for those status codes, but that shouldn't matter too much?
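The flush-based half of this suggestion can be demonstrated in isolation: a TransformStream's `flush()` only runs once the writable side closes, and the readable side doesn't close until `flush()` resolves, so awaiting work there keeps execution alive until the body is fully consumed. A minimal sketch using Node 18+ globals, with a 20 ms timeout standing in for `requestContext.backgroundWorkPromise`:

```typescript
const events: string[] = []

const keepOpen = new TransformStream({
  async flush() {
    // stands in for awaiting the real background work promise
    await new Promise((resolve) => setTimeout(resolve, 20))
    events.push('background work finished')
  },
})

// Pipe a small body through; the identity transform passes chunks along,
// then flush() runs before the readable side (and thus text()) completes.
const body = new Blob(['hello']).stream().pipeThrough(keepOpen)
const text = await new Response(body).text()
events.push(`body read: ${text}`)
```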
---
We certainly can do that, and it would solve the issue with background work being suspended, but it will completely block the response until background work finishes for those 304 cases.
This feels like the worst of both worlds - it would still serve stale content, but only after the background work to generate the new response has finished. Maybe that's just what we need to do :/
It feels like if we are blocking on background work, it would be better to just serve the freshly generated response instead - but a) this kind of breaks the expectations of SWR behavior, and b) it's not obvious how to do that without completely disabling SWR (we can do that with `CacheHandler.get` returning `null` for stale entries - but if we wanted to do that only for 304 cases, then at the point where we could force it we have no idea whether a 304 will be produced).
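For illustration, the "`CacheHandler.get` returning `null` for stale entries" idea boils down to treating stale cache hits as misses, which forces a fresh foreground render instead of serving stale. A hypothetical sketch (`NoStaleCacheHandler` and its shape are illustrative, not Next.js's actual CacheHandler interface):

```typescript
type CacheEntry = { value: string; expiresAt: number }

// A toy cache that disables stale-while-revalidate: expired entries
// are reported as misses rather than returned as stale hits.
class NoStaleCacheHandler {
  private store = new Map<string, CacheEntry>()

  set(key: string, value: string, ttlMs: number) {
    this.store.set(key, { value, expiresAt: Date.now() + ttlMs })
  }

  get(key: string): string | null {
    const entry = this.store.get(key)
    if (!entry) return null
    // returning null here is what disables SWR - the caller must re-render
    if (Date.now() > entry.expiresAt) return null
    return entry.value
  }
}
```

The trade-off described above applies: this avoids serving stale content, but every expired entry costs a blocking foreground render.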
The ideal solution is having `context.waitUntil` work, but this is not rolled out yet (not sure of the status).
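For context, the `waitUntil` contract (as found in other serverless/edge runtimes) amounts to the handler registering background promises that the platform awaits before suspending. A hypothetical sketch of that mechanism (names are illustrative, not the actual Netlify API):

```typescript
type WaitUntilContext = { waitUntil(promise: Promise<unknown>): void }

// Toy model of the platform side: collect registered promises, and after
// the response is sent, delay suspension until they have all settled.
function createWaitUntilContext() {
  const pending: Promise<unknown>[] = []
  const context: WaitUntilContext = {
    waitUntil: (promise) => {
      pending.push(promise)
    },
  }
  const drain = () => Promise.allSettled(pending)
  return { context, drain }
}
```

With this in place, the handler could return a 304 immediately and still have its background `getStaticProps` run to completion, without the stream-keep-open workaround.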
Another somewhat hacky idea: initiate a new request to the same lambda instance (not sure how to ensure that) with a specific payload that next-runtime would intercept - all it would do is wait for any background work initiated by previous invocations to finish, without going through next-server at all. Kind of keeping the lambda warm with a "meaningless" request - basically doing something similar to what I did here for the test:

opennextjs-netlify/tests/e2e/page-router.test.ts (lines 516 to 517 in 3fd8d77):

```typescript
// keep lambda executing to allow for background getStaticProps to finish in case background work execution was suspended
await fetch(new URL(`api/sleep-5`, pageRouter.url).href)
```
---
For now I'll make the change you suggested, to at least have a working fix
---
I don't have any better ideas, so I think just awaiting background work before responding with the body-less response will have to do here - done in the last commit - c8e103b?diff=split&w=1
Force-pushed from 3fd8d77 to c8e103b
---
Yep, that all looks good. At first I thought we needed an early return for responses without a body, but I see now that optional chaining will prevent attempting to pipe through in that case, so yep - all good!
Description
We are currently skipping the code path that ensures background work finishes before the lambda suspends execution if the response is 3xx or 5xx. The most notable problem seems to happen with 304 responses - I didn't have much luck reproducing it with redirects, but I managed to do so with 304s.
Tests
Added failing test without any code changes first https://github.com/opennextjs/opennextjs-netlify/actions/runs/12816826577/job/35738662910?pr=2742#step:13:122
Relevant links (GitHub issues, etc.) or a picture of cute animal