Starting at 13.4.13-canary.0 Internal Server Error due to connection refused #53171

Closed
1 task done
joacub opened this issue Jul 25, 2023 · 57 comments · Fixed by #53131
Labels
bug Issue was opened via the bug report template. linear: next Confirmed issue that is tracked by the Next.js team. locked

Comments

@joacub

joacub commented Jul 25, 2023

Verify canary release

  • I verified that the issue exists in the latest Next.js canary release

Provide environment information

13.4.13-canary.0

Which area(s) of Next.js are affected? (leave empty if unsure)

No response

Link to the code that reproduces this issue or a replay of the bug

no response

To Reproduce

Update to this version and start a production server.

Describe the Bug

  • error Failed to handle request for /sw.js
    TypeError: fetch failed
    at Object.fetch (node:internal/deps/undici/undici:11576:11)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async invokeRequest (/app/node_modules/.pnpm/next@13.4.13-canary.0_@babel+core@7.22.5_react-dom@18.2.0_react@18.2.0_sass@1.64.0/node_modules/next/dist/server/lib/server-ipc/invoke-request.js:34:23)
    at async requestHandler (/app/node_modules/.pnpm/next@13.4.13-canary.0_@babel+core@7.22.5_react-dom@18.2.0_react@18.2.0_sass@1.64.0/node_modules/next/dist/server/lib/start-server.js:329:35)
    at async Server.<anonymous> (/app/node_modules/.pnpm/next@13.4.13-canary.0_@babel+core@7.22.5_react-dom@18.2.0_react@18.2.0_sass@1.64.0/node_modules/next/dist/server/lib/start-server.js:148:13) {
    cause: Error: connect ECONNREFUSED 127.0.0.1:41367
    at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1592:16) {
    errno: -111,
    code: 'ECONNREFUSED',
    syscall: 'connect',
    address: '127.0.0.1',
    port: 41367
    }
    }

Expected Behavior

Server Working

Which browser are you using? (if relevant)

No response

How are you deploying your application? (if relevant)

No response

NEXT-1510

@joacub joacub added the bug Issue was opened via the bug report template. label Jul 25, 2023
@joacub joacub changed the title Starting at 13.4.13-canary.0 Starting at 13.4.13-canary.0 Internal Server Error dues to connection refused Jul 25, 2023
@joacub joacub changed the title Starting at 13.4.13-canary.0 Internal Server Error dues to connection refused Starting at 13.4.13-canary.0 Internal Server Error due to connection refused Jul 25, 2023
@DuCanhGH
Contributor

Can you try adding --hostname 127.0.0.1 or HOSTNAME=127.0.0.1? It is possible that localhost on your machine was resolved to [::1]. My PR #53131 tries to fix some issues with IPv6, but it hasn't got any attention yet...
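To check whether `localhost` resolves to an IPv6 address on a given machine, a small Node.js script along these lines may help. This is only a sketch: `resolveAll` and `hasIPv6` are hypothetical helper names, not part of Next.js, and the result depends on the host's resolver configuration.

```javascript
// Sketch: list every address that `localhost` resolves to on this machine.
const dns = require('node:dns');

// Hypothetical helper: resolves a hostname to all of its addresses.
// Each entry has the shape { address, family }, where family is 4 or 6.
function resolveAll(hostname) {
  return dns.promises.lookup(hostname, { all: true });
}

// Pure helper: true if any resolved entry is an IPv6 address.
function hasIPv6(entries) {
  return entries.some((entry) => entry.family === 6);
}

resolveAll('localhost')
  .then((entries) => {
    console.log(entries);
    if (hasIPv6(entries)) {
      console.log('localhost can resolve to an IPv6 address on this machine');
    }
  })
  .catch((err) => console.error('lookup failed:', err.message));
```

If the output lists `::1`, forcing an IPv4 address as suggested above is worth trying.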

@joacub
Author

joacub commented Jul 25, 2023

I'm looking at the docs for how to pass that variable to the production server in standalone mode:

CMD node apps/web/server.js

Do you know how?

@balazsorban44
Member

This is the second time I'm letting you know, please follow the bug report template when opening an issue. #52621 (comment)

Please add a minimal reproduction so we can investigate.

@balazsorban44 balazsorban44 added the please add a complete reproduction The issue lacks information for further investigation label Jul 25, 2023
@github-actions
Contributor

We cannot recreate the issue with the provided information. Please add a reproduction in order for us to be able to investigate.

Why was this issue marked with the please add a complete reproduction label?

To be able to investigate, we need access to a reproduction to identify what triggered the issue. We prefer a link to a public GitHub repository (template for pages, template for App Router), but you can also use these templates: CodeSandbox: pages or CodeSandbox: App Router.

To make sure the issue is resolved as quickly as possible, please make sure that the reproduction is as minimal as possible. This means that you should remove unnecessary code, files, and dependencies that do not contribute to the issue. Ensure your reproduction does not depend on secrets, 3rd party registries, private dependencies, or any other data that cannot be made public. Avoid a reproduction including a whole monorepo (unless relevant to the issue). The easier it is to reproduce the issue, the quicker we can help.

Please test your reproduction against the latest version of Next.js (next@canary) to make sure your issue has not already been fixed.

If you cannot create a clean reproduction, another way you can help the maintainers' job is to pinpoint the canary version of next that introduced the issue. Check out our releases, and try to find the first canary release that introduced the issue. This will help us narrow down the scope of the issue, and possibly point to the PR/code change that introduced it. You can install a specific version of next by running npm install next@<version>.

I added a link, why was it still marked?

Ensure the link is pointing to a codebase that is accessible (e.g. not a private repository). "example.com", "n/a", "will add later", etc. are not acceptable links -- we need to see a public codebase. See the above section for accepted links.

What happens if I don't provide a sufficient minimal reproduction?

Issues with the please add a complete reproduction label that receive no meaningful activity (e.g. new comments with a reproduction link) are automatically closed and locked after 30 days.

If your issue has not been resolved in that time and it has been closed/locked, please open a new issue with the required reproduction.

I did not open this issue, but it is relevant to me, what can I do to help?

Anyone experiencing the same issue is welcome to provide a minimal reproduction following the above steps. Furthermore, you can upvote the issue using the 👍 reaction on the topmost comment (please do not comment "I have the same issue" without reproduction steps). Then, we can sort issues by votes to prioritize.

I think my reproduction is good enough, why aren't you looking into it quicker?

We look into every Next.js issue and constantly monitor open issues for new comments.

However, sometimes we might miss one or two due to the popularity/high traffic of the repository. We apologize, and kindly ask you to refrain from tagging core maintainers, as that will usually not result in increased priority.

Upvoting issues to show your interest will help us prioritize and address them as quickly as possible. That said, every issue is important to us, and if an issue gets closed by accident, we encourage you to open a new one linking to the old issue and we will look into it.


@joacub
Author

joacub commented Jul 25, 2023

This is the second time I'm letting you know, please follow the bug report template when opening an issue. #52621 (comment)

Please add a minimal reproduction so we can investigate.

Please, allow me to explain the situation more clearly. I have encountered a recurring issue with the server not starting in production. The problem does not have any clear reproduction steps, and it happens consistently. It's important to note that the issue is not caused by anything on my end, as I've tested it on a completely empty Next.js project, and the problem persists.

This problem was first noticed by another person as well, and we have been trying to pinpoint the specific code that triggers this error. Despite our efforts, the root cause remains elusive. While I understand that you may want more information from me, I assure you that we have thoroughly investigated the situation.

I would appreciate any guidance or assistance you can provide to resolve this matter. As it is affecting the production environment, I'm eager to find a solution as soon as possible. Please let me know if you need any further details or if there's anything specific you would like me to do to help with the investigation and resolution of this issue. Thank you for your support

@DuCanhGH
Contributor

@joacub HOSTNAME=127.0.0.1 node apps/web/server.js.

@joacub
Author

joacub commented Jul 26, 2023

@joacub HOSTNAME=127.0.0.1 node apps/web/server.js.

Thank you so much, I didn't think they were using that through another config. Thank you so much.

@joacub
Author

joacub commented Jul 26, 2023

If I do that, nothing works, maybe because it is in Docker. I will investigate how to solve it when I have time. Thanks.

@joacub
Author

joacub commented Jul 26, 2023

image

@joacub
Author

joacub commented Jul 26, 2023

i guess this is the cause of this ===>
#53004

this is frustrating, every single day is a new issue, this is critical...

this is only happening in production when doing a NextResponse.rewrite

@balazsorban44
Copy link
Member

Please read #53171 (comment) and #53171 (comment)

@joacub
Author

joacub commented Jul 27, 2023

Error finally detected! As mentioned by @balazsorban44, it's an IPv6 handling error. If you force the use of 0.0.0.0 in Docker environments because you probably can't use 127.0.0.1, it will cause issues when you invoke the rewrite like this:

return NextResponse.rewrite(
  new URL(
    url,
    request.url,
  ),
  getResponseInit(),
);

The next call will invoke the middleware again with the new request, which is not intended. To avoid this, modify it to:

return NextResponse.rewrite(
  // Workaround to avoid invoking the middleware twice; remove once Next.js handles this better.
  new URL(
    url,
    request.url
      .replace('https://localhost', 'https://0.0.0.0')
      .replace('http://localhost', 'http://0.0.0.0'),
  ),
  getResponseInit(),
);

The middleware is not called twice because Next.js detects it as the same request and simply re-routes it.
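The string replacement above can also be written with the `URL` class, which compares the exact hostname instead of matching a prefix. This is a minimal sketch of the same workaround, not a Next.js API: `withBindHost` is a hypothetical helper name, and `0.0.0.0` is assumed to be the host the server was started with.

```javascript
// Hypothetical helper (not part of Next.js): swap a `localhost` host for the
// bind host while keeping protocol, port, path, and query unchanged.
function withBindHost(requestUrl, bindHost = '0.0.0.0') {
  const url = new URL(requestUrl);
  // Only an exact `localhost` hostname is rewritten; other hosts pass through.
  if (url.hostname === 'localhost') {
    url.hostname = bindHost;
  }
  return url.toString();
}

console.log(withBindHost('http://localhost:3000/en?x=1')); // http://0.0.0.0:3000/en?x=1
console.log(withBindHost('https://example.com/en'));       // https://example.com/en
```

Inside the middleware, the result would be passed as the base, for example `new URL(url, withBindHost(request.url))`, with `url` and `getResponseInit()` as in the comment above.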

@charnog

charnog commented Jul 28, 2023

Same problem here. It started with 13.4.13-canary.0, and I can also see it in 13.4.13-canary.5. I've been trying to figure it out on my local machine using Docker, but no luck. I can't really pinpoint what's causing it. We don't have rewrites in the middleware (as @joacub mentioned), for most of the requests, NextResponse.next() is executed, but we do have rewrites in the config. It could somehow be related.

I know a small example would help, but like I said, I can't narrow it down yet. Still trying to reproduce it locally.

Anyway, here are the logs from 13.4.13-canary.5 from our pods in the Kubernetes cluster. Maybe they'll help somehow.

TypeError: fetch failed
    at Object.fetch (node:internal/deps/undici/undici:11576:11)
    at async invokeRequest (/app/node_modules/next/dist/server/lib/server-ipc/invoke-request.js:21:12)
    at async invokeRender (/app/node_modules/next/dist/server/lib/router-server.js:226:29)
    at async handleRequest (/app/node_modules/next/dist/server/lib/router-server.js:419:24)
    at async requestHandler (/app/node_modules/next/dist/server/lib/router-server.js:436:13) {
  cause: SocketError: other side closed
      at Socket.onSocketEnd (/app/node_modules/next/dist/compiled/undici/index.js:1:63301)
      at Socket.emit (node:events:526:35)
      at endReadableNT (node:internal/streams/readable:1359:12)
      at process.processTicksAndRejections (node:internal/process/task_queues:82:21) {
    code: 'UND_ERR_SOCKET',
    socket: {
      localAddress: '127.0.0.1',
      localPort: 52428,
      remoteAddress: undefined,
      remotePort: undefined,
      remoteFamily: undefined,
      timeout: undefined,
      bytesWritten: 770,
      bytesRead: 207
    }
  }
}
TypeError: fetch failed
    at Object.fetch (node:internal/deps/undici/undici:11576:11)
    at async invokeRequest (/app/node_modules/next/dist/server/lib/server-ipc/invoke-request.js:21:12)
    at async invokeRender (/app/node_modules/next/dist/server/lib/router-server.js:226:29)
    at async handleRequest (/app/node_modules/next/dist/server/lib/router-server.js:419:24)
    at async requestHandler (/app/node_modules/next/dist/server/lib/router-server.js:436:13) {
  cause: Error: read ECONNRESET
      at TCP.onStreamRead (node:internal/stream_base_commons:217:20) {
    errno: -104,
    code: 'ECONNRESET',
    syscall: 'read'
  }
}

@zdarovka

zdarovka commented Jul 28, 2023

I had the same issues with 13.4.13-canary.5. Rollback to 13.4.12 solved it for me.

Like @charnog, I do not have any rewrites in the app.

Here is the error from my container (sorry I no longer have it in plaintext)

image

@florianliebig

florianliebig commented Jul 28, 2023

We face the same issue beginning from 13.4.13-canary.0 - I tried to quickly reproduce with https://github.com/vercel/next.js/blob/canary/examples/with-docker/README.md and app dir. But it works every time.

We have quite a fancy setup (turborepo, multiple apps, multiple middlewares) but as stated here in the issue it seems to also happen without any of that.

Setting the hostname like in the example: https://github.com/vercel/next.js/blob/canary/examples/with-docker/Dockerfile#L60 didn't work for me

Only thing I can share from the error docker container is the line which throws the error:

2023-07-28 at 17 09

2023-07-28 at 17 10

From https://github.com/vercel/next.js/blob/canary/packages/next/src/server/lib/start-server.ts#L332

Looks a lot like this change c11bce5#diff-0ff576bd02e76f96680accff48ecbe4126a7f6bc4eafa547b6d44dee49bab770R246 from @shuding & @ijjk

@joacub
Author

joacub commented Jul 28, 2023

@florianliebig It will work in a normal setup. I also tried to reproduce it on my side with a normal setup and it works every time. To reproduce this error you need to run the app in a Docker container with a port mapping such as 3035:3000 and have an Nginx proxy pointing to http://container-name:3035; then the error happens every single time. There are many other errors with that setup, for example env vars no longer reach the Node process. It seems Next.js has been wiping all env vars since that version, so no env vars reach the Node server.

@charnog

charnog commented Jul 28, 2023

@florianliebig, try setting it to ENV HOSTNAME 0.0.0.0 instead of ENV HOSTNAME localhost, as @DuCanhGH mentioned before (look at his PR here). It seems that this is a workaround for now. I'm testing on my side and it appears to resolve the issue.

@DuCanhGH
Contributor

@charnog yeah they really thought replacing localhost with 127.0.0.1 is the way to go 💀 I still haven't got any response by the way.

@joacub
Author

joacub commented Jul 28, 2023

@charnog yeah they really thought replacing localhost with 127.0.0.1 is the way to go 💀 I still haven't got any response by the way.

Yeah, the way to fix this is setting the hostname to 0.0.0.0, but then other things fail, as Next.js seems to be wiping all env vars now.

@joacub
Author

joacub commented Jul 28, 2023

@charnog yeah they really thought replacing localhost with 127.0.0.1 is the way to go 💀 I still haven't got any response by the way.

Btw great work with next-pwa working really nice.

@joacub
Author

joacub commented Jul 28, 2023

@charnog yeah they really thought replacing localhost with 127.0.0.1 is the way to go 💀 I still haven't got any response by the way.

💀

@joacub
Author

joacub commented Jul 28, 2023

@charnog Any rewrite made at any point will lead to the same behavior, triggering the re-execution of all processes and consequently altering the entire outcome of the subsequent code execution. This rewrite is perceived as a new request and should be handled differently to ensure the desired results.

@joacub
Author

joacub commented Jul 28, 2023

@charnog By the way, the error you are facing there is not caused by the rewrite. The rewrite is a separate issue related to these versions, because you are now forced to specify a HOSTNAME for Next.js to work properly behind proxies and in Docker. So you now need to set HOSTNAME to 0.0.0.0 in your setup so that Next.js listens on all network interfaces.

@schimi-dev

schimi-dev commented Aug 7, 2023

For self-hosted apps that use output: standalone Next.js v13.4.13 introduced some weird and really hard to reproduce behaviours. This is an attempt to summarize my experience. Note that all of the things described below worked fine in v13.4.12.

Below is the next.config.js file:

/** @type {import('next').NextConfig} */
const nextConfig = {
    output: 'standalone',
}

module.exports = nextConfig

For reproduction do the following:

  1. Build an app where output: standalone is configured.
  2. For easier reproduction copy the static and public folder as described here: https://nextjs.org/docs/app/api-reference/next-config-js/output#automatically-copying-traced-files
  3. Start the app with node server.js on a server where localhost resolves to an ipv6 address.

In such an environment, the following 2 scenarios fail:

1. Internationalization middleware always redirects to localhost

Reproduction:

  1. Use the example from the Next.js repository: https://github.com/vercel/next.js/tree/canary/examples/app-dir-i18n-routing
  2. Configure output: standalone.
  3. Increase Next.js version to 13.4.13.
  4. Host the app on a different server (not your local machine) where localhost resolves to an ipv6 address.
  5. Visit the corresponding url in the browser, e.g.: http://my-test-server.mycompanydomain:3000

Result: The internationalization middleware redirects to: http://localhost:3000/en
Expected: The internationalization middleware should redirect to: http://my-test-server.mycompanydomain:3000/en

2. Redirects in Server Actions force a hard reload of the whole page in the browser

When running on a Server that only uses ipv4, Server Actions now work perfectly fine thanks to:
#53373 and #53368 (great work by the way, I was really looking forward to that).

However, when hosted on a Server where localhost resolves to an ipv6 address and a Server Action uses redirect("/some-other-path") an error message is logged and the redirect is enforced via a hard page reload in the browser:

Reproduction:

  1. Use any simple service where Server Actions are enabled.
  2. Call a Server Action via a form's action that does a redirect to another page or the same page (it doesn't matter).

Result: The new page is visited via a hard reload in the web browser and the following error is logged to the console:
[image: redirect_error]
Expected: Navigation should not reload the whole page and no error message should be logged.

Additional Info

These were two simple scenarios I was able to reproduce. Moreover, there also seems to be a problem regarding environment variables for output: standalone in 13.4.13: #53579 and #53367

This makes it quite hard to test how aspects like manually setting HOSTNAME=0.0.0.0 or HOSTNAME=127.0.0.1 would work. In my case, neither of the two reproduction scenarios described could be solved by this.

One more thing I noticed is that in 13.4.13 starting the server logs the following:
[image: bug_reproduction_13]
Note the first line: "ready started server on 0.0.0.0:3000, url: http://localhost:3000"
Is this intended to be that way?

I hope that this can help with troubleshooting. I really think that this is a hard one to reproduce, as problems become present in different aspects of the app, but only in a specific environment.

@schimi-dev

Moreover, from what I tested it seems that @joacub is right and that this problem was already introduced in 13.4.13-canary.0, maybe via #53004.

Nevertheless, for the reproduction scenarios I described, I would recommend the stable version 13.4.13, because one of the reproduction scenarios targets Server Actions, where some significant changes/fixes were made in the meantime.

@sljeff

sljeff commented Aug 11, 2023

It should be noted that:

  • by default, k8s injects the HOSTNAME environment variable into pods, and its value is the pod name.

So in k8s, the server will never listen on 0.0.0.0 after updating to 13.4.13.
Unless we update the deployments or start commands.

@GVALFER

GVALFER commented Aug 11, 2023

i have a similar issue

TypeError: fetch failed
    at Object.fetch (node:internal/deps/undici/undici:11457:11)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async invokeRequest (/root/admin/node_modules/next/dist/server/lib/server-ipc/invoke-request.js:21:12)
    at async requestHandler (/root/admin/node_modules/next/dist/server/lib/start-server.js:336:33)
    at async Server.<anonymous> (/root/admin/node_modules/next/dist/server/lib/start-server.js:152:13) {
  cause: NotSupportedError: expect header not supported
      at processHeader (node:internal/deps/undici/undici:8261:15)
      at new Request (node:internal/deps/undici/undici:8120:13)
      at [dispatch] (node:internal/deps/undici/undici:9178:25)
      at Intercept (node:internal/deps/undici/undici:8917:20)
      at [Intercepted Dispatch] (node:internal/deps/undici/undici:7657:16)
      at Client.dispatch (node:internal/deps/undici/undici:7673:44)
      at [dispatch] (node:internal/deps/undici/undici:7892:32)
      at Pool.dispatch (node:internal/deps/undici/undici:7673:44)
      at [dispatch] (node:internal/deps/undici/undici:10442:27)
      at Agent.Intercept (node:internal/deps/undici/undici:8917:20) {
    code: 'UND_ERR_NOT_SUPPORTED'
  }
}

@leerob leerob added linear: next Confirmed issue that is tracked by the Next.js team. and removed please add a complete reproduction The issue lacks information for further investigation type: needs triage labels Aug 12, 2023
@zurgul

zurgul commented Aug 12, 2023

  • by default, k8s injects the HOSTNAME environment variable into pods, and its value is the pod name.

So in k8s, the server will never listen on 0.0.0.0 after updating to 13.4.13. Unless we update the deployments or start commands.

we use k8s as well, however I was able to reproduce the problem with HOSTNAME locally using only Docker. When HOSTNAME was set in Dockerfile the issue was resolved locally and in k8s cluster

@leerob
Member

leerob commented Aug 12, 2023

Thank you for providing the minimal reproduction – I've marked this one to be dug into further. In the meantime, please remain on 13.4.12 or lower while we dig into this 🙏

@kodiakhq kodiakhq bot closed this as completed in #53131 Aug 14, 2023
kodiakhq bot pushed a commit that referenced this issue Aug 14, 2023
### What?
This PR makes it easier to use Next.js with IPv6 hostnames such as `::1` and `::`.

### How?
It does so by removing rewrites from `localhost` to `127.0.0.1` introduced in #52492. It also fixes the issue where Next.js tries to fetch something like `http://::1:3000` when `--hostname` is `::1`, which is not a valid URL (browsers' `URL` class throws an error when constructed with such hosts). It also fixes `NextURL` so that it no longer accepts `http://::1:3000` while refusing `http://[::1]:3000`. It also changes `next/src/server/lib/setup-server-worker.ts` so that it uses the server's `address` method to retrieve the host instead of our provided `opts.hostname`, ensuring that no matter what `opts.hostname` is we will always get the correct one.
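The bracket requirement for IPv6 hosts described above can be illustrated with a short Node.js sketch. This is not the code from the PR: `formatHostForUrl` and `buildOrigin` are hypothetical helper names used only for illustration.

```javascript
const net = require('node:net');

// Hypothetical helper: IPv6 literals must be wrapped in brackets inside a URL.
function formatHostForUrl(hostname) {
  return net.isIPv6(hostname) ? `[${hostname}]` : hostname;
}

// Hypothetical helper: build an origin from a hostname and port.
function buildOrigin(hostname, port) {
  return new URL(`http://${formatHostForUrl(hostname)}:${port}`).origin;
}

console.log(buildOrigin('::1', 3000));       // http://[::1]:3000
console.log(buildOrigin('127.0.0.1', 3000)); // http://127.0.0.1:3000

// Without brackets the URL constructor rejects the host.
try {
  new URL('http://::1:3000');
} catch (err) {
  console.log('invalid without brackets:', err.message);
}
```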

### Note
I've verified that `next dev`, `next start` and `node .next/standalone/server.js` work with IPv6 hostnames (such as `::` and `::1`), IPv4 hostnames (such as `127.0.0.1`, `0.0.0.0`) and `localhost` - and with any of these hostnames fetching to `localhost` also works. Server Actions and middleware have no problems as well.

This also removes `.next/standalone/server.js`'s logging as we now use `start-server`'s logging to avoid duplicates. `start-server`'s logging has also been updated to report the actual hostname.
![image](https://github.com/vercel/next.js/assets/75556609/cefa5f23-ff09-4cef-a055-13eea7c11d89)
![image](https://github.com/vercel/next.js/assets/75556609/619e82ce-45d9-47b7-8644-f4ad083429db)
The above pictures also demonstrate using Server Actions with Next.js after this PR.
![image](https://github.com/vercel/next.js/assets/75556609/3d4166e9-f950-4390-bde9-af2547658148)

Fixes #53171
Fixes #49578
Closes NEXT-1510

Co-authored-by: Tim Neutkens <6324199+timneutkens@users.noreply.github.com>
Co-authored-by: Zack Tanner <1939140+ztanner@users.noreply.github.com>
@leerob
Member

leerob commented Aug 14, 2023

This has been fixed ✅

@XEngine

XEngine commented Aug 22, 2023

No, it's not fixed. At least not for me. Kubernetes pods are assigned a default HOSTNAME
image

However, in 13.4.19 the app somehow listens on an IP address instead of the hostname, and after many restarts, once the liveness probe detects that it is somehow alive, Next.js (Node) begins to give SSL issues.
image

Does anybody know any workaround for this?

@DuCanhGH
Contributor

DuCanhGH commented Aug 22, 2023

@XEngine that doesn't look like a URL, nor an IP to me, which makes me wonder if that environment variable was even meant to be used like this. Can you check what Next.js logged back in older versions? I mean, 803bbe5 fixed the standalone mode not getting process.env just a few days ago. Note that the IP that Next.js logs is due to it using require("http").Server::address() instead of process.env.HOSTNAME.

@XEngine

XEngine commented Aug 22, 2023

Ye. Before it was starting like this:

Listening on port 3000 url: http://store-herrenausstatter-58cd58f98d-n92sf:3000

and it was working until 13.4.12. After that version the whole thing begins to fail. Our environment is still the same, and the hostname you see is in the same hostname format. I have tried to run my Dockerfile with 0.0.0.0, 127.0.0.1, [::1], and localhost. It started when I did not send a custom HOSTNAME env, but I received those SSL issues. I don't know.

@DuCanhGH
Contributor

@XEngine if it broke in 13.4.12 then perhaps it's not really related to my fix and this issue. Maybe I'll investigate into it later if possible :) That IP being shown is definitely due to my PR though, but you shouldn't worry about that.

@XEngine

XEngine commented Aug 22, 2023

> @XEngine if it broke in 13.4.12 then perhaps it's not really related to my fix and this issue. Maybe I'll investigate into it later if possible :) That IP being shown is definitely due to my PR though, but you shouldn't worry about that.

Sorry for my mistake with the version. Correction: working fine until 13.4.13.

@DuCanhGH
Contributor

@XEngine yeah if it's 13.4.13 then it will be a lot more painful 💀

@izi-p

izi-p commented Aug 24, 2023

nextjs: ^13.4.13. I looked into the Next.js code and found a possible cause of the issue. The solution I found worked fine when the Next.js app was started in a Docker container.

If you look into .next/standalone/server.js you'll see the following lines of code:

const hostname = process.env.HOSTNAME || 'localhost'
...
startServer({
...
  hostname: hostname === 'localhost' ? '0.0.0.0' : hostname,
...

It seems that process.env.HOSTNAME was set by the Docker builder to some random string, so the Next.js server was started with that random hostname instead of '0.0.0.0'. This means that connections were only accepted on the address that random hostname resolves to, not on localhost or 127.0.0.1.

Workaround solution

The server inside the container should accept connections on any address (localhost, 127.0.0.1, ...), so it must bind to 0.0.0.0. To achieve this I set HOSTNAME in the Dockerfile to localhost, so in the aforementioned .next/standalone/server.js it was resolved as hostname='0.0.0.0' for startServer. Hope this helps someone.

Another fix is to add a postbuild script that rewrites server.js slightly by deleting the process.env.HOSTNAME condition.
Our app already has this env var, but it is used for the front end, not for infrastructure purposes, so it resulted in conflicts.

"build": "next build && postbuild",
"postbuild": "sed -i'.bak' 's/process.env.HOSTNAME || //g' .next/standalone/packages/MY_APP/server.js",
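For anyone trying to follow the logic, here is a small sketch of the two things described above. It is based only on the lines quoted from .next/standalone/server.js and on the sed expression in the postbuild script. It is not the actual Next.js source, and `resolveBindHostname` and `stripHostnameEnv` are hypothetical names made up for this example.

```javascript
// Mirrors the quoted snippet: fall back to "localhost" when HOSTNAME is unset,
// and bind to all interfaces only when the value is exactly "localhost".
function resolveBindHostname(env) {
  const hostname = env.HOSTNAME || "localhost";
  return hostname === "localhost" ? "0.0.0.0" : hostname;
}

// Same text substitution as the sed command in the postbuild script:
// drop every "process.env.HOSTNAME || " so the fallback is always used.
function stripHostnameEnv(source) {
  return source.replace(/process\.env\.HOSTNAME \|\| /g, "");
}

console.log(resolveBindHostname({}));                        // "0.0.0.0"
console.log(resolveBindHostname({ HOSTNAME: "localhost" })); // "0.0.0.0"
// A pod-style hostname (taken from the log earlier in this thread) is used as-is:
console.log(resolveBindHostname({ HOSTNAME: "store-herrenausstatter-58cd58f98d-n92sf" }));

console.log(stripHostnameEnv("const hostname = process.env.HOSTNAME || 'localhost'"));
// "const hostname = 'localhost'"
```

So any HOSTNAME other than the literal string localhost is passed straight to startServer, which is why an env var with the same name used for another purpose conflicts with it.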

@izi-p

izi-p commented Aug 28, 2023

Hello,

Same issue on our side.
(Build standalone+docker+kubernetes).

The app runs fine locally outside the Docker container.

Everything worked in our Kubernetes environment with Next.js 13.4.12.

With 13.4.13, the readiness and liveness probes cannot reach the Kubernetes pods and fail with the following error:
Liveness probe failed: Get "http://172.16.*.*:3000/healthz": dial tcp 172.16.*.*:3000: connect: connection refused

Please note as well that our application uses an env var named "HOSTNAME", and that it conflicts with the process.env.HOSTNAME used inside the generated Next.js server.js file. I had to implement a postbuild script that rewrites server.js so it does not check this env var and always uses localhost.

It is not fixed on v13.4.20-canary.9.

Please reopen this issue

@DuCanhGH
Contributor

DuCanhGH commented Aug 28, 2023

@izi-p does adding ENV HOSTNAME localhost not work? Why would you need to override server.js?

@izi-p

izi-p commented Aug 28, 2023

@DuCanhGH We need to override it because our application uses a runtime variable named "HOSTNAME", which conflicts with the one server.js tries to use. It seems to work when we replace hostname with 0.0.0.0 inside server.js.

@vpmedia

vpmedia commented Aug 28, 2023

Hi!
I've spent about a whole working day debugging ECONNRESET/ECONNREFUSED/"Server unexpectedly closed the connection" errors while trying to run my Next.js application inside a Docker container on localhost.
My solution was to downgrade the Node.js image to node:16-bullseye (later versions and various hostname changes did not help).
I've worked with two different Next.js apps: 13.2.4 exited with error codes, and 13.4.19 hung while trying to establish a connection, without success.

@DuCanhGH
Contributor

DuCanhGH commented Sep 2, 2023

@izi-p just asking, does HOSTNAME=0.0.0.0 node .next/standalone/server.js not work?

@DuCanhGH
Contributor

DuCanhGH commented Sep 2, 2023

@XEngine hopefully #54926 should fix your use case :)

@ctsstc

ctsstc commented Sep 7, 2023

I've been losing my mind as well with too much time sunk into this...

I'm on Next version: 13.4.19

I constantly get these errors in my logs. I imagine they're from the K8s liveness and readiness probes: the container launches just fine, but later those probes seem to fail with 500 errors. I used to use a basic health check page, then I tried an API route that returns a simple 200 OK body.

It seems odd to me that the localPort is always changing, but maybe that's the internal port?

Errors

- ready started server on [::]:3000, url: http://localhost:3000
TypeError: fetch failed
    at Object.fetch (node:internal/deps/undici/undici:11576:11)
    at async invokeRequest (/app/node_modules/next/dist/server/lib/server-ipc/invoke-request.js:17:12)
    at async invokeRender (/app/node_modules/next/dist/server/lib/router-server.js:254:29)
    at async handleRequest (/app/node_modules/next/dist/server/lib/router-server.js:447:24)
    at async requestHandler (/app/node_modules/next/dist/server/lib/router-server.js:464:13)
    at async Server.<anonymous> (/app/node_modules/next/dist/server/lib/start-server.js:117:13) {
  cause: SocketError: other side closed
      at Socket.onSocketEnd (/app/node_modules/next/dist/compiled/undici/index.js:1:63301)
      at Socket.emit (node:events:526:35)
      at endReadableNT (node:internal/streams/readable:1376:12)
      at process.processTicksAndRejections (node:internal/process/task_queues:82:21) {
    code: 'UND_ERR_SOCKET',
    socket: {
      localAddress: '::1',
      localPort: 43952,
      remoteAddress: undefined,
      remotePort: undefined,
      remoteFamily: undefined,
      timeout: undefined,
      bytesWritten: 999,
      bytesRead: 542
    }
  }
}
TypeError: fetch failed
    at Object.fetch (node:internal/deps/undici/undici:11576:11)
    at async invokeRequest (/app/node_modules/next/dist/server/lib/server-ipc/invoke-request.js:17:12)
    at async invokeRender (/app/node_modules/next/dist/server/lib/router-server.js:254:29)
    at async handleRequest (/app/node_modules/next/dist/server/lib/router-server.js:447:24)
    at async requestHandler (/app/node_modules/next/dist/server/lib/router-server.js:464:13)
    at async Server.<anonymous> (/app/node_modules/next/dist/server/lib/start-server.js:117:13) {
  cause: Error: read ECONNRESET
      at TCP.onStreamRead (node:internal/stream_base_commons:217:20) {
    errno: -104,
    code: 'ECONNRESET',
    syscall: 'read'
  }
}

I also tried setting HOSTNAME "0.0.0.0" in my Dockerfile but that does not help. It's also odd the bootup message still says: - ready started server on [::]:3000, url: http://localhost:3000

Oddities

I just forwarded a tunnel to my container, and the root page, which is what I was using for probes previously, loaded quickly. But when I tried to hit my new /api/health-check it took a long time, so I opened another page to it. That one eventually loaded, and the first request gave me an ERR_CONNECTION_RESET in Chrome. Oddly, nothing related to this seems to have shown up in the logs 🤔

It seems that these requests are majorly lagging or failing now. But if I access the app from the domain on the internet, everything loads quickly and smoothly, oddly... 😔

The first request on the tunnel to a page goes through quickly, but then any subsequent request lags horribly or times out 😬 Utilizing curl on the container via localhost seems to consistently work as well.

@github-actions
Contributor

This closed issue has been automatically locked because it had no new activity for 2 weeks. If you are running into a similar issue, please create a new issue with the steps to reproduce. Thank you.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 22, 2023