[BUG] System freezing after attachment download #1042

Open
jendib opened this issue Dec 29, 2021 · 8 comments
Labels
type: bug something is broken, we need to fix it

Comments

@jendib

jendib commented Dec 29, 2021

Fider Cloud or Self Hosted
Fider self-hosted 0.19.1

Describe the bug
Each time a user downloads an attachment through the static/images/attachments endpoint, every following HTTP request hangs and times out with a begin_error error in the logs (nothing else). The only way to recover is to restart the app.

I disabled SQL logs thinking it would be somehow linked but it didn't solve the issue.

@jendib jendib added the type: bug something is broken, we need to fix it label Dec 29, 2021
@goenning
Member

goenning commented Jan 7, 2022

  1. does it happen with all images or a specific one?
  2. can you reproduce it constantly or is it intermittent?

I've seen similar reports before, but I honestly don't know how to reproduce it. I've hosted Fider for 5 years and I've never seen that issue.

@jendib
Author

jendib commented Jan 7, 2022

Unfortunately it's quite random :( I have an automatic restart when it freezes and I got more than 50 restarts in a few months. I don't think my users are doing anything special, it's mostly PNG screenshots as attachments.

@Mateus-Romera

Hello, I'm having the same problem. It seems to be random and always happens when displaying images.
I added a LivenessProbe (k8s) so that the application is restarted automatically when this problem happens.

I can sometimes reproduce it by opening a specific topic with some PNG attachments and then refreshing the entire page with the cache disabled.
Even with LOG_LEVEL=debug, I don't get any useful logs:

    INFO [2022-05-05T21:56:10Z] [WEB] GET http://fider.x.y.z/ finished with 0 in 1993ms (begin_error)
    INFO [2022-05-05T21:56:08Z] [WEB] GET http://fider.x.y.z/ started
    INFO [2022-05-05T21:56:05Z] [WEB] GET http://fider.x.y.z/ finished with 0 in 1995ms (begin_error)
    INFO [2022-05-05T21:56:03Z] [WEB] GET http://fider.x.y.z/ started
    INFO [2022-05-05T21:56:00Z] [WEB] GET http://fider.x.y.z/ finished with 0 in 2000ms (begin_error)
    INFO [2022-05-05T21:55:58Z] [WEB] GET http://fider.x.y.z/ started
    INFO [2022-05-05T21:55:55Z] [WEB] GET http://fider.x.y.z/ finished with 0 in 1999ms (begin_error)
    INFO [2022-05-05T21:55:53Z] [WEB] GET http://fider.x.y.z/ started
    INFO [2022-05-05T21:55:50Z] [WEB] GET http://fider.x.y.z finished with 0 in 1986ms (begin_error)
    INFO [2022-05-05T21:55:48Z] [WEB] GET http://fider.x.y.z/ started
    INFO [2022-05-05T21:55:45Z] [WEB] GET http://fider.x.y.z/ finished with 0 in 1999ms (begin_error)
    INFO [2022-05-05T21:55:43Z] [WEB] GET http://fider.x.y.z/ started
    INFO [2022-05-05T21:55:43Z] [WEB] GET http://fider.x.y.z/ finished with 0 in 4999ms (begin_error)
    INFO [2022-05-05T21:55:40Z] [WEB] GET http://fider.x.y.z/ finished with 0 in 2000ms (begin_error)

@goenning
Member

When that happens, could you please check if there are any open transactions on Postgres? I believe this could be connection exhaustion, but it'd be useful to have more data.

@jendib
Author

jendib commented May 12, 2022

I'm almost sure it's not connection exhaustion because I have another app using the same database (so a shared max connections) and it doesn't have connection pool issues.

@goenning
Member

goenning commented May 13, 2022

Fider has its own connection pool, that's the one I think could be exhausted. You could also try increasing the pool to see if this reduces how often it happens

MaxOpenConns int `env:"DATABASE_MAX_OPEN_CONNS,default=4,strict"`

I suspect this could be the reason because I only use S3 to store attachments.
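
The exhaustion scenario suggested above can be illustrated with a small simulation. This is only a sketch under stated assumptions: it does not use Fider's code or `database/sql`, a buffered channel stands in for the connection pool, and the names `pool`, `tryRequest`, `simulate`, and the 50ms `wait` are hypothetical. It shows that once every slot of a 4-connection pool is held and never released, the next request can only time out, and that releasing a single slot is enough for requests to succeed again.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// wait is an arbitrary, illustrative per-request timeout.
const wait = 50 * time.Millisecond

// pool is a toy stand-in for a bounded connection pool. It is NOT Fider's
// code or database/sql; it only models "at most N connections in use".
type pool struct {
	slots chan struct{}
}

func newPool(maxOpen int) *pool {
	return &pool{slots: make(chan struct{}, maxOpen)}
}

// acquire takes a slot, or gives up when ctx expires.
func (p *pool) acquire(ctx context.Context) error {
	select {
	case p.slots <- struct{}{}:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// release hands one slot back to the pool.
func (p *pool) release() { <-p.slots }

// tryRequest models one request that needs a connection within the timeout.
// On success the slot stays held until release is called.
func tryRequest(p *pool, timeout time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return p.acquire(ctx)
}

// simulate holds `stuck` connections without releasing them, then reports
// whether one more request times out, and whether a request succeeds again
// after a single connection is released.
func simulate(maxOpen, stuck int) (timedOut, recovered bool) {
	p := newPool(maxOpen)
	for i := 0; i < stuck; i++ {
		_ = tryRequest(p, wait) // acquired and never released
	}
	timedOut = errors.Is(tryRequest(p, wait), context.DeadlineExceeded)
	p.release()
	recovered = tryRequest(p, wait) == nil
	return timedOut, recovered
}

func main() {
	timedOut, recovered := simulate(4, 4)
	fmt.Println("pool of 4, next request timed out:", timedOut, "recovered:", recovered)

	timedOut, _ = simulate(8, 4)
	fmt.Println("pool of 8, next request timed out:", timedOut)
}
```

In this model a larger pool only helps while fewer connections are stuck than the pool can hold; it says nothing about why a connection would stay held in the real application.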

@jendib
Author

jendib commented May 16, 2022

@goenning I managed to capture pg_stat_activity while the issue was happening: there are 4 open connections in the "idle in transaction" state, with the query SELECT provider_uid, provider FROM user_providers WHERE user_id = $1. I haven't changed the DATABASE_MAX_OPEN_CONNS env var yet, so this seems to confirm your idea.
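
For anyone wanting to repeat this check, here is a hedged Go sketch of the lookup. The `pg_stat_activity` view and its `pid`, `state`, `xact_start`, `query`, and `datname` columns are standard PostgreSQL; everything else (`session`, `listIdleInTx`, `poolExhausted`, the sample PIDs) is hypothetical and not taken from Fider. No database driver is imported, so `main` only prints the query and runs the pure helper on made-up data.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
)

// idleInTxQuery lists sessions of the current database that are sitting in
// an open transaction, oldest transaction first.
const idleInTxQuery = `
SELECT pid,
       COALESCE((now() - xact_start)::text, ''),
       COALESCE(query, '')
FROM pg_stat_activity
WHERE state = 'idle in transaction'
  AND datname = current_database()
ORDER BY xact_start`

// session is a hypothetical row type for the query above.
type session struct {
	PID     int
	OpenFor string // how long the transaction has been open
	Query   string // last statement run on that connection
}

// listIdleInTx runs the query on an already opened *sql.DB.
func listIdleInTx(ctx context.Context, db *sql.DB) ([]session, error) {
	rows, err := db.QueryContext(ctx, idleInTxQuery)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var out []session
	for rows.Next() {
		var s session
		if err := rows.Scan(&s.PID, &s.OpenFor, &s.Query); err != nil {
			return nil, err
		}
		out = append(out, s)
	}
	return out, rows.Err()
}

// poolExhausted reports whether the idle-in-transaction sessions alone
// account for every connection the pool is allowed to open.
func poolExhausted(sessions []session, maxOpenConns int) bool {
	return len(sessions) >= maxOpenConns
}

func main() {
	// Without a driver this only prints the query; with a real connection
	// you would call listIdleInTx(ctx, db) instead.
	fmt.Println(idleInTxQuery)

	// Made-up sample: 4 stuck sessions against a pool limit of 4.
	sample := []session{{PID: 101}, {PID: 102}, {PID: 103}, {PID: 104}}
	fmt.Println("pool exhausted:", poolExhausted(sample, 4))
}
```

If the view is queried from a different database than the one Fider uses, the `datname` filter would need to name Fider's database instead.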

@Mateus-Romera

> Fider has its own connection pool, that's the one I think could be exhausted. You could also try increasing the pool to see if this reduces how often it happens
>
> MaxOpenConns int `env:"DATABASE_MAX_OPEN_CONNS,default=4,strict"`
>
> I suspect this could be the reason because I only use S3 to store attachments.

Nice one!
I've always been able to reproduce this issue by refreshing a page containing 14 attachments with the cache disabled.
When I increased the DATABASE_MAX_OPEN_CONNS env var, the problem disappeared. But I will continue to monitor.

Thanks for now!
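
As a rough illustration of what such a setting controls, the sketch below reads DATABASE_MAX_OPEN_CONNS and applies it with `(*sql.DB).SetMaxOpenConns` from Go's standard library. It is not Fider's env loader: `maxOpenConnsFrom` and `applyPoolLimit` are hypothetical helpers, and falling back to the default on a bad value is an assumption made here (the `strict` option in the quoted struct tag may behave differently).

```go
package main

import (
	"database/sql"
	"fmt"
	"os"
	"strconv"
)

// defaultMaxOpenConns mirrors the default=4 quoted in the thread.
const defaultMaxOpenConns = 4

// maxOpenConnsFrom parses the raw env value and falls back to the default
// when it is empty, not a number, or not positive (an assumption of this
// sketch, not Fider's behavior).
func maxOpenConnsFrom(raw string) int {
	n, err := strconv.Atoi(raw)
	if err != nil || n <= 0 {
		return defaultMaxOpenConns
	}
	return n
}

// applyPoolLimit caps how many connections an opened *sql.DB may hold.
func applyPoolLimit(db *sql.DB, n int) {
	db.SetMaxOpenConns(n)
}

func main() {
	n := maxOpenConnsFrom(os.Getenv("DATABASE_MAX_OPEN_CONNS"))
	fmt.Println("pool limit:", n)
	// With a real connection: applyPoolLimit(db, n)
}
```

Running it with `DATABASE_MAX_OPEN_CONNS=16` set in the environment would report a limit of 16; with the variable unset it reports the default.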

3 participants