Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix slowness on last page (with ransack) #696

Closed
wants to merge 1 commit into from

Conversation

jules-w2
Copy link

@jules-w2 jules-w2 commented May 4, 2024

#config/initializers/pagy.rb
Pagy::DEFAULT[:items] = 50

On my example, i have 14553 entries and i want the entries of the last page ( page=292)

AwsEmail.where("created_at >= ? AND created_at <= ?", "2024-05-01 04:00:00", "2024-06-01 03:59:59").where(:event_type => "Delivery").count
# 14553

In my controller

def emails
    puts("Start emails")
    t0                      = Time.now
    @search                 = AwsEmail.select(:id).ransack(params[:q])
    @search.sorts           = "id desc" if @search.sorts.empty?
    @search.event_type_eq   = "Delivery"
    @search.created_at_gteq = (Date.today.at_beginning_of_day - 1.day).beginning_of_month.beginning_of_day
    @search.created_at_lteq = (Date.today.at_beginning_of_month.next_month - 1.day).end_of_day
    @pagy, @emails          = pagy(@search.result(:distinct => true))
    puts("emails size => #{@emails.to_a.size}")
    puts(Time.now - t0)
  end

With the current gem version (LIMIT 50 OFFSET 14550), it takes 60 seconds

Start emails
AwsEmail Count (645.8ms)  SELECT COUNT(*) FROM (SELECT DISTINCT `aws_emails`.`id` FROM `aws_emails` WHERE (`aws_emails`.`event_type` = 'Delivery' AND `aws_emails`.`created_at` >= '2024-05-01 04:00:00' AND `aws_emails`.`created_at` <= '2024-06-01 03:59:59') ORDER BY `aws_emails`.`id` DESC) subquery_for_count
↳ app/controllers/dashboard_controller.rb:48:in `emails'
AwsEmail Load (60024.6ms)  SELECT DISTINCT `aws_emails`.`id` FROM `aws_emails` WHERE (`aws_emails`.`event_type` = 'Delivery' AND `aws_emails`.`created_at` >= '2024-05-01 04:00:00' AND `aws_emails`.`created_at` <= '2024-06-01 03:59:59') ORDER BY `aws_emails`.`id` DESC LIMIT 50 OFFSET 14550
↳ app/controllers/dashboard_controller.rb:49:in `emails'
emails size => 3
60.684757

With my pull request (LIMIT 3 OFFSET 14550) it takes 1 seconde

Start emails
AwsEmail Count (536.9ms)  SELECT COUNT(*) FROM (SELECT DISTINCT `aws_emails`.`id` FROM `aws_emails` WHERE (`aws_emails`.`event_type` = 'Delivery' AND `aws_emails`.`created_at` >= '2024-05-01 04:00:00' AND `aws_emails`.`created_at` <= '2024-06-01 03:59:59') ORDER BY `aws_emails`.`id` DESC) subquery_for_count
↳ app/controllers/dashboard_controller.rb:48:in `emails'
AwsEmail Load (348.1ms)  SELECT DISTINCT `aws_emails`.`id` FROM `aws_emails` WHERE (`aws_emails`.`event_type` = 'Delivery' AND `aws_emails`.`created_at` >= '2024-05-01 04:00:00' AND `aws_emails`.`created_at` <= '2024-06-01 03:59:59') ORDER BY `aws_emails`.`id` DESC LIMIT 3 OFFSET 14550
↳ app/controllers/dashboard_controller.rb:49:in `emails'
emails size => 3
0.905018

I also noticed that when commenting @search.sorts, the intermediate select COUNT is different (without ORDER BY aws_emails.id DESC) and the query is fast with the current version of the gem.. but this order is important for the views.
so I don't know if this is another possible solution path...

def emails
    puts("Start emails")
    t0                      = Time.now
    @search                 = AwsEmail.select(:id).ransack(params[:q])
    # @search.sorts           = "id desc" if @search.sorts.empty?
    @search.event_type_eq   = "Delivery"
    @search.created_at_gteq = (Date.today.at_beginning_of_day - 1.day).beginning_of_month.beginning_of_day
    @search.created_at_lteq = (Date.today.at_beginning_of_month.next_month - 1.day).end_of_day
    @pagy, @emails          = pagy(@search.result(:distinct => true))
    puts("emails size => #{@emails.to_a.size}")
    puts(Time.now - t0)
  end

Start emails
AwsEmail Count (391.2ms)  SELECT COUNT(*) FROM (SELECT DISTINCT `aws_emails`.`id` FROM `aws_emails` WHERE (`aws_emails`.`event_type` = 'Delivery' AND `aws_emails`.`created_at` >= '2024-05-01 04:00:00' AND `aws_emails`.`created_at` <= '2024-06-01 03:59:59')) subquery_for_count
↳ app/controllers/dashboard_controller.rb:48:in `emails'
AwsEmail Load (340.6ms)  SELECT DISTINCT `aws_emails`.`id` FROM `aws_emails` WHERE (`aws_emails`.`event_type` = 'Delivery' AND `aws_emails`.`created_at` >= '2024-05-01 04:00:00' AND `aws_emails`.`created_at` <= '2024-06-01 03:59:59') LIMIT 50 OFFSET 14550
↳ app/controllers/dashboard_controller.rb:49:in `emails'
emails size => 3
0.752417

@ddnexus
Copy link
Owner

ddnexus commented May 4, 2024

Hi @jules-w2, interesting.

How long it takes to get page 1 and page last-1 with the original implementation?

Also, please try page 1 and last-1 with items: 3

@jules-w2
Copy link
Author

jules-w2 commented May 4, 2024

hey @ddnexus

With the original implementation , it takes less than 1 second in both cases

For the first page

Start emails
AwsEmail Count (542.7ms)  SELECT COUNT(*) FROM (SELECT DISTINCT `aws_emails`.`id` FROM `aws_emails` WHERE (`aws_emails`.`event_type` = 'Delivery' AND `aws_emails`.`created_at` >= '2024-05-01 04:00:00' AND `aws_emails`.`created_at` <= '2024-06-01 03:59:59') ORDER BY `aws_emails`.`id` DESC) subquery_for_count
↳ app/controllers/dashboard_controller.rb:49:in `emails'
AwsEmail Load (2.2ms)  SELECT DISTINCT `aws_emails`.`id` FROM `aws_emails` WHERE (`aws_emails`.`event_type` = 'Delivery' AND `aws_emails`.`created_at` >= '2024-05-01 04:00:00' AND `aws_emails`.`created_at` <= '2024-06-01 03:59:59') ORDER BY `aws_emails`.`id` DESC LIMIT 50 OFFSET 0
↳ app/controllers/dashboard_controller.rb:50:in `emails'
emails size => 50
0.590234

For the last-1 (291)

Start emails
AwsEmail Count (561.3ms)  SELECT COUNT(*) FROM (SELECT DISTINCT `aws_emails`.`id` FROM `aws_emails` WHERE (`aws_emails`.`event_type` = 'Delivery' AND `aws_emails`.`created_at` >= '2024-05-01 04:00:00' AND `aws_emails`.`created_at` <= '2024-06-01 03:59:59') ORDER BY `aws_emails`.`id` DESC) subquery_for_count
↳ app/controllers/dashboard_controller.rb:49:in `emails'
AwsEmail Load (343.1ms)  SELECT DISTINCT `aws_emails`.`id` FROM `aws_emails` WHERE (`aws_emails`.`event_type` = 'Delivery' AND `aws_emails`.`created_at` >= '2024-05-01 04:00:00' AND `aws_emails`.`created_at` <= '2024-06-01 03:59:59') ORDER BY `aws_emails`.`id` DESC LIMIT 50 OFFSET 14500
↳ app/controllers/dashboard_controller.rb:50:in `emails'
emails size => 50
0.912723

Test with ìtems:3 on Page 1 (less than a second)

Start emails
AwsEmail Count (535.1ms)  SELECT COUNT(*) FROM (SELECT DISTINCT `aws_emails`.`id` FROM `aws_emails` WHERE (`aws_emails`.`event_type` = 'Delivery' AND `aws_emails`.`created_at` >= '2024-05-01 04:00:00' AND `aws_emails`.`created_at` <= '2024-06-01 03:59:59') ORDER BY `aws_emails`.`id` DESC) subquery_for_count
↳ app/controllers/dashboard_controller.rb:49:in `emails'
AwsEmail Load (1.4ms)  SELECT DISTINCT `aws_emails`.`id` FROM `aws_emails` WHERE (`aws_emails`.`event_type` = 'Delivery' AND `aws_emails`.`created_at` >= '2024-05-01 04:00:00' AND `aws_emails`.`created_at` <= '2024-06-01 03:59:59') ORDER BY `aws_emails`.`id` DESC LIMIT 3 OFFSET 0
↳ app/controllers/dashboard_controller.rb:50:in `emails'
emails size => 3
0.568543

Test with ìtems:3 on last -1 page (less than a second)

Start emails
AwsEmail Count (569.7ms)  SELECT COUNT(*) FROM (SELECT DISTINCT `aws_emails`.`id` FROM `aws_emails` WHERE (`aws_emails`.`event_type` = 'Delivery' AND `aws_emails`.`created_at` >= '2024-05-01 04:00:00' AND `aws_emails`.`created_at` <= '2024-06-01 03:59:59') ORDER BY `aws_emails`.`id` DESC) subquery_for_count
↳ app/controllers/dashboard_controller.rb:49:in `emails'
AwsEmail Load (345.0ms)  SELECT DISTINCT `aws_emails`.`id` FROM `aws_emails` WHERE (`aws_emails`.`event_type` = 'Delivery' AND `aws_emails`.`created_at` >= '2024-05-01 04:00:00' AND `aws_emails`.`created_at` <= '2024-06-01 03:59:59') ORDER BY `aws_emails`.`id` DESC LIMIT 3 OFFSET 14547
↳ app/controllers/dashboard_controller.rb:50:in `emails'
emails size => 3
0.937168

Test with ìtems:3 on last page (less than a second because 14553 is divisible by 3)

 14553.divmod(3)
 => [4851, 0]
Start emails
AwsEmail Count (571.9ms)  SELECT COUNT(*) FROM (SELECT DISTINCT `aws_emails`.`id` FROM `aws_emails` WHERE (`aws_emails`.`event_type` = 'Delivery' AND `aws_emails`.`created_at` >= '2024-05-01 04:00:00' AND `aws_emails`.`created_at` <= '2024-06-01 03:59:59') ORDER BY `aws_emails`.`id` DESC) subquery_for_count
↳ app/controllers/dashboard_controller.rb:49:in `emails'
AwsEmail Load (346.4ms)  SELECT DISTINCT `aws_emails`.`id` FROM `aws_emails` WHERE (`aws_emails`.`event_type` = 'Delivery' AND `aws_emails`.`created_at` >= '2024-05-01 04:00:00' AND `aws_emails`.`created_at` <= '2024-06-01 03:59:59') ORDER BY `aws_emails`.`id` DESC LIMIT 3 OFFSET 14550
↳ app/controllers/dashboard_controller.rb:50:in `emails'
emails size => 3
0.937943

Slows are on the last page when the limit exceeds the total number

@ddnexus
Copy link
Owner

ddnexus commented May 4, 2024

Slows are on the last page when the limit exceeds the total number

That seems to be some specific quirk (that has nothing to do with pagy) maybe coming from some combination of factors.

Just guessing, but unless we can find that it is a general problem happening all the times for all the apps, I would rather avoid to add a condition for an external (pagy-unrelated) case that may potentially be happening... almost never.

If it's a very specific case, it would be more appropriate to add some note in the doc explaining when to override the method in that particular case.

Could you investigate whether it's the particular DB, driver, AR, ransack your models, the query or something else?

PS: I am also curious to know the time with items:3 last page, but a total record of 14554 or 14555

@ddnexus
Copy link
Owner

ddnexus commented May 5, 2024

I may have a solution that can be generally applied (no conditions) without overhead or maintenance burden, still solving your specific problem!

I've got something that passes all the tests. Let me think a bit more about any possible consequence of the changes...

ddnexus added a commit that referenced this pull request May 5, 2024
- improve the performance of the last page in
particular storage conditions (see #696)
@ddnexus
Copy link
Owner

ddnexus commented May 6, 2024

Implemented in 8.4.0. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants