Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix calculation of SQL runtime #50767

Merged
merged 1 commit into from Jan 16, 2024

Conversation

rporrasluc
Copy link
Contributor

Motivation / Background

This Pull Request has been created because this other PR seems to have introduced a bug in the db_runtime emitted as part of sql.active_record notification.

The time is calculated using ActiveSupport::Notifications.monotonic_subscribe which uses Process.clock_gettime(Process::CLOCK_MONOTONIC) with the default unit (seconds).

This being run in a Rails console prints 1.

ActiveSupport::Notifications.monotonic_subscribe("test_event") do |_name, start, finish, _id, payload|
  puts finish - start
end

ActiveSupport::Notifications.instrument("test_event") do
  sleep 1.second
end

Also, when an ActiveSupport::Notifications::Event is used, it needs to multiply start and ending by 1_000.0 same as we are proposing here.

Checklist

Before submitting the PR make sure the following are checked:

  • This Pull Request is related to one change. Changes that are unrelated should be opened in separate PRs.
  • Commit message has a detailed description of what changed and why. If this PR fixes a related issue include it in the commit message. Ex: [Fix #issue-number]
  • Tests are added or updated if you fix a bug or add a feature.
  • CHANGELOG files are updated for the changed libraries if there is a behavior change or additional feature. Minor bug fixes and documentation changes should not be included.

@byroot
Copy link
Member

byroot commented Jan 16, 2024

cc @fatkodima

@byroot byroot merged commit 758e6e1 into rails:main Jan 16, 2024
4 checks passed
byroot added a commit that referenced this pull request Jan 16, 2024
@fatkodima
Copy link
Member

So this just reverts the change made in #50493.
Tested again with this change, I got negative numbers for async queries again:

Completed 200 OK in 1069ms (Views: -7227.3ms | ActiveRecord: 1066614.0ms | Allocations: 5634)

@byroot
Copy link
Member

byroot commented Jan 16, 2024

Yes, but the bug must be elsewhere.

@rporrasluc
Copy link
Contributor Author

@fatkodima if you can provide a sample repository where I can investigate the misbehaviour I am willing to help. If you check the referenced code, seconds is what we receive from the monotonic subscribe, you can test that in a new Rails app too.

Our application was showing the contrary behaviour.

Using the changes in this PR, DB time displayed from the queries matched the total time for ActiveRecord. Without these changes, DB time displayed was several milliseconds, however ActiveRecord would always display 0.0ms because it was missing the multiplication by 1000.

@fatkodima
Copy link
Member

https://github.com/fatkodima/test_app

When hit localhost:3000, it shows something like

Completed 200 OK in 87ms (Views: -575.3ms | ActiveRecord: 1710.0ms | Allocations: 24762)

@rporrasluc
Copy link
Contributor Author

https://github.com/fatkodima/test_app

When hit localhost:3000, it shows something like

Completed 200 OK in 87ms (Views: -575.3ms | ActiveRecord: 1710.0ms | Allocations: 24762)

When async is not used, the numbers are correct.

  User Load (0.5ms)  SELECT "users".* FROM "users"
   app/views/users/index.html.erb:2
  Rendered users/index.html.erb within layouts/application (Duration: 1.1ms | Allocations: 858)
  Rendered layout layouts/application.html.erb (Duration: 4.0ms | Allocations: 2663)
Completed 200 OK in 6ms (Views: 4.0ms | ActiveRecord: 0.5ms | Allocations: 4297)

So the PR you previously proposed breaks sync in order to fix the async. So, in my opinion, the fix does not belong here. I am going to investigate where the wrong value comes from and will hopefully have a fix ready soon.

@rporrasluc
Copy link
Contributor Author

@byroot I think I have found out the origin of the issue, but I can't figure out an easy fix for it. I'll explain although I think it's simple:

  • Sync queries: They use an ActiveSupport::Notifications::Instrumenter whose notifier is an ActiveSupport::Notifications::Fanout. When instrument is called a Fanout::Handle which uses iterate_guarding_exceptions and for each group of listeners will call start. Regardless of monotonic or not monotonic subscribe, Process.clock_gettime(Process::CLOCK_MONOTONIC) and Time.now return seconds when we calculate the diff between them.
  • Async queries: They use ActiveRecord::FutureResult and build ActiveSupport::Notifications::Event events. event.record is called eventually calling now that in this case uses Process.clock_gettime(Process::CLOCK_MONOTONIC, :float_millisecond). The events are latter published using ActiveSupport::Notifications.publish_event.

I think the issue is not local to this, but broader due to the existing inconsistency in the unit used to measure time. Other example that makes me thing something is not too right.

Given this script.

republished = false

ActiveSupport::Notifications.monotonic_subscribe("test") do |name, start, finish, id, payload|
  puts "subscriber using start and finish ===="
  puts start
  puts finish
end

ActiveSupport::Notifications.monotonic_subscribe("test") do |event|
  puts "subscriber using event ===="
  puts event.time
  puts event.end

  republished_was = republished
  republished = true

  unless republished_was
    puts "Republishing..."
    ActiveSupport::Notifications.publish_event(event) unless republished_was
  end
end

ActiveSupport::Notifications.instrument("test") do
  sleep(1)
end

This is the output:

subscriber using start and finish ====
4108.368805
4109.369614
subscriber using event ====
4108368.826
4109369.962
Republishing...
subscriber using start and finish ====
4108368.826
4109369.962
subscriber using event ====
4108368.826
4109369.962

@byroot
Copy link
Member

byroot commented Jan 17, 2024

They use ActiveRecord::FutureResult and build ActiveSupport::Notifications::Event events. event.record is called eventually calling now that in this case uses Process.clock_gettime(Process::CLOCK_MONOTONIC, :float_millisecond)

Good catch. Should be easy to fix.

@fatkodima
Copy link
Member

I came up with the following solution, which seems like works

diff --git a/activerecord/lib/active_record/runtime_registry.rb b/activerecord/lib/active_record/runtime_registry.rb
index 408b36c3aa..03742b3ba4 100644
--- a/activerecord/lib/active_record/runtime_registry.rb
+++ b/activerecord/lib/active_record/runtime_registry.rb
@@ -34,7 +34,7 @@ def reset
 end

 ActiveSupport::Notifications.monotonic_subscribe("sql.active_record") do |name, start, finish, id, payload|
-  runtime = (finish - start) * 1_000.0
+  runtime = finish - start

   if payload[:async]
     ActiveRecord::RuntimeRegistry.async_sql_runtime += (runtime - payload[:lock_wait])
diff --git a/activesupport/lib/active_support/notifications/fanout.rb b/activesupport/lib/active_support/notifications/fanout.rb
index af70faf407..0a3299ad91 100644
--- a/activesupport/lib/active_support/notifications/fanout.rb
+++ b/activesupport/lib/active_support/notifications/fanout.rb
@@ -139,14 +139,14 @@ def finish(name, id, payload)
       class MonotonicTimedGroup < BaseTimeGroup # :nodoc:
         private
           def now
-            Process.clock_gettime(Process::CLOCK_MONOTONIC)
+            Process.clock_gettime(Process::CLOCK_MONOTONIC, :float_millisecond)
           end
       end

       class TimedGroup < BaseTimeGroup # :nodoc:
         private
           def now
-            Time.now
+            Time.now.to_f * 1000
           end
       end

So we use milliseconds everywhere. I am not very familiar with AS::Notifications internals, so maybe this is not a correct/best fix.
@byroot wdyt?

@rporrasluc
Copy link
Contributor Author

rporrasluc commented Jan 17, 2024

I came up with the following solution, which seems like works

I have been in conversation with @byroot and we agree on unifying units as best solution. Said that, Rails documentation would need to be amended if we do so as this would be a major change (started/finished are supposed to be time objects when subscribe is used), so probably it's better to change the other place (Event) not to use milliseconds but seconds.

@byroot
Copy link
Member

byroot commented Jan 17, 2024

as this would be a major change

The key is to unify unit internally and to do conversions as needed to keep public interfaces unchanged.

casperisfine pushed a commit to Shopify/rails that referenced this pull request Jan 17, 2024
Ref: rails#43502
Fix: rails#50767
Fix: rails#50493

When republishing a an event into a `start, finish` tuple, we need
to convert the timestamps back into seconds.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants