Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactored and optimized rendering #1005

Merged
merged 1 commit into from
Apr 19, 2018
Merged

Refactored and optimized rendering #1005

merged 1 commit into from
Apr 19, 2018

Conversation

christopheraue
Copy link
Contributor

Measures:

  1. A while loop is faster than iterating with #each.
  2. Check string tokens first. String tokens are far more frequent
    than interrupt tokens. If a token is a string token, checking
    for an interrupt token can be avoided.
  3. String tokens just map to themselves and don't need the special
    treatment of BlockBody#render_node (except the resource limit
    check).

Benchmark

$ bundle exec rake benchmark:run

Before

Run 1)
              parse:     41.630  (± 0.0%) i/s -    420.000  in  10.089309s
             render:     75.962  (± 3.9%) i/s -    763.000  in  10.066823s
     parse & render:     25.497  (± 0.0%) i/s -    256.000  in  10.040862s

Run 2)
              parse:     42.130  (± 0.0%) i/s -    424.000  in  10.064738s
             render:     77.003  (± 1.3%) i/s -    777.000  in  10.093524s
     parse & render:     25.739  (± 0.0%) i/s -    258.000  in  10.024581s

Run 3)
              parse:     41.976  (± 2.4%) i/s -    420.000  in  10.021406s
             render:     76.184  (± 1.3%) i/s -    763.000  in  10.018104s
     parse & render:     25.641  (± 0.0%) i/s -    258.000  in  10.062549s

After

Run 1)
              parse:     42.198  (± 0.0%) i/s -    424.000  in  10.048737s
             render:     79.495  (± 2.5%) i/s -    798.000  in  10.042610s
     parse & render:     25.874  (± 3.9%) i/s -    260.000  in  10.053336s

Run 2)
              parse:     41.961  (± 0.0%) i/s -    420.000  in  10.009858s
             render:     78.623  (± 1.3%) i/s -    791.000  in  10.064043s
     parse & render:     25.927  (± 0.0%) i/s -    260.000  in  10.028472s

Run 3)
              parse:     42.321  (± 2.4%) i/s -    424.000  in  10.021159s
             render:     79.468  (± 2.5%) i/s -    798.000  in  10.048127s
     parse & render:     26.065  (± 0.0%) i/s -    262.000  in  10.053814s

Copy link
Contributor

@fw42 fw42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a few questions, but I don't have any strong objections here. @dylanahsmith wdyt?

# Break out if we have any unhanded interrupts.
idx = 0
while node = @nodelist[idx]
render_node_to_output(node, output, context)
break if context.interrupt?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. Not sure I understand why this break was at the beginning of the loop body before but now it's at the end of the loop body. Could you explain?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If an interrupt happens during render_node_to_output it seems unnecessary to wait for the next iteration of the loop to break out of it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could avoid this check entirely if token.is_a?(String) and we didn't extract code out into the render_node_to_output method. Perhaps we could instead move the exception handling into render_node to simplify this method without losing the ability to call break inside the case statement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. I'll try that.

def render_node_to_output(node, output, context)
case node
when String
node_output = node
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this extra variable necessary? Why not

check_resources(context, node)
output << node

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's noI necessary. I probably left it there because of the symmetry with the other branches. Should I change it?

context.push_interrupt(node.interrupt)
when Block
node_output = render_node(node, context)
output << node_output unless node.blank?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If node.blank? is true, did we actually have to call render_node? Or could we check that before?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

render_node may have side effects for block tags. At least the test suite lights up, if blank? it's checked before.

@fw42
Copy link
Contributor

fw42 commented Apr 18, 2018

Thanks for your contributions!

Copy link
Contributor

@dylanahsmith dylanahsmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optimizing rending of strings is a good idea.

# Interrupt is any command that stops block execution such as {% break %}
# or {% continue %}
context.push_interrupt(node.interrupt)
when Block
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we actually benefit from separating this out into its own case rather than doing

          node_output = render_node(token, context)

          unless token.is_a?(Block) && token.blank?
            output << node_output
          end

in the else case as we did before? It seems like the same amount of work is done, since the only difference is whether token.is_a?(Block) is checked by the when Block or by unless token.is_a?(Block).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was more "refactor" than "optimize". It's a little less DRY, but I find it cleaner and easier to read. It comes down to personal taste, I guess.

@nodelist.each do |token|
# Break out if we have any unhanded interrupts.
idx = 0
while node = @nodelist[idx]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A while loop is faster than iterating with #each.

This statement surprised me, so I created this microbenchmark

require 'benchmark'

N = 100000
LIST = (0..1000).to_a

Benchmark.bmbm do |x|
  x.report("while") do
    N.times do
      LIST.each do |item|
      end
    end
  end
  x.report("each") do
    N.times do
      idx = 0
      while item = LIST[idx]
        idx += 1
      end
    end
  end
end

the results were

            user     system      total        real
while   3.470000   0.010000   3.480000 (  3.480269)
each    2.230000   0.000000   2.230000 (  2.232083)

so I think this change might have made things worse, even if the overall benchmark improved from the other changes in this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When running your benchmark I get the same result.

But when running the liquid benchmark while appears to be faster than each. Here are three runs using each:

Run 1)
              parse:     41.466  (± 2.4%) i/s -    416.000  in  10.039060s
             render:     77.517  (± 3.9%) i/s -    777.000  in  10.036251s
     parse & render:     25.457  (± 3.9%) i/s -    256.000  in  10.060507s

Run 2)
              parse:     41.188  (± 0.0%) i/s -    412.000  in  10.004006s
             render:     77.819  (± 2.6%) i/s -    784.000  in  10.081885s
     parse & render:     25.772  (± 3.9%) i/s -    258.000  in  10.015315s

Run 3)
              parse:     41.479  (± 0.0%) i/s -    416.000  in  10.030267s
             render:     76.804  (± 6.5%) i/s -    770.000  in  10.071492s
     parse & render:     24.713  (± 4.0%) i/s -    248.000  in  10.059831s

Maybe one of you can check that too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. I tried with only the while loop change it did improve performance on the benchmark.

# Break out if we have any unhanded interrupts.
idx = 0
while node = @nodelist[idx]
render_node_to_output(node, output, context)
break if context.interrupt?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could avoid this check entirely if token.is_a?(String) and we didn't extract code out into the render_node_to_output method. Perhaps we could instead move the exception handling into render_node to simplify this method without losing the ability to call break inside the case statement.

Measures:
1) A while loop is faster than iterating with #each.
2) Check string, variable and block tokens first. They are far more
   frequent than interrupt tokens. In their case, checking for an
   interrupt can be avoided.
3) String tokens just map to themselves and don't need the special
   treatment of BlockBody#render_node (except the resource limit
   check).

Benchmark
=========

$ bundle exec rake benchmark:run

Before
------

Run 1)
              parse:     41.630  (± 0.0%) i/s -    420.000  in  10.089309s
             render:     75.962  (± 3.9%) i/s -    763.000  in  10.066823s
     parse & render:     25.497  (± 0.0%) i/s -    256.000  in  10.040862s

Run 2)
              parse:     42.130  (± 0.0%) i/s -    424.000  in  10.064738s
             render:     77.003  (± 1.3%) i/s -    777.000  in  10.093524s
     parse & render:     25.739  (± 0.0%) i/s -    258.000  in  10.024581s

Run 3)
              parse:     41.976  (± 2.4%) i/s -    420.000  in  10.021406s
             render:     76.184  (± 1.3%) i/s -    763.000  in  10.018104s
     parse & render:     25.641  (± 0.0%) i/s -    258.000  in  10.062549s

After
-----

Run 1)
              parse:     42.283  (± 0.0%) i/s -    424.000  in  10.028306s
             render:     83.158  (± 2.4%) i/s -    832.000  in  10.009201s
     parse & render:     26.417  (± 0.0%) i/s -    266.000  in  10.069718s

Run 2)
              parse:     41.159  (± 4.9%) i/s -    412.000  in  10.031297s
             render:     81.591  (± 3.7%) i/s -    816.000  in  10.018225s
     parse & render:     25.924  (± 3.9%) i/s -    260.000  in  10.035653s

Run 3)
              parse:     42.418  (± 2.4%) i/s -    424.000  in  10.003100s
             render:     84.183  (± 2.4%) i/s -    847.000  in  10.069781s
     parse & render:     26.726  (± 0.0%) i/s -    268.000  in  10.029857s
@christopheraue
Copy link
Contributor Author

Taking all comments into account I updated the code. The case statement and break are now directly inside the loop again. This already increased performance. I then also moved the handling of Variable and Block tokens to the top of the case statement. That gave the performance another push.

These are the new results:

Run 1)
              parse:     42.283  (± 0.0%) i/s -    424.000  in  10.028306s
             render:     83.158  (± 2.4%) i/s -    832.000  in  10.009201s
     parse & render:     26.417  (± 0.0%) i/s -    266.000  in  10.069718s

Run 2)
              parse:     41.159  (± 4.9%) i/s -    412.000  in  10.031297s
             render:     81.591  (± 3.7%) i/s -    816.000  in  10.018225s
     parse & render:     25.924  (± 3.9%) i/s -    260.000  in  10.035653s

Run 3)
              parse:     42.418  (± 2.4%) i/s -    424.000  in  10.003100s
             render:     84.183  (± 2.4%) i/s -    847.000  in  10.069781s
     parse & render:     26.726  (± 0.0%) i/s -    268.000  in  10.029857s

Copy link
Contributor

@fw42 fw42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome

@dylanahsmith dylanahsmith merged commit 5f0b64c into Shopify:master Apr 19, 2018
@christopheraue christopheraue deleted the render_refactor branch April 19, 2018 21:38
@shopify-shipit shopify-shipit bot temporarily deployed to rubygems-4-0-stable January 11, 2023 15:52 Inactive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants