Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply error-checking to sort, sort_natural, where, uniq, map, compact filter(s) #1059

Merged
merged 1 commit into from
Feb 22, 2019

Conversation

itsGarland
Copy link
Contributor

@itsGarland itsGarland commented Jan 16, 2019

Some types (for example Arrays and Integers) cannot be indexed using strings. Previously this would raise a TypeError when attempted. Now we catch that error and return nil instead. Original issue was with map but the same issue was presented in many of the other filters. I also inlined the where implementation because the rescue block was no longer necessary in the class method.

  • Created get_property method that raises ArgumentError if property hash call raises TypeError

@itsGarland itsGarland requested review from Thibaut and pushrax January 16, 2019 19:02
begin
r = e[property]
r.is_a?(Proc) ? r.call : r
rescue TypeError
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I wonder if there's another way to catch this that doesn't involve a rescue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I discussed this with Sam, and e[property] must be called in order to verify its own existence. Otherwise, checking if e has attribute property beforehand requires knowledge of property type (e.g.; if e is an Array and property is a string, then by calling e.[](property) we pass in a string argument, which causes an error).

Copy link
Contributor

@pushrax pushrax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a ton of places in here where user input is indexed by other user input. For instance, compact, uniq, etc.

I don't really want to add a wrapper method for this but it may be better than duplicating

rescue TypeError
# Cannot index with the given property type (eg. indexing integers with strings
# which are only allowed to be indexed by other integers).
raise ArgumentError.new("cannot select the property `#{property}`")
all over the place.

@Thibaut
Copy link
Contributor

Thibaut commented Jan 18, 2019

@pushrax good point. Let's add a wrapper method. Something like this:

get_property(item, property)

def get_property(item, property)
  item[property]
rescue TypeError 
  # Cannot index with the given property type (eg. indexing integers with strings 
  # which are only allowed to be indexed by other integers).
  raise ArgumentError.new("cannot select the property `#{property}`") 
end

Thoughts?

@itsGarland
Copy link
Contributor Author

itsGarland commented Jan 18, 2019

That seems to make sense. Although I noticed that the methods with array inputs often check if the first (or all) element(s) respond_to?(:[]). Would it be reasonable to just take that out entirely, and therefore attempt to get_property? Or would that be considered backwards incompatible and not allowed? If it's better to keep it as it is, would it be better to just standardize and return nil when a property does not exist then?

edit: @Thibaut forgot to tag you.
edit: @pushrax also forgot to tag you :)

@itsGarland
Copy link
Contributor Author

Update: I think raising an ArgumentError might cause a backwards compatibility issue because any template that incorrectly uses map will now break. Also I've noticed some methods such as sort ends up returning nil while other methods such as where prefer returning ArgumentError. Should we standardize a certain format?

@Thibaut @pushrax

@pushrax
Copy link
Contributor

pushrax commented Jan 23, 2019

Any input that can trigger a ruby (non-liquid) exception should be changed to a liquid exception, anything that would previously return nil should continue to return nil unless there's a great reason to change it.

@itsGarland itsGarland changed the title Apply error-checking to map filter Apply error-checking to sort, sort_natural, where, uniq, map, compact filter(s) Jan 23, 2019
@itsGarland
Copy link
Contributor Author

itsGarland commented Jan 23, 2019

some notes I just wanted to make:

  • There is a lack of standardizing between how certain functions return. That is fine as we would not want to make a breaking change by throwing ArgumentError in all circumstances.
  • There is an assumption in some filters that the first element is of the same type (or acts like) the rest of the elements in the input array; therefore assuming that the element can respond to [] implies the remaining elements do, too.

@pushrax when you get a chance please review my changes :)

if property == "to_liquid".freeze
ary
elsif ary.first.respond_to?(:[])
ary.map { |e| (e = get_property(e, property)).is_a?(Proc) ? e.call : e }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we implement this without a double map? This change is making the method slower.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Thibaut I was trying to mirror the format of the other methods. Is the time complexity that much different because it still runs linearly right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bigger problem here is that this allocates an extra array now. This should be done without the extra array.

@@ -160,9 +160,9 @@ def where(input, property, target_value = nil)
if ary.empty?
[]
elsif ary.first.respond_to?(:[]) && target_value.nil?
ary.where_present(property)
ary.select { |item| get_property(item, property) }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we move this logic into where / where_present instead of here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As-is, this PR deletes those methods, which I think is fine now that get_property contains the logic.

Copy link
Contributor

@samdoiron samdoiron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be a better way to handle the where logic.

Copy link
Contributor

@samdoiron samdoiron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accidental approval.

Copy link
Contributor

@samdoiron samdoiron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@itsGarland itsGarland force-pushed the map_error_checking branch 3 times, most recently from a0b70ff to 199e695 Compare January 25, 2019 21:47
@itsGarland
Copy link
Contributor Author

I've updated my PR to leave map filter as is. Any other comments @pushrax @Thibaut ?

@@ -160,9 +160,9 @@ def where(input, property, target_value = nil)
if ary.empty?
[]
elsif ary.first.respond_to?(:[]) && target_value.nil?
ary.where_present(property)
ary.select { |item| get_property(item, property) }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As-is, this PR deletes those methods, which I think is fine now that get_property contains the logic.

@@ -394,6 +394,12 @@ def default(input, default_value = ''.freeze)

private

def get_property(input, property)
input[property]
rescue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason you changed this from explicitly rescuing TypeError?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I just forgot. Thanks for catching that 👍

Copy link
Contributor

@pushrax pushrax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect performance will be decently higher for most arrays if the extra method call is not done on each iteration. The approach done for where and where_present before this PR avoided this. How messy would it be to factor the code in this way for all loops that access a property?

@itsGarland
Copy link
Contributor Author

itsGarland commented Jan 29, 2019

@pushrax with the exception of map,

# Remove nils within an array
# provide optional property with which to check for nil
def compact(input, property = nil)
  ary = InputIterator.new(input)

  if property.nil?
    ary.compact
  elsif ary.empty? # The next two cases assume a non-empty array.
    []
  elsif ary.first.respond_to?(:[])
    applesauce(property) { ary.reject{ |a| a[property].nil? } }
  end
end

def applesauce(property)
  yield
rescue TypeError
  raise Liquid::ArgumentError.new("Cannot select the property '#{property}'")
end

how does this sound?

@itsGarland
Copy link
Contributor Author

@pushrax when you get time please tell me what you think

@pushrax
Copy link
Contributor

pushrax commented Feb 5, 2019

With a benchmark 👍 to that idea.

@itsGarland itsGarland force-pushed the map_error_checking branch 2 times, most recently from 6ac0403 to 6f01eae Compare February 5, 2019 23:22
@itsGarland
Copy link
Contributor Author

@pushrax I've made the change. Please review when you're available.

Copy link
Contributor

@pushrax pushrax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost LGTM.

@pushrax
Copy link
Contributor

pushrax commented Feb 14, 2019

jli@tomato ~/s/g/S/liquid git:master ❯❯❯ bx ruby add_rescue.rb
Warming up --------------------------------------
   no safe_property_check     1.264M i/s -      1.283M times in 1.015032s (791.43ns/i)
outer safe_property_check     1.238M i/s -      1.241M times in 1.001754s (807.45ns/i)
inner safe_property_check   868.640k i/s -    897.890k times in 1.033673s (1.15μs/i)
Calculating -------------------------------------
   no safe_property_check     1.317M i/s -      3.791M times in 2.879269s (759.58ns/i)
outer safe_property_check     1.229M i/s -      3.715M times in 3.023683s (813.82ns/i)
inner safe_property_check   872.386k i/s -      2.606M times in 2.987118s (1.15μs/i)

Comparison:
   no safe_property_check:   1316517.8 i/s
outer safe_property_check:   1228770.3 i/s - 1.07x  slower
inner safe_property_check:    872386.0 i/s - 1.51x  slower
require 'benchmark_driver'

Benchmark.driver do |x|
  x.prelude <<~RUBY
    array = ["a"] * 10
    property = "foo"

    def safe_property_check(property)
      yield
    rescue TypeError
      raise
    end
  RUBY

  x.report 'no safe_property_check', <<~RUBY
    array.map { |x| x }
  RUBY

  x.report 'outer safe_property_check', <<~RUBY
    safe_property_check(property) do
      array.map { |x| x }
    end
  RUBY

  x.report 'inner safe_property_check', <<~RUBY
    array.map do |x|
      safe_property_check(property) do
        x
      end
    end
  RUBY
end

@itsGarland
Copy link
Contributor Author

itsGarland commented Feb 15, 2019

I played a bit with benchmark and inlining the rescue behaviour preserves current runtime:

Warming up --------------------------------------
   no safe_property_check     1.348M i/s -      1.396M times in 1.035210s (741.60ns/i)
outer safe_property_check     1.247M i/s -      1.268M times in 1.017346s (802.15ns/i)
inner safe_property_check   789.318k i/s -    820.575k times in 1.039600s (1.27μs/i)
Calculating -------------------------------------
   no safe_property_check     1.379M i/s -      4.045M times in 2.933226s (725.09ns/i)
outer safe_property_check     1.282M i/s -      3.740M times in 2.916987s (779.96ns/i)
inner safe_property_check   797.638k i/s -      2.368M times in 2.968707s (1.25μs/i)

Comparison:
   no safe_property_check:   1379136.1 i/s 
outer safe_property_check:   1282121.2 i/s - 1.08x  slower
inner safe_property_check:    797638.2 i/s - 1.73x  slower

OTT-Garland:test-folder itsgarland$ vi test-file.rb 
OTT-Garland:test-folder itsgarland$ ruby test-file.rb 
Warming up --------------------------------------
   no safe_property_check     1.340M i/s -      1.414M times in 1.054674s (746.14ns/i)
outer safe_property_check     1.270M i/s -      1.272M times in 1.001760s (787.54ns/i)
            inline rescue     1.405M i/s -      1.408M times in 1.002228s (711.58ns/i)
inner safe_property_check   795.751k i/s -    823.020k times in 1.034268s (1.26μs/i)
Calculating -------------------------------------
   no safe_property_check     1.381M i/s -      4.021M times in 2.912349s (724.34ns/i)
outer safe_property_check     1.297M i/s -      3.809M times in 2.935918s (770.72ns/i)
            inline rescue     1.380M i/s -      4.216M times in 3.055957s (724.86ns/i)
inner safe_property_check   799.202k i/s -      2.387M times in 2.987046s (1.25μs/i)

Comparison:
   no safe_property_check:   1380564.3 i/s 
            inline rescue:   1379582.2 i/s - 1.00x  slower
outer safe_property_check:   1297488.2 i/s - 1.06x  slower
inner safe_property_check:    799202.0 i/s - 1.73x  slower

of course that literally comes back to the original issue of duplicating logic everywhere. However since it is slightly slower using a method, would it just make sense to inline the rescue blocks anyway? From what I can tell, e[property] is only used as a return value, for comparison, or for type checking. I think then, there would unlikely be a future change in error handling (let alone functionality)...so the duplicated logic won't change that often. What are your thoughts?

edit: tagging @pushrax @Thibaut

@pushrax
Copy link
Contributor

pushrax commented Feb 15, 2019

Adding ~50ns overhead is probably okay for most real cases here, but it's not really much more code to do something like

def sort(input, property)
  ...
  elsif ary.all? { |el| el.respond_to?(:[]) }
    begin
      ary.sort { |a, b| nil_safe_compare(a[property], b[property]) }
    rescue TypeError
      raise_property_error(property)
    end
  end
end

Usually I wouldn't care so much, but I wouldn't be surprised if some Liquid templates use these filters in tight loops.

@itsGarland
Copy link
Contributor Author

@pushrax I rewrote it with just inlining the rescue blocks just to see what it looks like. It's messy but it does the job better since it performs faster..

does this look better or did I misunderstand your previous comment?

@pushrax
Copy link
Contributor

pushrax commented Feb 19, 2019

You could abstract the raise itself into a method, since that's the rare case.

@itsGarland
Copy link
Contributor Author

@pushrax

Comparison:
                            no check:   1469819.0 i/s 
                        inline raise:   1466718.7 i/s - 1.00x  slower
inline begin block but extract raise:   1455266.1 i/s - 1.01x  slower

This is a lot better than before 👍.

@itsGarland itsGarland force-pushed the map_error_checking branch 4 times, most recently from 11b53a5 to ca1f972 Compare February 21, 2019 18:48

Verified

This commit was signed with the committer’s verified signature.
bricker Bryan Ricker
@itsGarland
Copy link
Contributor Author

eZ Pz.

@itsGarland itsGarland merged commit 7a81fb8 into master Feb 22, 2019
@Thibaut Thibaut deleted the map_error_checking branch February 22, 2019 16:01
@shopify-shipit shopify-shipit bot temporarily deployed to rubygems-4-0-stable January 11, 2023 15:52 Inactive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants