-
Notifications
You must be signed in to change notification settings - Fork 144
[TASK] Avoid poor scaling of array_search()
with very long arrays
#413
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[TASK] Avoid poor scaling of array_search()
with very long arrays
#413
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bartbutler, thanks for the optimization <3
Quite a complicated bit of code to understand, but I think I'm satisfied that it's functionally equivalent before and after this change, and that it indeed should now run in more like O(n) time than O(n2).
$aStack
is an odd-length array containing an alternating series of values and delimiters (perhaps operators). The order in which the delimiters are processed is important, as it represents precedence, so the outer loop is necessary.
I wondered whether the inner loop could start at position 1 and step by 2, so it only looked at delimiters. In that case it would need to copy the previous value and the delimiter to the new stack, and special-case the final delimiter to also copy the final value (if the delimiter doesn't match). Might be slightly more optimal and marginally clearer.
Otherwise, ++
can be used instead of += 1
.
Also, I think this change is worthy of a changelog entry.
src/Value/Value.php
Outdated
if (count($aStack) === 1) { | ||
return $aStack[0]; | ||
} | ||
$iStartPosition = null; | ||
while (($iStartPosition = array_search($sDelimiter, $aStack, true)) !== false) { | ||
$aNewStack = []; | ||
$aStackLength = count($aStack); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
$aStackLength
can be initialized before and used in the preceding early-return test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And maybe should be named $iStackLength
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, good catch. I had forgotten the variable naming scheme.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That variable naming scheme may be changed in future - #450 (comment) - though until such time it's best to maintain consistency.
src/Value/Value.php
Outdated
$iStackLength = count($aStack); | ||
foreach ($aListDelimiters as $sDelimiter) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think $iStackLength
needs to be set at the start of the outer loop, not before it. The stack is updated at the end of the outer loop with $aStack = $aNewStack
, and may now be shorter for the next iteration if some elements could be joined for the specific delimiter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, you're right. Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes look fine to me now.
@oliverklee, @sabberworm, do you have anything to add? Also wondered if there should be a CHANGELOG.md
entry - it's not a change in functioality, just a performance improvement.
I don't have anything to add. And yes, I a changelog entry would be good for this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @bartbutler,
Thanks for making the suggested code changes.
Would you also be able to add an entry to CHANGELOG.md
(in the root directory), briefly describing the change. It could probably either be in the 'Fixed' or 'Changed' sections, whichever you think is more appropraite. New entries should be added at the top of the list.
(I'm guessing the slowdown would have resulted from CSS property values with long comma-sparated lists, such as box-shadow
, or perhaps gradient fills, though would be intrigued to know what was in the 1.6Mb stylesheet specifically causing the slowdown, if indeed you do - I appreciate that using a profiler may not have provided that info.)
Co-authored-by: JakeQZ <jake@qzdesign.co.uk>
1484ee6
to
fd33a67
Compare
I thought that I had added the example as a test but I guess I reconsidered, probably because it came into my possession as an email so there were privacy implications. IIRC it was someone who used 1.6MB 1px elements attached to rgb() functions to recreate a small photo using pure CSS. |
array_search()
with very long arrays
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC it was someone who used 1.6MB 1px elements attached to rgb() functions to recreate a small photo using pure CSS.
Wow!
Changes all look good now. Thanks for contributing <3
…413) When there were many many elements in `$aStack`, starting the delimiter search from the beginning for each loop iteration was very slow. This is addressed by building a new array, rather than modifying `$aStack` in place, and iterating over it in a single pass. A particular 1.6M style string is now parsed in 5 seconds rather than 4 minutes.
(#413) Co-authored-by: Jake Hotson <jake.github@qzdesign.co.uk>
When there are many many elements in $aStack, starting the delimiter search from the beginning for each loop is very very slow. I addressed this by building a new array rather than modifying $aStack in place and iterating over it in a single pass. The particular 1.6M style string that was giving me trouble went from taking 4 minutes to parse to 5 seconds.