Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize ImageStat.Stat.extrema #7593

Merged
merged 5 commits into from Dec 6, 2023

Conversation

florath
Copy link
Contributor

@florath florath commented Dec 1, 2023

The optimized function improves the performance. The original function always runs through the complete histogram of length 256 even if it is possible to exit the loop early (break).

Running performance tests improvements of factor >10 were measured depending on the images under test.

Changes proposed in this pull request:

  • Performance optimized ImageStat _getextrema function

The optimzed function improves the performance.  The original function
always runs through the complete historgram of length 256 even if it
is possible to exit the loop early (break).

Running some tests I found performance improvements of factor >10
depending on the image.

Signed-off-by: Andreas Florath <andreas@florath.net>
Signed-off-by: Andreas Florath <andreas@florath.net>
@florath
Copy link
Contributor Author

florath commented Dec 1, 2023

Hmmm.
The failing test case looks completely unrelated to the patch. Any idea?

@hugovk
Copy link
Member

hugovk commented Dec 1, 2023

I restarted the AppVeyor job, let's see how it goes. Also approved the GitHub Actions to run.

Please could you share the performance tests you ran and their results?

@radarhere
Copy link
Member

Yes, the test failures are also happening in main. I've created #7594 to fix them.

@florath
Copy link
Contributor Author

florath commented Dec 2, 2023

Please could you share the performance tests you ran and their results?

I created a virtualenv with pillow and added the new function side by side with the original, like:

    def _getextrema_orig(self):
        """Get min/max values for each band in the image"""

        def minmax(histogram):
            n = 255
            x = 0
            for i in range(256):
                if histogram[i]:
                    n = min(n, i)
                    x = max(x, i)
            return n, x  # returns (255, 0) if there's no data in the histogram

        v = []
        for i in range(0, len(self.h), 256):
            v.append(minmax(self.h[i:]))
        return v

    def _getextrema(self):
        """Get min/max values for each band in the image"""

        def minmax(histogram):
            res_min, res_max = 255, 0
            for i in range(256):
                if histogram[i]:
                    res_min = i
                    break
            for i in range(255, -1, -1):
                if histogram[i]:
                    res_max = i
                    break
            if res_max >= res_min:
                return res_min, res_max
            else:
                return (255, 0)

        v = []
        for i in range(0, len(self.h), 256):
            v.append(minmax(self.h[i:i+256]))
        return v

Then I used a collection of images (ImageNet) and run a script to compare the original and the optimized function. I adapted the script slightly so that you can run the test also on the Pillow test images (have to add the try/except block here because there are some broken images in the Pillow test image directory). BTW: this also checks if the original and optimized versions return the same results.

import pathlib
import timeit
from PIL import Image, ImageStat

IMAGEDIR="../Pillow/Tests/images"

testdir = pathlib.Path(IMAGEDIR)

NUMBER=1000
REPEAT=10

for image_file_name in testdir.rglob("*"):
    # Skip broken images
    try:
        img = Image.open(image_file_name)
        stat = ImageStat.Stat(img)
    except Exception as ex:
        continue

    # Check for correctness
    res_orig = stat._getextrema_orig()
    res_opt = stat._getextrema()
    assert res_orig == res_opt

    # Measure improvement factor
    exec_times_orig = timeit.repeat(
        stmt=stat._getextrema_orig, repeat=REPEAT, number=NUMBER)
    exec_times_opt = timeit.repeat(
        stmt=stat._getextrema, repeat=REPEAT, number=NUMBER)

    print("%10.4f - %s" % (
        min(exec_times_orig) / min(exec_times_opt), image_file_name))

For the ImageNet about 80-90% of the images had speedup factor of more than 10. A lot even more than factor 40.
For the Pillow test images I get the following output (partial).

    3.3938 - ../Pillow/Tests/images/itxt_chunks.png
   32.8581 - ../Pillow/Tests/images/tiff_wrong_bits_per_sample_3.tiff
   49.2568 - ../Pillow/Tests/images/hopper.iccprofile.tif
   24.9033 - ../Pillow/Tests/images/mmap_error.bmp
   47.1822 - ../Pillow/Tests/images/hopper.dds
   12.4054 - ../Pillow/Tests/images/uncompressed_rgb.png
    3.3469 - ../Pillow/Tests/images/imagedraw_polygon_kite_L.png
   12.8065 - ../Pillow/Tests/images/colr_bungee.png
    1.8857 - ../Pillow/Tests/images/imagedraw_rounded_rectangle_corners_yyny.png
   33.7577 - ../Pillow/Tests/images/clipboard_target.png
   11.1593 - ../Pillow/Tests/images/bc5s.png
   47.2684 - ../Pillow/Tests/images/hopper.sgi
   29.2110 - ../Pillow/Tests/images/pil123rgba.qoi
    5.5825 - ../Pillow/Tests/images/imagedraw_polygon_kite_RGB.png
   22.7400 - ../Pillow/Tests/images/dispose_bgnd_transparency.gif
    9.0826 - ../Pillow/Tests/images/palette_sepia.png
    1.8943 - ../Pillow/Tests/images/imagedraw_rounded_rectangle_corners_ynyn.png
   45.2440 - ../Pillow/Tests/images/balloon.jpf
    1.0019 - ../Pillow/Tests/images/no_palette_with_transparency.gif
   27.4538 - ../Pillow/Tests/images/imagedraw2_text.png
   47.9036 - ../Pillow/Tests/images/test_anchor_multiline_mm_right.png
   47.4315 - ../Pillow/Tests/images/hopper_orientation_2.webp

Please note that the first number is the speedup-factor: the factor how much faster the proposed function is measured against the original.

src/PIL/ImageStat.py Outdated Show resolved Hide resolved
Simplification of return statement

Co-authored-by: Andrew Murray <3112309+radarhere@users.noreply.github.com>
@florath
Copy link
Contributor Author

florath commented Dec 4, 2023

Thanks a lot for the hint!
In deed, the additional check is not needed. (I tested some versions of the function during the optimization process and an earlier needed the check - but this one can just return the values.)


v = []
for i in range(0, len(self.h), 256):
v.append(minmax(self.h[i:]))
v.append(minmax(self.h[i : i + 256]))
return v
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can also use list comprehension

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My idea was to have a minimal patch set.

If you want, we could go with

return [minmax(self.h[i:i+256]) for i in range(0, len(self.h), 256)]

My tests showed that from the performance point of view the two solutions are mostly the same. But if my other patch gets accepted #7599 the list comprehension might be the way to go to have the same style.

What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#7597 has actually just been merged to use list comprehension here. Would you like to resolve the conflict?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Member

@hugovk hugovk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@hugovk
Copy link
Member

hugovk commented Dec 5, 2023

Let's also mention this and the other one in the release notes (there'll be a template once #6486 is merged). Can be in a followup PR.

@radarhere radarhere merged commit e9afaee into python-pillow:main Dec 6, 2023
56 checks passed
@radarhere radarhere changed the title Optimize ImageStat.Stat._getextrema function Optimize ImageStat.Stat.extrema Dec 6, 2023
radarhere added a commit to radarhere/Pillow that referenced this pull request Dec 6, 2023
@radarhere
Copy link
Member

I've created #7605 to include this in the release notes.

radarhere added a commit to radarhere/Pillow that referenced this pull request Dec 7, 2023
radarhere added a commit to radarhere/Pillow that referenced this pull request Dec 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants