gh-117349: Micro-optimize a few `os.path` functions #117350

nineteendo · 2024-03-28T20:49:18Z

Benchmarks

ntpath.py

script

# TODO: test isjunction() on Windows
echo "isreserved()" && python -m timeit -s "import before.ntpath" "before.ntpath.isreserved('con')" && python -m timeit -s "import after.ntpath" "after.ntpath.isreserved('con')"
echo "isreserved('.')" && python -m timeit -s "import before.ntpath" "before.ntpath.isreserved('.')" && python -m timeit -s "import after.ntpath" "after.ntpath.isreserved('.')"
echo "expanduser()" && python -m timeit -s "import before.ntpath" "before.ntpath.expanduser('~')" && python -m timeit -s "import after.ntpath" "after.ntpath.expanduser('~')"
echo "realpath()"; python -m timeit -s "import test" "test.realpath1('.')"; python -m timeit -s "import test" "test.realpath2('.')"
echo "realpath('nul')"; python -m timeit -s "import test" "test.realpath1('nul')"; python -m timeit -s "import test" "test.realpath2('nul')"

isreserved()
200000 loops, best of 5: 1.66 usec per loop # before
200000 loops, best of 5: 1.65 usec per loop # after
# -> no difference
isreserved('.')
200000 loops, best of 5: 1.31 usec per loop # before
500000 loops, best of 5: 966 nsec per loop # after
# -> 1.36x faster (for . & ..)
expanduser()
200000 loops, best of 5: 1.72 usec per loop # before
200000 loops, best of 5: 1.72 usec per loop # after
# -> no difference
realpath()
2000 loops, best of 5: 132 usec per loop # before
2000 loops, best of 5: 131 usec per loop # after
# -> no difference
realpath('nul')
200000 loops, best of 5: 1.26 usec per loop # before
500000 loops, best of 5: 907 nsec per loop # after
# -> 1.39x faster (for nul)

posixpath.py

script

# test.sh
echo "ismount()" && python -m timeit -s "import before.posixpath" "before.posixpath.ismount('/Volumes/2GB_001')" && python -m timeit -s "import after.posixpath" "after.posixpath.ismount('/Volumes/2GB_001')"
echo "expanduser()" && python -m timeit -s "import before.posixpath" "before.posixpath.expanduser('~')" && python -m timeit -s "import after.posixpath" "after.posixpath.expanduser('~')"
echo "expanduser(b'~root')" && python -m timeit -s "import before.posixpath" "before.posixpath.expanduser(b'~root')" && python -m timeit -s "import after.posixpath" "after.posixpath.expanduser(b'~root')"
echo "_normpath_fallback()" && python -m timeit -s "import before.posixpath" "before.posixpath._normpath_fallback('foo//bar')" && python -m timeit -s "import after.posixpath" "after.posixpath._normpath_fallback('foo//bar')"
echo "abspath()" && python -m timeit -s "import before.posixpath" "before.posixpath.abspath('foo')" && python -m timeit -s "import after.posixpath" "after.posixpath.abspath('foo')"
echo "abspath('/foo')" && python -m timeit -s "import before.posixpath" "before.posixpath.abspath('/foo')" && python -m timeit -s "import after.posixpath" "after.posixpath.abspath('/foo')"
echo "realpath()" && python -m timeit -s "import before.posixpath" "before.posixpath.realpath('foo/../../..')" && python -m timeit -s "import after.posixpath" "after.posixpath.realpath('foo/../../..')"

ismount()
10000 loops, best of 5: 20.3 usec per loop # before
10000 loops, best of 5: 19.3 usec per loop # after
# -> 1.05x faster
expanduser()
200000 loops, best of 5: 1.43 usec per loop # before
200000 loops, best of 5: 1.42 usec per loop # after
# -> no difference
expanduser(b'~root')
200000 loops, best of 5: 1.82 usec per loop # before
200000 loops, best of 5: 1.75 usec per loop # after
# -> 1.04x faster (for byte users)
_normpath_fallback()
200000 loops, best of 5: 1.07 usec per loop # before
500000 loops, best of 5: 953 nsec per loop # after
# -> 1.13x faster
abspath()
20000 loops, best of 5: 16.7 usec per loop # before
20000 loops, best of 5: 16.5 usec per loop # after
# -> no difference
abspath('/foo')
500000 loops, best of 5: 509 nsec per loop # before
500000 loops, best of 5: 423 nsec per loop # after
# -> 1.20x faster (for absolute paths)
realpath()
10000 loops, best of 5: 24.4 usec per loop # before
10000 loops, best of 5: 23.9 usec per loop # after
# -> 1.02x faster
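
The "Nx faster" annotations above are the ratio of the before and after timings. As a rough sketch of that arithmetic (the helper names `parse_timeit` and `speedup` are hypothetical, not part of the PR), the ratios can be recomputed from the raw `timeit` lines, including the cases where the two lines use different units:

```python
import re

# Scale factors for the units that `python -m timeit` prints.
_UNITS = {"nsec": 1e-9, "usec": 1e-6, "msec": 1e-3, "sec": 1.0}


def parse_timeit(line):
    """Return seconds per loop from a `python -m timeit` result line."""
    match = re.search(r"best of \d+: ([\d.]+) (nsec|usec|msec|sec) per loop", line)
    if match is None:
        raise ValueError(f"not a timeit result line: {line!r}")
    return float(match.group(1)) * _UNITS[match.group(2)]


def speedup(before, after):
    """Ratio of the 'before' time to the 'after' time (hypothetical helper)."""
    return parse_timeit(before) / parse_timeit(after)


# abspath('/foo') lines copied from the posixpath.py results above.
ratio = speedup(
    "500000 loops, best of 5: 509 nsec per loop",
    "500000 loops, best of 5: 423 nsec per loop",
)
print(f"{ratio:.2f}x faster")  # prints "1.20x faster"
```

Because the printed timings are already rounded, a ratio recomputed this way can differ in the last digit from one computed on unrounded measurements.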

Issue: Speed up os.path #117349

nineteendo · 2024-03-29T15:09:30Z

I believe that's everything. If you would like these changes to be split up into multiple pull requests, let me know.

Lib/genericpath.py

AlexWaygood

As @sobolevn says, it's very difficult to see here which changes are actually related to performance improvements, and which are simply cosmetic changes that have no impact on how the code works. Our general policy is not to accept cosmetic changes, but even if we decided that we wanted to make an exception here, any such stylistic/formatting improvements would have to go into their own PR, so that the git history clearly showed which changes were performance-related and which were cosmetic.

Please revert all changes that do not actually have any impact on performance.

bedevere-app · 2024-03-29T17:13:19Z

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

Lib/ntpath.py

nineteendo · 2024-03-29T17:31:07Z

> Please revert all changes that do not actually have any impact on performance.

Does that include unnesting?

nineteendo · 2024-04-01T18:15:17Z

Is there anything you still expect from me? Or can this finally be merged? I've listened to basically all the feedback.
If you want to wait for a benchmark from me, that's understandable of course.

nineteendo · 2024-04-01T18:24:13Z

The speed improvements seem way less impressive now. ;(
And that's even without running my benchmark.

nineteendo · 2024-04-01T18:31:47Z

Are you happy now?

Lib/ntpath.py

AlexWaygood · 2024-04-02T09:55:42Z

I re-ran your benchmarks locally. With the latest version of your PR, I'm getting a reasonable slowdown on posixpath.ismount() from your PR branch.

On main:

(main) % ./python.exe -m timeit -s "import posixpath" "posixpath.ismount('/Volumes/2GB_001')"
500000 loops, best of 5: 602 nsec per loop
(main) % ./python.exe -m timeit -s "import posixpath" "posixpath.ismount('/Volumes/2GB_001')"
500000 loops, best of 5: 603 nsec per loop
(main) % ./python.exe -m timeit -s "import posixpath" "posixpath.ismount('/Volumes/2GB_001')"
500000 loops, best of 5: 607 nsec per loop
(main) % ./python.exe -m timeit -s "import posixpath" "posixpath.ismount('/Volumes/2GB_001')"
500000 loops, best of 5: 601 nsec per loop

With your PR branch:

(speedup-os.path) % ./python.exe -m timeit -s "import posixpath" "posixpath.ismount('/Volumes/2GB_001')"
500000 loops, best of 5: 637 nsec per loop
(speedup-os.path) % ./python.exe -m timeit -s "import posixpath" "posixpath.ismount('/Volumes/2GB_001')"
500000 loops, best of 5: 650 nsec per loop
(speedup-os.path) % ./python.exe -m timeit -s "import posixpath" "posixpath.ismount('/Volumes/2GB_001')"
500000 loops, best of 5: 646 nsec per loop
(speedup-os.path) % ./python.exe -m timeit -s "import posixpath" "posixpath.ismount('/Volumes/2GB_001')"
500000 loops, best of 5: 643 nsec per loop

nineteendo · 2024-04-02T10:21:38Z

> With the latest version of your PR, I'm getting a reasonable slowdown on posixpath.ismount() from your PR branch.

You don't have a usb stick "2GB_001" plugged in, so the function will return False immediately after s1 = os.lstat(path).
I believe something has been sped up on the main branch, so I just synced my branch.
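
To illustrate the point about the missing volume, here is a minimal sketch (the temporary directory and the `missing` name are stand-ins chosen for illustration, not the paths used in the benchmarks). When the path does not exist, `os.lstat()` raises `OSError` and `posixpath.ismount()` returns `False` without doing the parent-directory comparison, so timing it measures only the early exit:

```python
import os
import posixpath
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for '/Volumes/2GB_001' on a machine without that USB stick.
    missing = os.path.join(tmp, "2GB_001")

    # os.lstat(missing) fails, so ismount() bails out early with False.
    result = posixpath.ismount(missing)

print(result)  # prints "False"
```

A benchmark of `ismount()` is therefore only comparable across machines if the path exists on each of them.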

AlexWaygood · 2024-04-02T10:26:35Z

> You don't have a usb stick "2GB_001" plugged in, so the function will return False immediately after s1 = os.lstat(path).

heh, that makes sense

AlexWaygood

LGTM

nineteendo · 2024-04-02T11:35:26Z

I've updated the benchmarks. Sadly, the only noticeable speedup for regular users is calling posixpath.abspath() with an absolute path.
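
For anyone who wants to reproduce the `abspath()` comparison from Python instead of the shell, a minimal sketch follows. It assumes the installed `posixpath` is the version under test, and `best_usec` is a hypothetical helper, not something from the PR. No expected timings are shown because they depend on the machine and interpreter build:

```python
import posixpath
import timeit


def best_usec(stmt, number=2000, repeat=5):
    """Best per-call time in microseconds, similar to timeit's 'best of 5'."""
    runs = timeit.repeat(
        stmt, globals={"posixpath": posixpath}, number=number, repeat=repeat
    )
    return min(runs) / number * 1e6


# A relative path needs the current working directory; an absolute one does not.
relative = best_usec("posixpath.abspath('foo')")
absolute = best_usec("posixpath.abspath('/foo')")

print(f"abspath('foo'):  {relative:.3f} usec per call")
print(f"abspath('/foo'): {absolute:.3f} usec per call")
```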

AlexWaygood · 2024-04-02T11:44:47Z

> I've updated the benchmarks. Sadly, the only noticeable speedup for regular users is calling posixpath.abspath() with an absolute path.

Don't be too disheartened. As a result of your efforts here, a conversation has been started about possible ways of optimising str.startswith and other string methods. If one or more of those ideas comes to fruition, that will be very impactful for Python users.

Optimising the stdlib is hard! I tried out many things that ultimately didn't work when I was working on #74690. And there were several things that did work, but which I never created PRs for, as they would have made the code too ugly or too fragile.

My advice for the future, however, would definitely be to work on small, focused PRs that have isolated, easily measurable changes. PRs like that are much easier for us to review, and you should find the contributing experience less frustrating as a result.

nineteendo · 2024-04-02T18:42:05Z

Can this be merged, or are we still waiting on something?

AlexWaygood · 2024-04-02T19:15:33Z

> Can this be merged, or are we still waiting on something?

It can be merged, I just wanted to wait a little bit longer to give the other reviewers time to chime in on the final version of the PR, if they wanted to. I'll merge tomorrow or the day after if there are no further objections from anybody.

Lib/ntpath.py

Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>

… into speedup-os.path

AlexWaygood · 2024-04-02T20:33:24Z

Thanks @nineteendo

serhiy-storchaka

LGTM.

Lib/ntpath.py

Lib/posixpath.py

erlend-aasland · 2024-04-03T12:25:23Z

> I believe it should indeed be possible to speed up str.startswith(prefix), but not str.startswith((prefix1, prefix2)), because there's no guarantee that the prefixes have the same length. I'll revert the optimisations for the first case.

Both cases were significantly improved by #117466:

# pre optimisation
$ ./python.exe -m timeit -s "s = 'abcdef'" "s.startswith(('abc', 'de'))"
5000000 loops, best of 5: 89 nsec per loop

# post optimisation
$ ./python.exe -m timeit -s "s = 'abcdef'" "s.startswith(('abc', 'de'))"
10000000 loops, best of 5: 26.7 nsec per loop

I think we should be careful about micro-optimising Python code, like this PR. Instead, it would be better to keep the code as idiomatic as possible, and instead optimise the Python interpreter for the idiomatic cases.
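
As a small sketch of the two spellings under discussion (the variable names and the chained alternative are illustrative assumptions, not code from the PR), the idiomatic tuple form and a hand-unrolled chain return the same result, so any difference between them is purely a matter of timing on a given interpreter:

```python
import timeit

s = "abcdef"

# Idiomatic form: a single call with a tuple of prefixes.
tuple_form = s.startswith(("abc", "de"))

# Hand-unrolled form of the kind a micro-optimisation might introduce.
chained_form = s.startswith("abc") or s.startswith("de")

same_result = tuple_form == chained_form

for label, stmt in [
    ("tuple", "s.startswith(('abc', 'de'))"),
    ("chained", "s.startswith('abc') or s.startswith('de')"),
]:
    best = min(timeit.repeat(stmt, globals={"s": s}, number=100_000, repeat=5))
    print(f"{label}: {best / 100_000 * 1e9:.1f} nsec per call")
```

Which form is faster depends on the interpreter version, which is the argument for keeping the idiomatic spelling and optimising the interpreter instead.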

Lib/ntpath.py

AlexWaygood · 2024-04-03T12:36:07Z

> I think we should be careful about micro-optimising Python code, like this PR. Instead, it would be better to keep the code as idiomatic as possible, and instead optimise the Python interpreter for the idiomatic cases.

I agree that we should neither accept micro-optimisations that make code significantly less idiomatic, nor changes that are purely cosmetic and have no impact on performance. I accepted the PR nonetheless because in the final iteration of the PR, all changes seemed to me to be small, localised changes that, as well as providing small speedups, mostly made the code more idiomatic, and in all cases (in my opinion) did not cause a significant deterioration in style.

nineteendo · 2024-04-03T17:21:33Z

> Both cases were significantly improved by #117466

I didn't expect the second case to be sped up as it was a sequence, so I asked for clarification: faster-cpython/ideas#671 (comment), but didn't get a response.

This speedup must have been added afterwards, which I missed.
I obviously reverted the regular str.startswith() & str.endswith().

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: Barney Gale <barney.gale@gmail.com>
Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>

nineteendo added 3 commits March 28, 2024 19:22

Speed up posixpath.ismount

c90a883

Speed up posixpath.expanduser

5984959

Speed up posixpath.normpath

833ddc9

bedevere-app bot mentioned this pull request Mar 28, 2024

Speed up os.path #117349

Closed

nineteendo mentioned this pull request Mar 28, 2024

gh-117201: Handle leading // for posixpath.commonpath #117202

Closed

eryksun requested a review from serhiy-storchaka March 28, 2024 20:54

Refactor posixpath.expandvars & ntpath.commonpath

a4d9fcb

nineteendo mentioned this pull request Mar 29, 2024

Improve os #117361

Closed

16 tasks

nineteendo and others added 9 commits March 29, 2024 10:56

Remove start- & endswith

b853d4d

Remove startswith

a8984dc

Remove isabs calls

78929b0

Rename result of splitroot

afc7bbc

Refactor os.path

f5bbaf6

fix unbound variable

e740f10

hardcode constants like documented

d02c726

📜🤖 Added by blurb_it.

b794897

Fix typo

c2c04bf

nineteendo marked this pull request as ready for review March 29, 2024 15:09

bedevere-app bot added the awaiting review label Mar 29, 2024

nineteendo mentioned this pull request Mar 29, 2024

Handle leading // for posixpath.realpath #117338

Closed

sobolevn requested a review from barneygale March 29, 2024 15:37

sobolevn reviewed Mar 29, 2024

Lib/genericpath.py (outdated, resolved)

AlexWaygood requested changes Mar 29, 2024

bedevere-app bot removed the awaiting review label Mar 29, 2024

bedevere-app bot added the awaiting changes label Mar 29, 2024

exclude stylistic-only changes

7c9dcae

barneygale reviewed Mar 29, 2024

Lib/ntpath.py (outdated, resolved)

Revert len() call

e637698

;(

b5fdd27

eendebakpt reviewed Apr 1, 2024

Lib/ntpath.py (resolved)

Merge branch 'main' into speedup-os.path

c923e49

AlexWaygood approved these changes Apr 2, 2024

eendebakpt approved these changes Apr 2, 2024

Lib/ntpath.py (outdated, resolved)

nineteendo and others added 2 commits April 2, 2024 21:46

Replace definition of colon

887f3c1

Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>

Merge branch 'speedup-os.path' of https://github.com/nineteendo/cpython…

e3ace2f

… into speedup-os.path

barneygale approved these changes Apr 2, 2024

AlexWaygood changed the title from "gh-117349: Speedup os.path" to "gh-117349: Micro-optimize a few os.path functions" Apr 2, 2024

AlexWaygood merged commit cae4cdd into python:main Apr 2, 2024
33 checks passed

bedevere-app bot removed the awaiting merge label Apr 2, 2024

nineteendo deleted the speedup-os.path branch April 2, 2024 20:34

serhiy-storchaka approved these changes Apr 2, 2024

Lib/ntpath.py (resolved)

Lib/posixpath.py (resolved)

erlend-aasland reviewed Apr 3, 2024

Lib/ntpath.py (resolved)
