Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature optimize node scope resolver test #3440

Merged

Conversation

schlndh
Copy link
Contributor

@schlndh schlndh commented Sep 14, 2024

This is a different take on #3163. I fixed the main issue you mentioned: You can see all the issues from one file at once (instead of just the first one).

My motivation for this change is to speed up make tests. On my laptop with 8C/16T the results of time make tests are as follows:

1.12.x:

real	1m12.145s
user	3m56.359s
sys	0m4.734s

My branch:

real	0m44.490s
user	3m25.950s
sys	0m4.586s

I measured the NodeScopeResolverTest.php separately and it takes roughly 25s for me (almost all of that is spent in gatherAssertTypes). You can see that the different in the user time is also roughly 25s. I guess that's because in 1.12.x the test effectively runs twice. All the work is done in the data provider, which first has to run in paratest's main process, before being run again in a worker process.

Sadly, the individual test cases are still not run in parallel.

@herndlm
Copy link
Contributor

herndlm commented Sep 14, 2024

Interesting. Reminds me a bit of what I tried to do in #1991 once but unfortunately never finished

@ondrejmirtes
Copy link
Member

@schlndh Can you please explain what this PR changes and why is it faster?

I guess that's because in 1.12.x the test effectively runs twice.

Why does it run twice?

@schlndh
Copy link
Contributor Author

schlndh commented Sep 14, 2024

@ondrejmirtes The important change is moving gatherAssertTypes (i.e. most of the work) away from the data provider and into the test method. This helps for parastan (and shouldn't affect phpunit).

The reason it helps parastan is as follows (I'm not familiar with internals of either phpunit or parastan. I'm basing this on a mixture of debugging, reading the source code and guessing):

  1. Parastan first loads the test suites in the main process. This involves executing the data providers (SuiteLoader::getMethodTests).
  2. The tests are executed in worker processes. The command "executed" by parastan can be seen with the help of --debug flag. For NodeScopeResolverTest it looks like this:
Executing test via: '/home/schlndh/devel/custom/phpstan-src/vendor/phpunit/phpunit/phpunit' '--configuration' '/home/schlndh/devel/custom/phpstan-src/phpunit.xml' '--no-logging' '--no-coverage' '--printer' 'ParaTest\Runners\PHPUnit\Worker\NullPhpunitPrinter' '--log-junit' '/tmp/PT_gWfIsa' '/home/schlndh/devel/custom/phpstan-src/tests/PHPStan/Analyser/NodeScopeResolverTest.php'
  1. The command is read by phpunit-wrapper.php, which looks roughly equivalent to running it from the command line. So it stands to reason that the data provider runs again in the worker process, because that's what PHPUnit does. I verified that it does run twice by adding file_put_contents('/tmp/nsrt_pid', sprintf("NSRT data provider run at %s, from %d\n", date('H:i:s'), getmypid()), FILE_APPEND); to it, running the test via paratest and checking the results (2 lines, different PIDs).

EDIT: Based on SuiteLoader::getMethodTests, I guess the reason why parastan executes the data provider in the first step is to support --filter. So make tests could probably be optimized directly in parastan. But this change would still be nice for running/debugging a single test from PHPStorm/command line. #1991 mentions code coverage as well (I haven't checked how that works).

@ondrejmirtes
Copy link
Member

Alright, that makes sense. And for the 2nd part - how are you achieving that all failures from a single file are reported at once?

@schlndh
Copy link
Contributor Author

schlndh commented Sep 15, 2024

@ondrejmirtes I copied the assertFileAsserts implementation and changed it so that it collects the failures in an array instead of reporting them immediatelly. I report all the failures at once via self::fail(...) at the end of the test (if necessary).

@ondrejmirtes
Copy link
Member

On my computer make tests got faster from about 58s to 40s, which is nice!

The output now looks like this:

1) PHPStan\Analyser\NodeScopeResolverTest::testFile with data set "tests/PHPStan/Analyser/nsrt/array-chunk.php" ('/Users/ondrej/Development/php...nk.php')
Failed assertions in /Users/ondrej/Development/phpstan/tests/PHPStan/Analyser/nsrt/array-chunk.php:

Expected type list<non-empty-list<mixed>, got type list<non-empty-list<mixed>> on line 13.
Expected type array{array{0, 1}, array{2}, got type array{array{0, 1}, array{2}} on line 29.

/Users/ondrej/Development/phpstan/tests/PHPStan/Analyser/NodeScopeResolverTest.php:257

2) PHPStan\Analyser\NodeScopeResolverTest::testFile with data set "tests/PHPStan/Analyser/nsrt/array-flip.php" ('/Users/ondrej/Development/php...ip.php')
Failed assertions in /Users/ondrej/Development/phpstan/tests/PHPStan/Analyser/nsrt/array-flip.php:

Expected type array<int|string, (int|strin)>, got type array<int|string, (int|string)> on line 23.

/Users/ondrej/Development/phpstan/tests/PHPStan/Analyser/NodeScopeResolverTest.php:257

Previously it looked like this:

1) PHPStan\Analyser\NodeScopeResolverTest::testFileAsserts with data set "/Users/ondrej/Development/phpstan/tests/PHPStan/Analyser/nsrt/array-chunk.php:13" ('type', '/Users/ondrej/Development/php...nk.php', 'list<non-empty-list<mixed>', 'list<non-empty-list<mixed>>', 13)
Expected type list<non-empty-list<mixed>, got type list<non-empty-list<mixed>> in /Users/ondrej/Development/phpstan/tests/PHPStan/Analyser/nsrt/array-chunk.php on line 13.
Failed asserting that two strings are identical.
--- Expected
+++ Actual
@@ @@
-'list<non-empty-list<mixed>'
+'list<non-empty-list<mixed>>'

/Users/ondrej/Development/phpstan/src/Testing/TypeInferenceTestCase.php:127
/Users/ondrej/Development/phpstan/tests/PHPStan/Analyser/NodeScopeResolverTest.php:202

2) PHPStan\Analyser\NodeScopeResolverTest::testFileAsserts with data set "/Users/ondrej/Development/phpstan/tests/PHPStan/Analyser/nsrt/array-chunk.php:29" ('type', '/Users/ondrej/Development/php...nk.php', 'array{array{0, 1}, array{2}', 'array{array{0, 1}, array{2}}', 29)
Expected type array{array{0, 1}, array{2}, got type array{array{0, 1}, array{2}} in /Users/ondrej/Development/phpstan/tests/PHPStan/Analyser/nsrt/array-chunk.php on line 29.
Failed asserting that two strings are identical.
--- Expected
+++ Actual
@@ @@
-'array{array{0, 1}, array{2}'
+'array{array{0, 1}, array{2}}'

/Users/ondrej/Development/phpstan/src/Testing/TypeInferenceTestCase.php:127
/Users/ondrej/Development/phpstan/tests/PHPStan/Analyser/NodeScopeResolverTest.php:202

3) PHPStan\Analyser\NodeScopeResolverTest::testFileAsserts with data set "/Users/ondrej/Development/phpstan/tests/PHPStan/Analyser/nsrt/array-flip.php:23" ('type', '/Users/ondrej/Development/php...ip.php', 'array<int|string, (int|strin)>', 'array<int|string, (int|string)>', 23)
Expected type array<int|string, (int|strin)>, got type array<int|string, (int|string)> in /Users/ondrej/Development/phpstan/tests/PHPStan/Analyser/nsrt/array-flip.php on line 23.
Failed asserting that two strings are identical.
--- Expected
+++ Actual
@@ @@
-'array<int|string, (int|strin)>'
+'array<int|string, (int|string)>'

/Users/ondrej/Development/phpstan/src/Testing/TypeInferenceTestCase.php:127
/Users/ondrej/Development/phpstan/tests/PHPStan/Analyser/NodeScopeResolverTest.php:202

Could we somehow get closer to the original output that looks like a diff? Maybe even somehow reuse PHPUnit utilities for diffing values.

Thanks!

@staabm
Copy link
Contributor

staabm commented Sep 16, 2024

btw: I find it interessting that running time vendor/bin/phpunit tests/PHPStan/Analyser/NodeScopeResolverTest.php on my windows laptop only takes 23-24 seconds - and there is no real difference before/after this PR.

$ php -v
PHP 8.2.12 (cli) (built: Oct 24 2023 21:15:35) (NTS Visual C++ 2019 x64)
Copyright (c) The PHP Group
Zend Engine v4.2.12, Copyright (c) Zend Technologies

@ondrejmirtes
Copy link
Member

@staabm That's expected and described in this PR. Only Paratest ran it twice.

@staabm
Copy link
Contributor

staabm commented Sep 16, 2024

That's expected and described in this PR. Only Paratest ran it twice.

@ondrejmirtes let me phrase it differently: maybe it would make sense to run NodeScopeResolverTest with phpunit and ignore it in paratest since even after the PR on my windows box the test is still 1-2 seconds faster when run with plain phpunit.

I have no idea this would be possible though.

@ondrejmirtes
Copy link
Member

We won't need to worry about that when this PR gets merged.

@schlndh
Copy link
Contributor Author

schlndh commented Sep 16, 2024

@ondrejmirtes Thanks for your feedback. I can adjust the error message. But IMO there is not much of a reason to try to preserve the diff output format. The types are always single-line strings, so the diff is just putting the two types below each other, instead of on one line.

So continuing with your example, I could do something like this:

1) PHPStan\Analyser\NodeScopeResolverTest::testFile with data set "tests/PHPStan/Analyser/nsrt/array-chunk.php" ('/home/schlndh/devel/custom/ph...nk.php')
Failed assertions in /home/schlndh/devel/custom/phpstan-src/tests/PHPStan/Analyser/nsrt/array-chunk.php:

Line 13:
Expected: list<non-empty-list<mixed>
Actual:   list<non-empty-list<mixed>>

Line 29:
Expected: array{array{0, 1}, array{2}
Actual:   array{array{0, 1}, array{2}}

/home/schlndh/devel/custom/phpstan-src/tests/PHPStan/Analyser/NodeScopeResolverTest.php:262

2) PHPStan\Analyser\NodeScopeResolverTest::testFile with data set "tests/PHPStan/Analyser/nsrt/array-flip.php" ('/home/schlndh/devel/custom/ph...ip.php')
Failed assertions in /home/schlndh/devel/custom/phpstan-src/tests/PHPStan/Analyser/nsrt/array-flip.php:

Line 23:
Expected: array<int|string, (int|strin)>
Actual:   array<int|string, (int|string)>

/home/schlndh/devel/custom/phpstan-src/tests/PHPStan/Analyser/NodeScopeResolverTest.php:262

IMO this is better than my current version, because the types are below each other. And it's also better than the 1.12.x version, because there's less noise from phpunit.

If you still want a diff. Then I suppose I could generate an "expected string" and "actual string" and then compare them via assertSame to get a diff for free (optionally I could also filter only lines that don't match to reduce noise):

1) PHPStan\Analyser\NodeScopeResolverTest::testFile with data set "tests/PHPStan/Analyser/nsrt/array-chunk.php" ('/home/schlndh/devel/custom/ph...nk.php')
Failed asserting that two strings are identical.
--- Expected
+++ Actual
@@ @@
-'13: list<non-empty-list<mixed>
+'13: list<non-empty-list<mixed>>
 14: list<non-empty-array>
 17: list<non-empty-list<int>>
 18: list<non-empty-array<string, int>>
 21: non-empty-list<non-empty-list<bool>>
 22: non-empty-list<non-empty-array<int|string, bool>>
-29: array{array{0, 1}, array{2}
+29: array{array{0, 1}, array{2}}
 30: array{array{a: 0, 17: 1}, array{b: 2}}
 31: array{array{0}, array{1}, array{2}}
 32: array{array{a: 0}, array{17: 1}, array{b: 2}}

/home/schlndh/devel/custom/phpstan-src/tests/PHPStan/Analyser/NodeScopeResolverTest.php:257

2) PHPStan\Analyser\NodeScopeResolverTest::testFile with data set "tests/PHPStan/Analyser/nsrt/array-flip.php" ('/home/schlndh/devel/custom/ph...ip.php')
Failed asserting that two strings are identical.
--- Expected
+++ Actual
@@ @@
 '13: array<int, (int|string)>
-23: array<int|string, (int|strin)>
+23: array<int|string, (int|string)>
 32: array<1|2|3, int>
 42: array<string, 1|2|3>
 51: non-empty-array<4|5|6, 1|2|3>

/home/schlndh/devel/custom/phpstan-src/tests/PHPStan/Analyser/NodeScopeResolverTest.php:257

Let me know what you think. Or if you want to avoid the back-and-forth (I should be able to get back to this on Thursday/Friday), you can merge it as is and adjust the output to your liking.

@schlndh
Copy link
Contributor Author

schlndh commented Sep 16, 2024

@staabm

on my windows box the test is still 1-2 seconds faster when run with plain phpunit.

I'd say that's expected. Paratest does extra stuff compared to phpunit. If you want to run a single test, just run it with phpunit. If you run many tests across multiple cores, than paratest's overhead should be worth it due to the speed up from running the tests in parallel.

Example:

time php vendor/bin/phpunit --no-coverage tests/PHPStan/Analyser/Bug9307Test.php
...
real	0m1.180s
user	0m1.085s
sys	0m0.108s
time php vendor/bin/paratest --runner WrapperRunner --no-coverage tests/PHPStan/Analyser/Bug9307Test.php
...
real	0m2.374s
user	0m2.616s
sys	0m0.292s

@ondrejmirtes
Copy link
Member

Yeah, this would be sufficient, thank you!


Line 13:
Expected: list<non-empty-list<mixed>
Actual:   list<non-empty-list<mixed>>

Line 29:
Expected: array{array{0, 1}, array{2}
Actual:   array{array{0, 1}, array{2}}

@schlndh
Copy link
Contributor Author

schlndh commented Sep 16, 2024

Done. I changed assertVariableCertainty message similarly.:

1) PHPStan\Analyser\NodeScopeResolverTest::testFile with data set "tests/PHPStan/Analyser/nsrt/falsey-isset-certainty.php" ('/home/schlndh/devel/custom/ph...ty.php')
Failed assertions in /home/schlndh/devel/custom/phpstan-src/tests/PHPStan/Analyser/nsrt/falsey-isset-certainty.php:

Certainty of variable $a on line 21:
Expected: Maybe
Actual:   Yes

/home/schlndh/devel/custom/phpstan-src/tests/PHPStan/Analyser/NodeScopeResolverTest.php:262

@ondrejmirtes
Copy link
Member

Love it, thank you!

BTW if you wanted to do some more tests-related improvements, I have ideas how to modernize RuleTestCase which badly needs it.

2nd arguments to analyse method looks like this:

[
	[
		'Missing parameter $bar (string) in call to method Bug4800\HelloWorld2::a().',
		36,
	],
],

But it could look like this instead:

$this->analyse([__DIR__ . '/data/bug-4800.php'], [
	RuleErrorBuilder::message('Missing parameter $bar (string) in call to method Bug4800\HelloWorld2::a().')
		->line(36)
		->identifier('argument.missing'),
]);

Basically the$expectedErrors could be list<RuleError|RuleErrorBuilder|array{0: string, 1: int, 2?: string|null}> and all the RuleError properties could be checked by RuleTestCase.

See RuleErrorTransformer what all *RuleError interfaces are there and what we would have to handle.

@ondrejmirtes ondrejmirtes merged commit 83bf3ab into phpstan:1.12.x Sep 16, 2024
477 of 499 checks passed
@schlndh schlndh deleted the feature-optimizeNodeScopeResolverTest branch September 16, 2024 15:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants