Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[performance] Between 13.6 and 17.3 seconds to analyse an abstract class with 90 lines of code #9498

Closed
gnutix opened this issue Jun 23, 2023 · 22 comments

Comments

@gnutix
Copy link

gnutix commented Jun 23, 2023

Bug report

Hello there,

While reading @staabm's blog posts about improving performance in PHPStan, I tried running PHPStan using this script in our project to see which files take the most time to analyze. There were ~10 files that took 1-2 secondes (which is okay, they are quite complex), lots that took less than 1 sec, and... one that took 17.30s alone.

I was surprised to find out that it was "only" an abstract class of 90 lines of code. I then ran the analysis 4 more times, and got the following results: 17.30, 13.67, 13.61, 16.09, 14.05. In comparison, even the most complicated file from our legacy folder with 2300 lines of crappy code takes "only" 11.69 seconds.

So "I think there's more to this class than meets the eye", as Gandalf would say... ;)

Here it is :

<?php

declare(strict_types=1);

namespace App\Infrastructure\Shared\Notifier\Notification;

use App\Domain\Shared\Event\Event;
use App\Infrastructure\Shared\Notifier\Recipient\EmailRecipient;
use App\Infrastructure\Shared\Notifier\Recipient\InternalMessageRecipient;
use App\Infrastructure\Shared\Notifier\Recipient\RecipientFactory;
use Carbon\Carbon;
use DateTime;
use Gammadia\Moment\Timezone;
use Symfony\Component\DependencyInjection\Attribute\Autoconfigure;
use Symfony\Component\Notifier\Notification\Notification as BaseNotification;
use Symfony\Component\Notifier\NotifierInterface;
use Symfony\Component\Notifier\Recipient\RecipientInterface;
use Symfony\Contracts\Service\ResetInterface;

#[Autoconfigure(tags: [Notification::class])]
abstract class Notification extends BaseNotification implements ResetInterface
{
    protected ?Event $event = null;

    public function __construct(
        private NotifierInterface $notifier,
        protected RecipientFactory $recipientFactory,
    ) {
        parent::__construct();

        $recipientFactory->setNotification($this);
    }

    public function __invoke(Event $event): void
    {
        $this->event = $event;

        try {
            foreach ($this->recipients() as $recipient) {
                $this->notifier->send($this, $recipient);
            }
        } finally {
            $this->reset();
        }
    }

    /**
     * @return string[]
     */
    final public function getChannels(RecipientInterface $recipient): array
    {
        $channels = [];
        if ($recipient instanceof EmailRecipient) {
            $channels[] = 'email';
        }

        if ($recipient instanceof InternalMessageRecipient) {
            $channels[] = 'internal-message';
        }

        return $channels;
    }

    public function reset(): void
    {
        $this->event = null;
    }

    /**
     * Returns a list of recipients.
     *
     * Examples:
     *
     *     yield new Recipient();
     *     yield from $this->recipientFactory->createFromPersons(...$ids);
     *
     * @see RecipientFactory
     *
     * @return RecipientInterface[]|iterable<RecipientInterface>
     */
    abstract protected function recipients(): iterable;

    protected function formatLocalizedDate(DateTime $dateTime, string $locale, string $format = 'LLLL'): string
    {
        /** @var Carbon $carbon */
        $carbon = Carbon::instance($dateTime)->timezone(Timezone::getDefault())->locale($locale);

        return $carbon->isoFormat($format);
    }
}

I hope sharing this may help you guys find some bottleneck in PHPStan (or a clue to fix some problematic typing in this class), so that we can shave off theses seconds from all our CI builds!

gnutix

PS: There are also two legacy files in our project that probably take longer than they should, but I can't share them publicly. One of them is a service class that takes ~5.41 sec for 273 lines of code, the other an old PHP file that processes some values to prepare a template's rendering that takes ~4.46 sec for 339 lines. Please let me know if you'd like to investigate those too, I could send them in private.

Code snippet that reproduces the problem

No response

Expected output

Faster analysis.

Did PHPStan help you today? Did it make you happy in any way?

I wanted to thank you for taking care of the planet by taking performance issues seriously. <3

@mergeable
Copy link

mergeable bot commented Jun 23, 2023

This bug report is missing a link to reproduction at phpstan.org/try.

It will most likely be closed after manual review.

@ondrejmirtes
Copy link
Member

It's caused by Carbon and the huge PHPDoc they have there. Try to remove references to Carbon and it's going to be fast.

@gnutix
Copy link
Author

gnutix commented Jun 23, 2023

It's as I feared.

I don't think I can easily remove it, as we're dependent on the 'LLLL' formatting which I think only Carbon provides.
But I'll have a look. Thanks for the hint ! 🙏

@gnutix
Copy link
Author

gnutix commented Jun 23, 2023

Wow, that's crazy. I mentioned "one of the legacy service class takes ~5.41 sec for 273 lines of code". It just so happens that there was a reference to Carbon in that file, used only in a @param PHPDoc. So I tried to simply remove it, and... now it takes only 0.58 sec to analyze. 😱

diff --git a/legacy/.../SomeClass.php b/legacy/.../SomeClass.php
index 235bfe83b1..d34dbf7f38 100644
--- a/legacy/.../SomeClass.php
+++ b/legacy/.../SomeClass.php
@@ -2,7 +2,6 @@
 
 namespace legacy\SomeNamespace;
 
-use Carbon\Carbon;
 use DateTime;
 use DateTimeImmutable;
 use DateTimeZone;
@@ -86,8 +85,8 @@ public function someMethod($dateStart, $dateEnd, $timeZone
     }
 
     /**
-     * @param Carbon|DateTime|DateTimeImmutable $dateStart
-     * @param Carbon|DateTime|DateTimeImmutable|null $dateEnd
+     * @param DateTime|DateTimeImmutable $dateStart
+     * @param DateTime|DateTimeImmutable|null $dateEnd
      * @param string $timeZone
      *
      * @return array

So for each reference to Carbon in any PHP project, 5+ seconds are lost ?! Wouldn't there be a way to circumvent that ?

@staabm
Copy link
Contributor

staabm commented Jun 23, 2023

I don't think I can easily remove it, as we're dependent on the 'LLLL' formatting which I think only Carbon provides.
But I'll have a look. Thanks for the hint !

maybe it helps to centralize references to carbon into a few classes, so you don't have to spread it across the app.

So for each reference to Carbon in any PHP project, 5+ seconds are lost ?! Wouldn't there be a way to circumvent that ?

there are already a few open issues regarding this similar topic. its not that easy to solve.

@mad-briller
Copy link
Contributor

i ran into this same issue in one of my apps in the exact same way you did funnily enough; reading @staabm's blog and trying it myself

from my investigation the first file phpstan analyzes that contains carbon takes a while, but subsequent calls take significantly less time (probably due to internal caching)

i found that by removing carbon references in one class, the next class to be analyzed that contained carbon saw the overhead instead, so i don't think it's 5 seconds for each reference to carbon, just the first time it's referenced it takes a while

from brief profiling of the issue, it was like ~25% of time analyzing the affected file was spent parsing phpdocs alone and i couldn't find any meaningful speed up, but i'm relatively new to profilers and php performance etc. so i'd easily overlook something

@gnutix
Copy link
Author

gnutix commented Jun 23, 2023

Thanks for your answers! We analyse the legacy files and the new files separately in our project, so that's why the "Carbon cost" came twice. Now I've managed to remove this dependency entirely from our legacy code, so that's half the battle won. :)

I'm wondering : how is it we don't have Attributes for PHPDocs ? I seem to remember reading here and there that parsing PHPDocs is way slower than reading native attributes, right ?

@AJenbo
Copy link
Contributor

AJenbo commented Jul 7, 2023

Would it be possible to cache the parsed data? Since Carbon is coming from the vendor folder it could be assumed to be unchanged if the composer.lock is unchanged.

The project I'm working on sees up to 15 minutes spent on analyzing the code and PHPStorm is often complaining because even single files take over a minute (which is it's ttl for external processes).
I'm thinking this is also a big issue in general for Laravel applications as there Facades have big blocks of PHPDoc to advertise all the functions that they are able to call via __call().

@pimjansen
Copy link

I have the same but remove Carbon is not really an option though since it does more then the DateTime instance of PHP does. I have a basic class which takes 10s to analyse:

<?php

declare(strict_types=1);

namespace History\Service;

use Carbon\Carbon;

final class HistoryService
{
    public function __construct()
    {
        (new Carbon())->format('d-m-Y H:i');
    }
}

or:
https://phpstan.org/r/c36b97fa-bc9b-4347-8075-48e05d4f0628

@AJenbo
Copy link
Contributor

AJenbo commented Aug 9, 2023

Yes it is a know issue.

If it's critical to you and you can't work on phpstan it self you could make a fork/dummy version of Carbon where instead of a docblock it has concrete implementations.
Then swap it in for the real carbon during analysis. The methods could either call parents or be simple no-op depending on your intended use for this.

@ondrejmirtes
Copy link
Member

I don't think it's that big of a deal. It takes 5-10s the first time Carbon is encountered but after that the analysis is fast. So it doesn't slow down analysis percent-wise, just a constant of 5-10s.

@AJenbo
Copy link
Contributor

AJenbo commented Aug 9, 2023

For the projects I work on (Laravel based) an average PR takes about always 2-3 minutes for PHPStan. Without cache it's about 9-12 minutes. If we take out Carbon and a hand full of Facades we are looking at about 30 sec without cache.

Even analyzing a single file locally will often take about 30 sec.

I get frequent complaints about it from our developers. If there is a concrete task for how this could be improved I would be happy to work on that.

@ondrejmirtes
Copy link
Member

@AJenbo This is weird. It sounds like the work is done multiple times even if it should be done just a single time (or once per thread). Would be nice to replicate this problem so that we can investigate.

@AJenbo
Copy link
Contributor

AJenbo commented Aug 9, 2023

It feels like it's once per thread, there are a few distinct places where the progress bar clearly freezes for a while. Afaik the GitHub Workers only have two threads. But also it doesn't scale particularly well since each thread is hit by the cost, from my testing we sort of hit a wall once we got to about 1-2 minutes (without caching) with 8 cores (tested all the way to 32).

Note that Carbon contain multiple classes that have this issue so the more of Carbon is utilized by a project the worse this issue gets. I think Laravel project are also hit extra hard by this since they have there Facades which often rely on large phpdoc blocks for there type definitions.

@AJenbo
Copy link
Contributor

AJenbo commented Aug 9, 2023

@ondrejmirtes ok the total time of 9-12 minutes for single threaded analysis is actually mainly caused by the Bladestan extension, where the minimal time of 2 minutes is mainly caused by the slow phpdoc parsing afaik. It's a bit tricky since one is causing a general slow down but the other prevents us from scaling our way out of it.

I guess I will start by looking at improving Bladestan performance first since I have a decent understanding of it and it's probably the majority of the slow down until we start getting close to a couple of minutes to analyze a project.

@staabm
Copy link
Contributor

staabm commented Aug 9, 2023

@AJenbo if you could create a repo with a reproducer, that would be awesome

@ondrejmirtes
Copy link
Member

Solved it: phpstan/phpstan-src#2582

Please test this in PHPStan 1.10.32 once I release this :) Or right now in "phpstan/phpstan": "1.10.x-dev".

@AJenbo
Copy link
Contributor

AJenbo commented Aug 25, 2023

@ondrejmirtes ❤️
Thank you, this is indeed a big boost to performance for the project I work on.

Project:
1.10.28: 3m9.507s
1.10.32: 2m33.494s

Single file:
1.10.28: 0m40.934s
1.10.32: 0m32.024s

Without Bladestan

Project:
1.10.28: 0m56.168s
1.10.32: 0m16.172s

Single file:
1.10.28: 0m13.250s
1.10.32: 0m4.693s

@staabm
Copy link
Contributor

staabm commented Aug 25, 2023

@AJenbo if you could create a repro for a file which bladestan is slow on I can look into it

@devnix
Copy link

devnix commented Aug 25, 2023

Not sure if useful but just for the record, I've launched phpstan --debug in a big legacy codebase that uses Carbon too:

  • 1.10.26

    Executed in  367,95 secs    fish           external
       usr time  361,87 secs  398,00 micros  361,87 secs
       sys time    2,55 secs  119,00 micros    2,55 secs
    
  • 1.10.32

    Executed in  386,39 secs    fish           external
       usr time  376,92 secs  427,00 micros  376,92 secs
       sys time    4,55 secs  247,00 micros    4,55 secs
    

Edit: updated values

@devnix
Copy link

devnix commented Aug 25, 2023

Another update:

The numbers in my previous post are not so relevant now, maybe the difference is more about a i/o fluctuation, not sure how to empirically test it.

What I can say is that running vendor/bin/phpstan using cache, after a previous run without any new changes, the time came from consistent ~7s to consistent ~4s

@github-actions
Copy link

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 26, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

7 participants