Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GZIP - disable/set level because performance impact is bad #9905

Closed
kkmuffme opened this issue Jun 12, 2023 · 14 comments · Fixed by #9924
Closed

GZIP - disable/set level because performance impact is bad #9905

kkmuffme opened this issue Jun 12, 2023 · 14 comments · Fixed by #9924

Comments

@kkmuffme
Copy link
Contributor

#9889 added between 10-15% of runtime for uncached runs and increases memory consumption for cached runs by ~20%.
SSD space is super cheap compared to RAM and machine/user time.

Please disable GZIP by default and make it optional (perhaps allow to set the compression level), if you really want to use it. But unless you are really space limited, it doesn't really make sense to enable it @orklah

@psalm-github-bot
Copy link

Hey @kkmuffme, can you reproduce the issue on https://psalm.dev ?

@orklah
Copy link
Collaborator

orklah commented Jun 12, 2023

Can you give us the exact numbers for you instead of percents?

cc @dkarlovi

@dkarlovi
Copy link
Contributor

it doesn't really make sense to enable it

For an almost 90% reduction saying it "doesn't make sense" is doubtful. I'm open to having it have a config, there's already a property in Config which can get exposed via schema pretty easily.

From my POV, the default should be on since the reduction is substantial vs the increase in uncached memory / runtime

The difference in cached runtime / memory is larger, but as @orklah mentioned, the absolute numbers there are small either way. I'll leave it to the project leads to decide about the default.

While disk is cheap, it's still not free and there's cases where you still want this on, for example Gitlab CI allows you to store your cache dir, if it's smaller, the pipeline is faster both restoring and storing it after the run.

@dkarlovi
Copy link
Contributor

Ah right, I forgot I actually profiled the cached run today 😃

It's slower because FileReferenceProvider will update the cache even if it's up-to-date, meaning it always deflates and writes to disk.

If we really care about this specific metric, we can make it no-op in that specific case to not have the "huge increase" in cached runs.

/cc @orklah

@kkmuffme
Copy link
Contributor Author

Can you give us the exact numbers for you instead of percents?

Yes, but they're not very useful since it's from a CI pipeline: 15% more runtime = 15% more cost or 15% longer wait time in queue.

Practically, on the projects we run, it's between 1-5 seconds from what I saw with a couple tests for uncached runs (which are a significant amount of runs, unlike what you have when you run psalm on your personal machine only).

The cached runs are on 0.5-2 seconds slower (cache directory about 980MB without gzip)

This doesn't sound like much at first, however practically this means that you're suddenly have to wait an extra 15 minutes when you process only a few 100 psalm instances simultaneously.

For an almost 90% reduction

Space is never the bottleneck in CI/DevOps nowadays. It's always RAM/CPU and time.

It's slower because FileReferenceProvider will update the cache even if it's up-to-date, meaning it always deflates and writes to disk.

That's unrelated, but I think a very good point we should improve on too if that's the case 👍

@kkmuffme
Copy link
Contributor Author

Maybe when we're at it: instead of GZIP also add support for lz4, since it's somewhat commonly available and 7-10 times faster than gz/zlib while the cache directory will still be 50-60% smaller.

@kkmuffme
Copy link
Contributor Author

And since it's been reworked nicely already :-) we could implement #8727 now too. I guess you would prefer to just ignore the errors instead of throwing @dkarlovi, right?

@dkarlovi
Copy link
Contributor

Space is never the bottleneck in CI/DevOps nowadays. It's always RAM/CPU and time.

That's incorrect, I gave you an example where you want to conserve space to get performance: Gitlab CI cache. While disk isn't the most important metric, it shouldn't be totally disregard.

@dkarlovi
Copy link
Contributor

dkarlovi commented Jun 12, 2023

For example, PHPStan checking the same project as Psalm uses 16MB of cache vs Psalm's 400MB. While we can't directly do math with these numbers, they should give us an idea of scale here.

See #9905 (comment)

@kkmuffme
Copy link
Contributor Author

I agree, we can't totally disregard it, but in terms of business case the tradeoff isn't great. I think we should give people the option to fit their needs.

  • config option to set the GZIP level or disable it
  • maybe: add support for lz4_compress/_uncompress (since that would be literally 10 lines of additional code)
  • fix: because FileReferenceProvider will update the cache even if it's up-to-date, meaning it always deflates and writes to disk.
  • fix: cache fails to write (followup to https://github.com/vimeo/psalm/pull/8727/files) to already avoid potential time wasted when reading invalid cache files - would you prefer to just ignore it silently or to throw? @dkarlovi

Comparing PHPStand cache size without comparing runtime and amount of checks done is pointless. A Ford uses probably 90% less gas than a Ferrari; both are cars :-)

@dkarlovi
Copy link
Contributor

From my POV, the 5% increase in uncached runtime I saw in the PR is acceptable for the huge reduction in cache dir size we get. If you feel Psalm needs all these configs to fine tune the caching process, go for it, the important part is I get (or get to enable) the current behaviour.

About config, Psalm IMO already has way too many configuration options and the UX suffers quite a bit by it, adding several options just to control how exactly your cache will be compressed or not is way way beyond what I think is useful or would be used by more than a handful of people. Defaults matter and changing options should change a whole set of behaviour, because those options also have their own defaults.

@kkmuffme
Copy link
Contributor Author

It's not a 5% increase, it's a 15% increase.
For most users speed matters more than cache size, as the amount of github issues for performance/slow outnumber any issues for cache directory size by at least 10:1 - I suggest the default should be "off".

@dkarlovi
Copy link
Contributor

dkarlovi commented Jun 13, 2023

OK, just to redo the benchmark done in the PR, I've switched back to my branch, shut down everything else and ran each try 3x to get an average, here's the runs

Gzip off, cache empty
Checks took 18.69 seconds and used 808.914MB of memory
Checks took 19.37 seconds and used 808.914MB of memory
Checks took 18.80 seconds and used 808.914MB of memory

Gzip off, cache full
Checks took 0.16 seconds and used 66.542MB of memory
Checks took 0.15 seconds and used 66.195MB of memory
Checks took 0.16 seconds and used 66.195MB of memory

Gzip on, cache empty
Checks took 20.33 seconds and used 808.915MB of memory
Checks took 20.55 seconds and used 808.915MB of memory
Checks took 20.11 seconds and used 808.914MB of memory

Gzip on, cache full
Checks took 0.66 seconds and used 74.370MB of memory
Checks took 0.64 seconds and used 74.023MB of memory
Checks took 0.65 seconds and used 74.023MB of memory

Analysis

Memory

  • memory usage is almost identical across same run types (it's stable)
  • with empty cache, Gzip does not increase memory usage
  • with full cache, Gzip increases memory usage by 10%-15% which sounds bad, but then you see the absolute difference is below 8MB, which is irrelevant (it's <1% of the empty cache memory usage) [1]

Runtime

  • runtime is almost identical across same run types (it's stable)
  • with empty cache, best run Gzip off is 18.69, Gzip on is 20.11, the difference is ~7% (not 5% like I've said, but pretty close)
  • with full cache, the difference is huge, but again the absolute number is tiny [1]

[1] As noted before, the major reason why cache on is seeing this type of difference is because FileReferenceProvider always stores the cache (even when fresh), making the Gzipping of 7 files the only thing that happens in the run, we should fix that if we care about this metric.

Since we can disregard the "cache full" runs because of the stated reason above and the memory doesn't change with Gzip on in "cache empty" runs, we're only talking about the 7% runtime increase here.

Since the gain in disk space is almost 90% for 7% runtime, my instinct would be to have Gzip on be the default, as it currently is. Granted, 7% might be huge if the absolute number is large, but the number shouldn't be large since you're using cache, right? Users not using (or being unable to use) cache are the real issue, default of increase 7% of once in a lifecycle run is a good default IMO because either of the "cache full" runs dwarfs the "cache empty" run by a super wide margin, that's the number we want in the majority of runs.

In #9899 we're discussing cache vs memory usage and @orklah claims he can't use cache at all because Psalm never ends. IMO, that's the first thing we should be discussing, not which algorithm to (not) use to (not) compress your cache folder. Cache should always be usable and it not being so is a HUGE issue, the first one to fix. Micromanaging caching options is not something I'm remotely interested in.

@dkarlovi
Copy link
Contributor

dkarlovi commented Jun 13, 2023

Comparing PHPStand cache size without comparing runtime and amount of checks done is pointless. A Ford uses probably 90% less gas than a Ferrari; both are cars :-)

That's incorrect: the tools are doing roughly the same thing more or less, in roughly the same runtime and memory, even using the same underlying library (PHP Parser). One of them using 16MB while the other is using 400MB to do basically the same task is a red flag.

Actual PHPStan numbers

$ time vendor/bin/phpstan analyze src tests examples --memory-limit=512M --level max --quiet

real	0m26,907s
user	4m43,405s
sys	0m5,444s
$ time vendor/bin/phpstan analyze src tests examples --memory-limit=512M --level max --quiet

real	0m0,768s
user	0m0,625s
sys	0m0,140s
$ du -sh /tmp/phpstan/cache/
13M	/tmp/phpstan/cache/

So it the runtime is comparable, memory usage is actually significantly lower (since it can do it in 512MB or less, Psalm uses 800MB of RAM), and the disk space used is 30 times smaller.

BTW PHPStan finds a bunch of issues in Psalm's codebase:

 [ERROR] Found 1587 errors

kkmuffme added a commit to kkmuffme/psalm that referenced this issue Jun 13, 2023
Fix vimeo#9905
Misc improvements for cache
kkmuffme added a commit to kkmuffme/psalm that referenced this issue Jun 13, 2023
Fix vimeo#9905
Misc improvements for cache

(cherry picked from commit 2303153)
kkmuffme added a commit to kkmuffme/psalm that referenced this issue Jun 13, 2023
kkmuffme added a commit to kkmuffme/psalm that referenced this issue Jun 13, 2023
kkmuffme added a commit to kkmuffme/psalm that referenced this issue Jun 18, 2023
Fix vimeo#9905
Suppress throws for igbinary_unserialize like it was done for @unserialize
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants