Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize usage of Intl API to speed up response parsing with many datetime objects #1174

Conversation

vongruenigen
Copy link
Contributor

Background

Lately, I've been upgrading a big business application that was using the latest 4.x version of the neo4j driver to use the latest 5.x version (we still use neo4j 4.x, but they are compatible according to docs). The upgrade itself was very smooth, but while testing everything afterwards, we noticed that (almost) all of our requests to the backend took considerably longer to finish (~2x).

After doing some investigation (mainly by using clinic flamegraphs) I noticed that there was a considerable increase in the time spent parsing the raw neo4j responses in the driver. Looking at it in more detail revealed that most of the increase stems from one particular codepath, namely from calls to getTimeInZoneId.

Looking at it almost immediately revealed the culprit, which is how the Intl API is used there. It seems that a new Intl.DateTimeFormat object is created for each date time returned in the response. The Intl API is notoriously slow afaik, hence we should reduce the usage of those APIs to an absolute minimum in hot code paths, such as response parsing. Also, since the application I was upgrading is basically doing nothing else than managing timestamps at its core, it made sense that we noticed the performance degradation in such a severe way.

Changes in this MR

I decided to try out to cache the DateTimeFormat to prevent intializing the formatter for a given time zone more than once, and it seems to have helped quite a lot (in our case the "big" requests got a speedup of 60-70%). I also checked for other usages of Intl in the code base, but luckily only found one other place, where it's used to check the validity of a given timezone string. I added caching there as well, though I'm not entirely sure if this is a case of premature optimization, since we personally didn't run into performance issues where this particular method was involved. I'll leave this up to you guys to decide if we should include those changes in this MR as well, or revert them.

Remarks/questions

  • Is the Deno implementation totally seperate from the node implementation? If not I'd like to put the new helpers in some shared project to prevent code duplication.
  • I haven't added any explicit test for this refactoring/improvement, since the getTimeInZoneId method is extensively tested with the test suite already. Is this fine for you?
  • What's the release schedule for new driver versions? We're currently stuck on using our own version until this fix is merged.
  • Note the bash script I've added, which is basically a 1:1 copy of the PowerShell script at the root of the project.

Maybe there's a better way to achieve the same, and since I'm in no way an expert on the Intl API or this driver, I'm very open to suggestions and other ideas how to solve this (also because I'm a first time contributor, see below).

Looking forward to hearing from you guys, hopefully we can get this resolved as quickly as possible.

Thanks in advance! :)

CLA

I'm a first time contributor and the issue template said that I need to mention it here somewhere. I signed the CLA, let me know in case anything else is needed from my side.

@vongruenigen vongruenigen changed the title Optimize usage of Intl API to prevent performance degradation Optimize usage of Intl API to speed up response parsing with many datetime objects Jan 22, 2024
@vongruenigen
Copy link
Contributor Author

After some more digging I saw that the problem didn't occur the first time in a 5.x release, but rather in the release 4.4.7. We didn't notice until now because we had version 4.4.5 pinned in our npm package.

@bigmontz bigmontz self-requested a review January 26, 2024 10:00
@bigmontz
Copy link
Contributor

Hey @vongruenigen, good catch. I notice and register your CLA signature. The PR looks good, I will do some investigation here before get the approval and merge it.

Thanks for catching the regression and push the changes.

Copy link
Contributor

@bigmontz bigmontz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some small changes related to linting/code style and a small improvement.

Please remember to sync the deno driver code using npm run build::deno to update the deno source code.

Thanks again for the contribution.

In my local tests, I measured ~38% improvement in the velocity of parse 10000 values or so.

packages/core/src/internal/temporal-util.ts Outdated Show resolved Hide resolved
packages/core/src/internal/temporal-util.ts Outdated Show resolved Hide resolved
@vongruenigen
Copy link
Contributor Author

Thanks a lot for the feedback and suggestions @bigmontz! I'll be on holiday for a week from today on but I'll try to find some spare time to incorporate your suggestions above. Will ping you again once I'm done.

Copy link
Contributor

@bigmontz bigmontz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥇

Thanks for your contribution, @vongruenigen. This might get released in the 5.17.0 version at this week.

@bigmontz bigmontz merged commit 97ff4dc into neo4j:5.0 Jan 29, 2024
37 checks passed
@vongruenigen
Copy link
Contributor Author

@bigmontz thanks a lot for the swift review and merging. Really looking forward to the new release! 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants