Optimize usage of Intl API to speed up response parsing with many datetime objects (#1174)

## Background
Lately, I've been upgrading a large business application from the latest 4.x version of the neo4j driver to the latest 5.x version (we still use neo4j 4.x, but the two are compatible [according to the docs](https://neo4j.com/developer/kb/neo4j-supported-versions/#_neo4j_database_enterprise_edition_4)). The upgrade itself was very smooth, but while testing everything afterwards, we noticed that (almost) all of our requests to the backend took considerably longer to finish (roughly 2x).

After doing some investigation (mainly by using [clinic flamegraphs](https://clinicjs.org/flame/)) I noticed that there was a considerable increase in the time spent parsing the raw neo4j responses in the driver. Looking at it in more detail revealed that most of the increase stems from one particular codepath, namely from calls to [`getTimeInZoneId`](https://github.com/neo4j/neo4j-javascript-driver/blob/5.0/packages/bolt-connection/src/bolt/bolt-protocol-v5x0.utc.transformer.js#L160).

A closer look almost immediately revealed the culprit: the way the `Intl` API is used there. A new `Intl.DateTimeFormat` object is created for every date time returned in the response. As far as I know, the `Intl` API is notoriously slow, so its usage should be kept to an absolute minimum in hot code paths such as response parsing. Since the application I was upgrading does little else than manage timestamps at its core, it makes sense that the performance degradation hit us so severely.
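
To get a feel for the cost described above, a rough micro-benchmark along the following lines can compare constructing a formatter per value with reusing a single one. This is only a sketch: the function names, the `Europe/Zurich` zone and the iteration count are assumptions made for illustration, not driver code, and the measured timings will differ between runtimes and machines.

```javascript
// Hypothetical micro-benchmark; all names and numbers here are illustrative.
const options = {
  timeZone: 'Europe/Zurich', // assumed example zone
  year: 'numeric',
  month: 'numeric',
  day: 'numeric',
  hour: 'numeric',
  minute: 'numeric',
  second: 'numeric',
  hour12: false,
  era: 'narrow'
}

// Mirrors the behaviour before this change: one formatter per value.
function formatWithNewFormatter (epochMillis) {
  return new Intl.DateTimeFormat('en-US', options).format(epochMillis)
}

// Mirrors the idea of this change: build the formatter once and reuse it.
const sharedFormatter = new Intl.DateTimeFormat('en-US', options)
function formatWithSharedFormatter (epochMillis) {
  return sharedFormatter.format(epochMillis)
}

// Runs `fn` `iterations` times and returns the elapsed milliseconds.
function timeIt (label, fn, iterations) {
  const start = performance.now()
  for (let i = 0; i < iterations; i++) {
    fn(i * 1000)
  }
  const elapsedMs = performance.now() - start
  console.log(`${label}: ${elapsedMs.toFixed(1)} ms`)
  return elapsedMs
}

timeIt('new formatter per call', formatWithNewFormatter, 2000)
timeIt('shared formatter', formatWithSharedFormatter, 2000)
```

Both variants produce identical strings, so any difference in the printed timings comes from formatter construction alone.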

## Changes in this MR
I decided to try caching the `DateTimeFormat` so that the formatter for a given time zone is initialized only once, and it seems to have helped quite a lot (in our case the "big" requests got a speedup of 60-70%). I also checked for other usages of `Intl` in the code base and luckily found only one other place, where it is used to check the validity of a given time zone string. I added caching there as well, though I'm not entirely sure whether this is premature optimization, since we didn't run into performance issues involving this particular method. I'll leave it up to you to decide whether to keep those changes in this MR or revert them.
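
The caching idea can be tried out in isolation with a small sketch like the one below. `memoizeByKey`, `getFormatter` and `isValidZoneId` are hypothetical names introduced only for this illustration; the driver change itself uses plain `Map` lookups inside the existing functions rather than a generic helper.

```javascript
// Hypothetical helper: caches the result of `factory` per key in a Map.
function memoizeByKey (factory) {
  const cache = new Map()
  return function (key) {
    if (!cache.has(key)) {
      cache.set(key, factory(key))
    }
    return cache.get(key)
  }
}

// Counts how often a formatter is actually constructed.
let constructed = 0

// One formatter per time zone id, built lazily on first use.
const getFormatter = memoizeByKey(timeZoneId => {
  constructed++
  return new Intl.DateTimeFormat('en-US', {
    timeZone: timeZoneId,
    year: 'numeric',
    month: 'numeric',
    day: 'numeric',
    hour: 'numeric',
    minute: 'numeric',
    second: 'numeric',
    hour12: false,
    era: 'narrow'
  })
})

// Validity check cached per zone id; both outcomes (true and false) are stored.
const isValidZoneId = memoizeByKey(zoneId => {
  try {
    Intl.DateTimeFormat(undefined, { timeZone: zoneId })
    return true
  } catch (e) {
    return false
  }
})

const a = getFormatter('Europe/Zurich')
const b = getFormatter('Europe/Zurich') // served from the cache
const c = getFormatter('America/New_York')

console.log(a === b, constructed) // prints: true 2
console.log(isValidZoneId('Europe/Zurich'), isValidZoneId('Not/AZone')) // prints: true false
```

One trade-off of this sketch is that the maps are never cleared, so they grow with the number of distinct keys they see, including invalid zone strings in the validity cache.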
vongruenigen committed Jan 29, 2024
1 parent 7545b38 commit 97ff4dc
Showing 5 changed files with 110 additions and 30 deletions.
@@ -157,18 +157,30 @@ function getOffsetFromZoneId (timeZoneId, epochSecond, nanosecond) {
   return offset
 }

+const dateTimeFormatCache = new Map()
+
+function getDateTimeFormatForZoneId (timeZoneId) {
+  if (!dateTimeFormatCache.has(timeZoneId)) {
+    const formatter = new Intl.DateTimeFormat('en-US', {
+      timeZone: timeZoneId,
+      year: 'numeric',
+      month: 'numeric',
+      day: 'numeric',
+      hour: 'numeric',
+      minute: 'numeric',
+      second: 'numeric',
+      hour12: false,
+      era: 'narrow'
+    })
+
+    dateTimeFormatCache.set(timeZoneId, formatter)
+  }
+
+  return dateTimeFormatCache.get(timeZoneId)
+}
+
 function getTimeInZoneId (timeZoneId, epochSecond, nano) {
-  const formatter = new Intl.DateTimeFormat('en-US', {
-    timeZone: timeZoneId,
-    year: 'numeric',
-    month: 'numeric',
-    day: 'numeric',
-    hour: 'numeric',
-    minute: 'numeric',
-    second: 'numeric',
-    hour12: false,
-    era: 'narrow'
-  })
+  const formatter = getDateTimeFormatForZoneId(timeZoneId)

   const utc = int(epochSecond)
     .multiply(1000)
23 changes: 19 additions & 4 deletions packages/core/src/internal/temporal-util.ts
@@ -16,7 +16,7 @@
  */

 import Integer, { int, isInt } from '../integer'
-import { newError } from '../error'
+import { Neo4jError, newError } from '../error'
 import { assertNumberOrInteger } from './util'
 import { NumberOrInteger } from '../graph-types'

@@ -428,13 +428,28 @@ export function assertValidNanosecond (
   )
 }

+const timeZoneValidityCache = new Map<string, boolean>()
+const newInvalidZoneIdError = (zoneId: string, fieldName: string): Neo4jError => newError(
+  `${fieldName} is expected to be a valid ZoneId but was: "${zoneId}"`
+)
+
 export function assertValidZoneId (fieldName: string, zoneId: string): void {
+  const cachedResult = timeZoneValidityCache.get(zoneId)
+
+  if (cachedResult === true) {
+    return
+  }
+
+  if (cachedResult === false) {
+    throw newInvalidZoneIdError(zoneId, fieldName)
+  }
+
   try {
     Intl.DateTimeFormat(undefined, { timeZone: zoneId })
+    timeZoneValidityCache.set(zoneId, true)
   } catch (e) {
-    throw newError(
-      `${fieldName} is expected to be a valid ZoneId but was: "${zoneId}"`
-    )
+    timeZoneValidityCache.set(zoneId, false)
+    throw newInvalidZoneIdError(zoneId, fieldName)
   }
 }

@@ -157,18 +157,30 @@ function getOffsetFromZoneId (timeZoneId, epochSecond, nanosecond) {
   return offset
 }

+const dateTimeFormatCache = new Map()
+
+function getDateTimeFormatForZoneId (timeZoneId) {
+  if (!dateTimeFormatCache.has(timeZoneId)) {
+    const formatter = new Intl.DateTimeFormat('en-US', {
+      timeZone: timeZoneId,
+      year: 'numeric',
+      month: 'numeric',
+      day: 'numeric',
+      hour: 'numeric',
+      minute: 'numeric',
+      second: 'numeric',
+      hour12: false,
+      era: 'narrow'
+    })
+
+    dateTimeFormatCache.set(timeZoneId, formatter)
+  }
+
+  return dateTimeFormatCache.get(timeZoneId)
+}
+
 function getTimeInZoneId (timeZoneId, epochSecond, nano) {
-  const formatter = new Intl.DateTimeFormat('en-US', {
-    timeZone: timeZoneId,
-    year: 'numeric',
-    month: 'numeric',
-    day: 'numeric',
-    hour: 'numeric',
-    minute: 'numeric',
-    second: 'numeric',
-    hour12: false,
-    era: 'narrow'
-  })
+  const formatter = getDateTimeFormatForZoneId(timeZoneId)

   const utc = int(epochSecond)
     .multiply(1000)
23 changes: 19 additions & 4 deletions packages/neo4j-driver-deno/lib/core/internal/temporal-util.ts
@@ -16,7 +16,7 @@
  */

 import Integer, { int, isInt } from '../integer.ts'
-import { newError } from '../error.ts'
+import { Neo4jError, newError } from '../error.ts'
 import { assertNumberOrInteger } from './util.ts'
 import { NumberOrInteger } from '../graph-types.ts'

@@ -428,13 +428,28 @@ export function assertValidNanosecond (
   )
 }

+const timeZoneValidityCache = new Map<string, boolean>()
+const newInvalidZoneIdError = (zoneId: string, fieldName: string): Neo4jError => newError(
+  `${fieldName} is expected to be a valid ZoneId but was: "${zoneId}"`
+)
+
 export function assertValidZoneId (fieldName: string, zoneId: string): void {
+  const cachedResult = timeZoneValidityCache.get(zoneId)
+
+  if (cachedResult === true) {
+    return
+  }
+
+  if (cachedResult === false) {
+    throw newInvalidZoneIdError(zoneId, fieldName)
+  }
+
   try {
     Intl.DateTimeFormat(undefined, { timeZone: zoneId })
+    timeZoneValidityCache.set(zoneId, true)
   } catch (e) {
-    throw newError(
-      `${fieldName} is expected to be a valid ZoneId but was: "${zoneId}"`
-    )
+    timeZoneValidityCache.set(zoneId, false)
+    throw newInvalidZoneIdError(zoneId, fieldName)
   }
 }

26 changes: 26 additions & 0 deletions runTests.sh
@@ -0,0 +1,26 @@
+npm install -g gulp typescript jest
+
+npm ci
+npm run build -- --no-private
+
+if [ -n "$2" ]; then
+  export NEOCTRL_ARGS="$2"
+fi
+
+trap "npm run stop-neo4j" EXIT
+
+npm run start-neo4j
+
+if [ $? -ne 0 ]; then
+  echo "Unable to start neo4j"
+  exit 1
+fi
+
+npm test -- --no-private
+
+if [ $? -eq 0 ]; then
+  echo "Exit with code 0"
+else
+  echo "Exit with code 1"
+  exit 1
+fi
