
feat: contentTypeParser cache, prefer lru to fifo #5340

Closed

Conversation

gurgunday
Member

Why Fifo?

If there is a cache match, we should make that entry fresh again, as the chances of it being used again are now higher

@gurgunday gurgunday requested a review from kibertoad March 1, 2024 08:34
@mcollina
Member

mcollina commented Mar 1, 2024

Because you changed it: #5331

@kibertoad
Member

@gurgunday lru is only useful when you expect eviction due to exceeding the cache size; otherwise you are just paying overhead on every get for no reason.
Do we expect evictions here? From my understanding, we get a fairly limited number of entries

@gurgunday
Member Author

gurgunday commented Mar 1, 2024

No, I indeed changed it from Fifo to FifoMap, but here I'm proposing changing FifoMap to LruMap

For this part of the code, an Lru could be better, since a match would bump that contentType in the cache so that it stays longer before getting evicted
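
To make the proposal concrete, here is a minimal sketch of the bump-on-hit behavior using a plain `Map` (illustrative only, not Fastify's actual `FifoMap`/`LruMap` implementations):

```js
// Minimal LRU sketch: a Map preserves insertion order, so "oldest"
// is simply the first key. Names and sizes here are illustrative.
class LruMap {
  constructor (maxSize = 100) {
    this.maxSize = maxSize
    this.map = new Map()
  }

  get (key) {
    if (!this.map.has(key)) return undefined
    const value = this.map.get(key)
    // The bump: delete + set moves the entry to the "newest" end,
    // so it is evicted last. A FIFO would skip these two lines.
    this.map.delete(key)
    this.map.set(key, value)
    return value
  }

  set (key, value) {
    if (this.map.size >= this.maxSize && !this.map.has(key)) {
      // Evict the oldest entry (first in insertion order)
      this.map.delete(this.map.keys().next().value)
    }
    this.map.set(key, value)
  }
}
```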

@kibertoad
Member

@gurgunday lru doesn't affect ttl, it only switches eviction order. do we expect evictions?

@gurgunday
Member Author

> @gurgunday lru doesn't affect ttl, it only switches eviction order. do we expect evictions?

Yes, when I said it would stay longer, I meant longer before eviction

> @gurgunday lru is only useful when you expect eviction due to exceeding the cache size; otherwise you are just paying overhead on every get for no reason. Do we expect evictions here? From my understanding, we get a fairly limited number of entries

The cache can fill up fast for RegExp matches, but for string matches it actually doesn't do much, so maybe the cache should only be for RegExp. I'll take a look and see if I can come up with something better

@gurgunday gurgunday marked this pull request as draft March 1, 2024 10:06
@kibertoad
Member

@gurgunday we can have two separate caches for these cases: lru for regex and fifo for strings. That would have optimal runtime characteristics
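
A rough sketch of that split (hypothetical structure; `FifoMap` here would be the `LruMap` sketch above minus the bump in `get()`):

```js
// Hypothetical two-cache split, not actual Fastify code:
const stringCache = new FifoMap(100) // string prefixes: few and stable
const regexpCache = new LruMap(100)  // regexp hits: more varied, keep hot ones

function getCachedParser (contentType) {
  // check the cheap string cache first, then fall back to the regexp one
  return stringCache.get(contentType) ?? regexpCache.get(contentType)
}
```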

@gurgunday
Member Author

Actually, after a second look, there could be evictions because of string matches too:

```js
for (var i = 0; i !== this.parserList.length; ++i) {
  const parserListItem = this.parserList[i]
  // prefix match: the incoming Content-Type starts with a registered type
  if (contentType.indexOf(parserListItem) === 0) {
    const parser = this.customParsers.get(parserListItem)
    // cache under the full raw header string, not the registered prefix
    this.cache.set(contentType, parser)
    return parser
  }
}
```

Consider the following:

A client sends a POST request with the Content-Type header `application/json --;`; this will be matched to `application/json` and cached. Another client can then send `application/json ***` or `application/json; charset=utf-8`, and these will all be cached as separate entries
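
In code, the effect looks like this (a runnable sketch; `jsonParser` is a stand-in for the registered parser):

```js
const cache = new Map() // stand-in for the FifoMap/LruMap
const jsonParser = (req, body, done) => done(null, JSON.parse(body)) // illustrative

// Each raw header string becomes its own cache key, even though all three
// prefix-match the same registered 'application/json' parser:
for (const ct of [
  'application/json --;',
  'application/json ***',
  'application/json; charset=utf-8'
]) {
  cache.set(ct, jsonParser)
}
console.log(cache.size) // 3: three slots consumed for one logical parser
```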

Combined with the RegExp matches, I feel like bumping fits better, no?

Is bumping that expensive?

@kibertoad
Member

Not super expensive, but state mutation is state mutation.
How many different permutations do you expect there? I wonder if we can bump the cache size to 1000 entries (which is still not much memory) and expect not to get any evictions.

@gurgunday
Member Author

> How many different permutations do you expect there?

No idea 😁

This is a client-dependent area; in theory I could spam a server with different variations of `application/json` to cause evictions, but in a normal scenario inputs should be pretty consistent

I'll think about what the best way to handle this would be

@kibertoad
Member

We don't need to consider the malicious-agent scenario here; an eviction event is not expensive enough to qualify as a DDoS attack surface.

@gurgunday
Member Author

Should I bench?

In any case, this is on the path where the server receives requests, which is much rarer than sending responses anyway, so neither LRU nor FIFO will cause any meaningful bottleneck

Looking at the nature of this cache, I think an LRU makes more sense: why not push back the eviction of a Content-Type that was matched recently? An alternative is to increase the FIFO size, as @kibertoad said, but my problem isn't necessarily with the size

What does @fastify/core think?

@gurgunday
Member Author

I read a few things, looked at some practical benchmarks, and concluded that a FIFO is simply better unless there is a strong interest in bumping recent hits, as in rate-limit
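
For reference, this is the kind of quick check that points that way (an illustrative micro-benchmark of the per-hit bump cost, not the actual numbers behind the decision):

```js
// Compare a plain Map lookup (FIFO hit) with lookup + delete + set (LRU hit).
// Results are machine-dependent; run with Node.
const map = new Map()
for (let i = 0; i < 100; i++) map.set(`type/${i}`, i)

let start = process.hrtime.bigint()
for (let i = 0; i < 1e6; i++) {
  map.get(`type/${i % 100}`) // FIFO hit: lookup only
}
console.log('fifo get:', process.hrtime.bigint() - start, 'ns')

start = process.hrtime.bigint()
for (let i = 0; i < 1e6; i++) {
  const key = `type/${i % 100}`
  const value = map.get(key)
  map.delete(key) // LRU hit: lookup + bump
  map.set(key, value)
}
console.log('lru get:', process.hrtime.bigint() - start, 'ns')
```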

@gurgunday gurgunday closed this Mar 9, 2024
@kibertoad
Member

@gurgunday Do you think we may still benefit from increasing the cache size?

@gurgunday
Member Author

100 sounds reasonable to me... but maybe it is too little; some real-world usage data would be pretty helpful here

@gurgunday gurgunday deleted the use-map branch March 9, 2024 15:40