Allow disable ratelimit quota handler #22314

Closed

clwluvw opened this issue Aug 12, 2023 · 16 comments

Comments

@clwluvw
Contributor

clwluvw commented Aug 12, 2023

Is your feature request related to a problem? Please describe.
In performance benchmarks, we have observed that rateLimitQuotaWrapping consumes significant CPU time and adds ~0.5ms of latency to each request.

Describe the solution you'd like
Like other handlers, make it possible to disable rateLimitQuotaWrapping entirely.
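
A minimal sketch (not Vault's actual wiring) of what such a switch could look like; the disableRateLimitQuota flag and the stand-in wrapper below are hypothetical and used only for illustration:

package main

import (
	"fmt"
	"log"
	"net/http"
)

// rateLimitQuotaWrapping stands in for the real wrapper: it would do its quota
// bookkeeping and then call the next handler.
func rateLimitQuotaWrapping(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// ... quota evaluation would happen here ...
		next.ServeHTTP(w, r)
	})
}

// buildHandler wires the wrapper in only when it isn't disabled by configuration.
func buildHandler(next http.Handler, disableRateLimitQuota bool) http.Handler {
	if disableRateLimitQuota {
		return next // skip the wrapper entirely
	}
	return rateLimitQuotaWrapping(next)
}

func main() {
	inner := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})
	// With the flag set, requests bypass the quota wrapper completely.
	log.Fatal(http.ListenAndServe(":8200", buildHandler(inner, true)))
}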

Describe alternatives you've considered
Make the wrapper effectively a no-op when no rate limit quota is configured.

Additional context
[Screenshot 2023-08-13 at 01:47:38 — profiler output referenced above]

@biazmoreira
Contributor

@clwluvw which backends did you make requests to in your performance benchmark? We are currently fixing a bug related to quotas increasing request latency, but it only happens for certain backends and certain types of quotas. I'm interested in hearing more about which test cases you've run.

@biazmoreira biazmoreira self-assigned this Aug 31, 2023
@clwluvw
Contributor Author

clwluvw commented Aug 31, 2023

I'm not sure that what I see in the wrapper code relates to a specific backend, but my tests were on the raft and etcd backends.
@biazmoreira Can you point me to the PR you think might fix it?

@biazmoreira
Contributor

@clwluvw sorry, I meant which auth method you were using to issue requests to Vault.

And the PRs: #22597, #22620, and #22583.

We think those will fix the latency issues you were noticing.

@mpalmi
Contributor

mpalmi commented Aug 31, 2023

Note that with the fixes above, you'll still incur 1xRTT latency for logins when using auth methods that implement the ResolveRoleOperation.

There's one more fix in flight which should allow you to get parity with pre-1.12 version latency for such cases (unless you have a role-based quota applied to the namespace/mount) by setting a config option:

#22651

@clwluvw
Contributor Author

clwluvw commented Aug 31, 2023

@biazmoreira I used token-based auth (basically the root token) for my requests. Do you still think those PRs would resolve the CPU time I saw in the wrapper?
I didn't have any rate-limit config in place, so I'm surprised that ~0.5ms of latency is added to each request by a wrapper that ultimately does nothing.
From what I see in the code, many code paths are executed before the wrapper determines that no rate limit is configured and can return early.
Introducing the config option I proposed in my PR could help in setups that don't expect any rate limits and want to save those microseconds of latency.

@biazmoreira
Contributor

@clwluvw Instead of bypassing the wrapper altogether, we pinpointed what in the wrapper was causing the increase in latency and decided to fix it. You can see that https://github.com/hashicorp/vault/pull/22597/files modifies the wrapper you mentioned in the PR description. We think this will fix the issue, although, as Mike mentioned, some latency may remain in cases where role-based quotas exist and the auth method implements ResolveRoleOperation.

@biazmoreira
Contributor

Also, can you give us more details about the requests you used for the benchmark? Which auth methods, APIs, secret engines, etc.?

@clwluvw
Contributor Author

clwluvw commented Aug 31, 2023

I used the root token for authentication and the transit engine (https://developer.hashicorp.com/vault/api-docs/secret/transit#generate-data-key) to generate data keys. The latency for key generation itself was in the microseconds, but with the rate limit wrapper I always get more than 1ms. After disabling it (and rebuilding from source) I got sub-ms latency from the API; the CPU profiler showed the same.
With this PR (#22597), can you help me understand:

  1. Are we expecting a DB query on every request to check whether the path has a role-based quota (quota, err := txn.First(qType, indexNamespaceMount, req.NamespacePath, req.MountPath, false, true))?
  2. Are we expecting the wrapper to return here (https://github.com/hashicorp/vault/blob/main/vault/quotas/quotas.go#L789) when no rate limit is in place?

@biazmoreira
Contributor

Are we expecting a DB query on every request to see if the path has a role-based quota

No — for login requests specifically, we make an in-memory query to check whether the path has a role-based quota.
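
To make that concrete, here is a minimal, self-contained sketch of such an in-memory lookup using github.com/hashicorp/go-memdb (the library behind calls like txn.First(qType, indexNamespaceMount, ...)). The table, index, and field names below are illustrative assumptions, not Vault's actual quota schema:

package main

import (
	"fmt"

	memdb "github.com/hashicorp/go-memdb"
)

// Illustrative quota record; field names are assumptions, not Vault's schema.
type RateLimitQuota struct {
	Name          string
	NamespacePath string
	MountPath     string
	Rate          float64
}

func main() {
	schema := &memdb.DBSchema{
		Tables: map[string]*memdb.TableSchema{
			"rate-limit-quota": {
				Name: "rate-limit-quota",
				Indexes: map[string]*memdb.IndexSchema{
					"id": {
						Name:    "id",
						Unique:  true,
						Indexer: &memdb.StringFieldIndex{Field: "Name"},
					},
					// Compound index over namespace and mount, similar in spirit
					// to the indexNamespaceMount lookup mentioned above.
					"namespace-mount": {
						Name: "namespace-mount",
						Indexer: &memdb.CompoundIndex{
							Indexes: []memdb.Indexer{
								&memdb.StringFieldIndex{Field: "NamespacePath"},
								&memdb.StringFieldIndex{Field: "MountPath"},
							},
						},
					},
				},
			},
		},
	}

	db, err := memdb.NewMemDB(schema)
	if err != nil {
		panic(err)
	}

	// Store one quota in the in-memory table.
	txn := db.Txn(true)
	if err := txn.Insert("rate-limit-quota", &RateLimitQuota{
		Name:          "demo",
		NamespacePath: "root/",
		MountPath:     "transit/",
		Rate:          10,
	}); err != nil {
		panic(err)
	}
	txn.Commit()

	// The lookup is a pure in-memory radix-tree query; no storage round trip.
	rtxn := db.Txn(false)
	defer rtxn.Abort()
	raw, err := rtxn.First("rate-limit-quota", "namespace-mount", "root/", "transit/")
	if err != nil {
		panic(err)
	}
	fmt.Printf("found quota: %+v\n", raw)
}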

@biazmoreira
Contributor

Thanks for all the context, @clwluvw. We have just merged the last change that we think could fix the latency issue you're seeing. Would you be able to test it once more with the latest commit from main?

@clwluvw
Contributor Author

clwluvw commented Sep 1, 2023

@biazmoreira I built from e1895d8 and, in addition, added a timing measurement to the rateLimitQuotaWrapping func (from its start until just before it calls handler.ServeHTTP(w, r)):

diff --git a/http/util.go b/http/util.go
index 19bdbf836c..f2be83350f 100644
--- a/http/util.go
+++ b/http/util.go
@@ -12,6 +12,7 @@ import (
 	"net"
 	"net/http"
 	"strings"
+	"time"
 
 	"github.com/hashicorp/vault/sdk/logical"
 
@@ -40,6 +41,7 @@ var (
 
 func rateLimitQuotaWrapping(handler http.Handler, core *vault.Core) http.Handler {
 	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		startTime := time.Now()
 		ns, err := namespace.FromContext(r.Context())
 		if err != nil {
 			respondError(w, http.StatusInternalServerError, err)
@@ -127,8 +129,8 @@ func rateLimitQuotaWrapping(handler http.Handler, core *vault.Core) http.Handler
 			return
 		}
 
+		core.Logger().Info("ratelimit time", "spend", time.Since(startTime).Seconds())
 		handler.ServeHTTP(w, r)
-		return
 	})
 }

and here are the sample times (in seconds):

2023-09-01T13:40:16.649Z [INFO]  core: ratelimit time: spend=0.00050588
2023-09-01T13:40:17.917Z [INFO]  core: ratelimit time: spend=0.000561915
2023-09-01T13:40:18.924Z [INFO]  core: ratelimit time: spend=0.000503233
2023-09-01T13:40:19.437Z [INFO]  core: ratelimit time: spend=0.000239324
2023-09-01T13:40:19.788Z [INFO]  core: ratelimit time: spend=0.00051691
2023-09-01T13:40:20.496Z [INFO]  core: ratelimit time: spend=0.000536714
2023-09-01T13:40:21.295Z [INFO]  core: ratelimit time: spend=0.000484667
2023-09-01T13:40:22.093Z [INFO]  core: ratelimit time: spend=0.000508987
2023-09-01T13:40:22.820Z [INFO]  core: ratelimit time: spend=0.000500172
2023-09-01T13:40:23.675Z [INFO]  core: ratelimit time: spend=0.000493217

Still, ~0.5ms is consumed by the wrapper.
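
For comparison, a standalone Go benchmark is a repeatable way to isolate the per-request cost of an HTTP middleware outside a running server. This is only a sketch with a stand-in wrapper (not Vault's rateLimitQuotaWrapping); the request path is just an example. Save it as e.g. wrapper_bench_test.go in its own package and run go test -bench=.:

package wrapperbench

import (
	"net/http"
	"net/http/httptest"
	"testing"
)

// noopWrapper stands in for the middleware whose per-request cost we want to isolate.
func noopWrapper(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// ... quota lookup / bookkeeping would happen here ...
		next.ServeHTTP(w, r)
	})
}

// inner is a trivial terminal handler so the benchmark measures mostly the wrapper.
var inner = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
	w.WriteHeader(http.StatusOK)
})

func benchmarkHandler(b *testing.B, h http.Handler) {
	req := httptest.NewRequest(http.MethodGet, "/v1/transit/datakey/plaintext/demo", nil)
	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		rec := httptest.NewRecorder()
		h.ServeHTTP(rec, req)
	}
}

// The difference between these two results approximates the wrapper's overhead.
func BenchmarkWithoutWrapper(b *testing.B) { benchmarkHandler(b, inner) }
func BenchmarkWithWrapper(b *testing.B)    { benchmarkHandler(b, noopWrapper(inner)) }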

@clwluvw
Contributor Author

clwluvw commented Sep 7, 2023

@biazmoreira Do you have any thoughts on this? Would the PR I linked to this issue be good for such use cases?

@biazmoreira
Contributor

biazmoreira commented Sep 14, 2023

Hey @clwluvw, we would like to understand your issue more in depth. Could you give us more details about the use case in which this problem appears? Was this meant to be only a performance benchmark, or are you facing this issue in production on every request you issue?

Unfortunately, we won't be moving forward with the suggestion implemented in your PR. We would like to understand your use case a bit more before we consider other solutions to this specific problem.

One suggestion, though, is to configure rate_limit_exempt_paths when configuring rate limit quotas. Be aware that this configuration option overrides the default values and only takes absolute paths.
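
For illustration, setting that exemption through the Go API client could look like the sketch below (it assumes the sys/quotas/config endpoint and its rate_limit_exempt_paths parameter from the Vault quotas API docs; the listed paths are only examples):

package main

import (
	"fmt"
	"log"

	vault "github.com/hashicorp/vault/api"
)

func main() {
	// Assumes VAULT_ADDR and VAULT_TOKEN are set in the environment.
	client, err := vault.NewClient(vault.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// This list overrides the defaults and takes absolute paths only, so include
	// any default exemptions you still want (e.g. sys/health).
	_, err = client.Logical().Write("sys/quotas/config", map[string]interface{}{
		"rate_limit_exempt_paths": []string{
			"sys/health",
			"transit/datakey/plaintext/my-key",
		},
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("updated sys/quotas/config")
}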

Looking forward to hearing more from you about the use case. Let me know if there's anything else I can help with!

@clwluvw
Contributor Author

clwluvw commented Sep 14, 2023

@biazmoreira The issue shows up on a default Vault installation (no rate limit configuration, authenticating with the root token, regardless of which API is called; in my case it was the one mentioned above): the rate limit handler always consumes ~0.5ms. This becomes a problem for installations where microseconds matter; you can't always ignore 0.5ms of overhead.

To reproduce it, apply the diff I pasted above, bring up a Vault instance (no HA, in-memory backend, no rate limits), issue a request to one of the APIs (preferably the one I mentioned), and look at the output. In my case I see ~0.5ms consumed by the wrapper even though I haven't configured any rate limits anywhere.
So my question is why the latency of every request I issue should be +0.5ms (i.e. I'm looking to reduce the effect of the rate limit wrapper, because I haven't configured any rate limits and don't plan to).

Can't we make the wrapper cost fewer microseconds, or make it configurable to disable it (as there is already a disable option for one of the other wrappers)?

@akshya96
Contributor

We acknowledge your concerns and share the belief that we should minimize unneeded latency, especially in request hot paths. However, there are instances where substantial latency increases cannot be entirely avoided. In practice, our preference is to refrain from adding configuration options to disable features solely due to latency concerns, as it may set an undesirable precedent. Imagine if we were to follow this approach for every feature that introduces latency. One potential solution to address your latency issues without resorting to feature toggles is to explore the option of using rate_limit_exempt_paths. Have you considered this as an alternative to your proposed solution?
While the benchmarks provided in the issue demonstrate a hypothetical latency increase, it’s important to note that actual latency can vary significantly based on performance characteristics and the load within the environment. At present, due to our team’s roadmap and resource constraints, we are unable to allocate resources to address this issue. However, if a compelling real-world example arises where this problem significantly impacts users, we would be willing to adjust our priorities accordingly. Until then, we may need to defer further action on this matter, but rest assured that we are open to revisiting it should the need arise.

@hsimon-hashicorp
Contributor

As per @akshya96's reply and our engineering methodologies, I'm going to go ahead and close this issue at this time. If needed, please feel free to re-open it.
