-
Notifications
You must be signed in to change notification settings - Fork 108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support true global reconcile rate limiting #294
Conversation
This PR tweaks how ratelimiters are applied to support _actual_ global reconcile rate limiting - that is all reconcile triggers are rate limited, not just some. See crossplane/crossplane#2595 for details. Signed-off-by: Nic Cope <negz@rk0n.org>
} | ||
r.limit.Forget(item) | ||
return r.inner.Reconcile(ctx, req) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two downsides to this approach:
- It rate limits work queue pops, not pushes. There's an increased chance of unbounded queue growth.
- It artificially inflates the
controller_runtime_reconcile_total
metric, since some 'reconciles' are rate limited and sent immediately back to the work queue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW in practice I haven't seen this being an issue during my scale testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@negz these two downsides seem mostly unavoidable to me unless we want to change a fair amount of the machinery we are consuming from controller-runtime
. That being said, it may be worth thinking about exposing custom metrics for "crossplane controllers" that are more informative. No action required in this PR, and I know you are already working on some of these things, but just thought I would mention while on my mind :)
@@ -63,6 +63,6 @@ type Options struct { | |||
func (o Options) ForControllerRuntime() controller.Options { | |||
return controller.Options{ | |||
MaxConcurrentReconciles: o.MaxConcurrentReconciles, | |||
RateLimiter: ratelimiter.NewController(o.GlobalRateLimiter), | |||
RateLimiter: ratelimiter.NewController(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is each controller uses NewController
to produce a per-item exponential backoff controller to rate limit reconciles that return Requeue: true
or a non-nil error
.
The NewGlobal
reconciler is passed to ratelimiter.NewReconciler
to rate limit all reconciles.
name string | ||
inner reconcile.Reconciler | ||
limit ratelimiter.RateLimiter | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could see a pretty good case for this being in pkg/reconciler
package instead. We have a practice of using subpackages there to avoid collisions so it could either be ratelimited.Reconciler
in pkg/reconciler/ratelimited
or we could make pkg/reconciler
a package and have it be reconciler.RateLimited
. No strong feelings really - here feels pretty good too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally am a fan of having it here as I consider it as rate limiter that happens to be implemented as a reconciler 👍🏻
Signed-off-by: Nic Cope <negz@rk0n.org>
99a8505
to
f7ed086
Compare
Signed-off-by: Nic Cope <negz@rk0n.org>
@muvaf @hasheddan ping. Please don't review this on the weekend, but I would really like to get unblocked here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this improvement @negz and apologies for the long delay in review! A few notes below, but LGTM!
name string | ||
inner reconcile.Reconciler | ||
limit ratelimiter.RateLimiter | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally am a fan of having it here as I consider it as rate limiter that happens to be implemented as a reconciler 👍🏻
// for handling requests that have not been and will not be rate limited without | ||
// blocking. | ||
func (r *Reconciler) when(req reconcile.Request) time.Duration { | ||
item := r.name + req.String() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: just noting that in the only place we call when()
above, we have already generated item
so it could either be passed or we could opt to move it to after the when()
call in the caller:
https://github.com/crossplane/crossplane-runtime/pull/294/files#diff-6a81a551d3f3b9bd6b0060b6f5ca49734ed2c33a2a805dcb283fb6f7f8d4c1f5R49
} | ||
r.limit.Forget(item) | ||
return r.inner.Reconcile(ctx, req) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@negz these two downsides seem mostly unavoidable to me unless we want to change a fair amount of the machinery we are consuming from controller-runtime
. That being said, it may be worth thinking about exposing custom metrics for "crossplane controllers" that are more informative. No action required in this PR, and I know you are already working on some of these things, but just thought I would mention while on my mind :)
Description of your changes
This PR tweaks how ratelimiters are applied to support actual global reconcile rate limiting - that is all reconcile triggers are rate limited, not just some.
See crossplane/crossplane#2595 for details.
I have:
make reviewable test
to ensure this PR is ready for review.How has this code been tested
I've built crossplane/crossplane against this change and tested that it works using https://github.com/negz/crossplane-scale. I've tested this with up to 1,000 claims, each with four composed provider-gcp resources. I think there are more improvements that can be made, but that doesn't need to block moving forward here.