Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support true global reconcile rate limiting #294

Merged
merged 3 commits into from
Oct 25, 2021

Conversation

negz
Copy link
Member

@negz negz commented Sep 24, 2021

Description of your changes

This PR tweaks how ratelimiters are applied to support actual global reconcile rate limiting - that is all reconcile triggers are rate limited, not just some.

See crossplane/crossplane#2595 for details.

I have:

  • Read and followed Crossplane's contribution process.
  • Run make reviewable test to ensure this PR is ready for review.

How has this code been tested

I've built crossplane/crossplane against this change and tested that it works using https://github.com/negz/crossplane-scale. I've tested this with up to 1,000 claims, each with four composed provider-gcp resources. I think there are more improvements that can be made, but that doesn't need to block moving forward here.

This PR tweaks how ratelimiters are applied to support _actual_ global reconcile
rate limiting - that is all reconcile triggers are rate limited, not just some.

See crossplane/crossplane#2595 for details.

Signed-off-by: Nic Cope <negz@rk0n.org>
}
r.limit.Forget(item)
return r.inner.Reconcile(ctx, req)
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two downsides to this approach:

  1. It rate limits work queue pops, not pushes. There's an increased chance of unbounded queue growth.
  2. It artificially inflates the controller_runtime_reconcile_total metric, since some 'reconciles' are rate limited and sent immediately back to the work queue.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW in practice I haven't seen this being an issue during my scale testing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@negz these two downsides seem mostly unavoidable to me unless we want to change a fair amount of the machinery we are consuming from controller-runtime. That being said, it may be worth thinking about exposing custom metrics for "crossplane controllers" that are more informative. No action required in this PR, and I know you are already working on some of these things, but just thought I would mention while on my mind :)

@@ -63,6 +63,6 @@ type Options struct {
func (o Options) ForControllerRuntime() controller.Options {
return controller.Options{
MaxConcurrentReconciles: o.MaxConcurrentReconciles,
RateLimiter: ratelimiter.NewController(o.GlobalRateLimiter),
RateLimiter: ratelimiter.NewController(),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is each controller uses NewController to produce a per-item exponential backoff controller to rate limit reconciles that return Requeue: true or a non-nil error.

The NewGlobal reconciler is passed to ratelimiter.NewReconciler to rate limit all reconciles.

name string
inner reconcile.Reconciler
limit ratelimiter.RateLimiter
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could see a pretty good case for this being in pkg/reconciler package instead. We have a practice of using subpackages there to avoid collisions so it could either be ratelimited.Reconciler in pkg/reconciler/ratelimited or we could make pkg/reconciler a package and have it be reconciler.RateLimited. No strong feelings really - here feels pretty good too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally am a fan of having it here as I consider it as rate limiter that happens to be implemented as a reconciler 👍🏻

pkg/ratelimiter/reconciler.go Outdated Show resolved Hide resolved
@negz negz force-pushed the re-re-re-reconcile branch from 99a8505 to f7ed086 Compare September 25, 2021 02:41
Signed-off-by: Nic Cope <negz@rk0n.org>
@negz negz marked this pull request as ready for review October 19, 2021 19:57
@negz negz requested review from hasheddan and muvaf October 19, 2021 20:24
@negz
Copy link
Member Author

negz commented Oct 22, 2021

@muvaf @hasheddan ping. Please don't review this on the weekend, but I would really like to get unblocked here.

Copy link
Member

@hasheddan hasheddan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this improvement @negz and apologies for the long delay in review! A few notes below, but LGTM!

name string
inner reconcile.Reconciler
limit ratelimiter.RateLimiter
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally am a fan of having it here as I consider it as rate limiter that happens to be implemented as a reconciler 👍🏻

// for handling requests that have not been and will not be rate limited without
// blocking.
func (r *Reconciler) when(req reconcile.Request) time.Duration {
item := r.name + req.String()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: just noting that in the only place we call when() above, we have already generated item so it could either be passed or we could opt to move it to after the when() call in the caller:
https://github.com/crossplane/crossplane-runtime/pull/294/files#diff-6a81a551d3f3b9bd6b0060b6f5ca49734ed2c33a2a805dcb283fb6f7f8d4c1f5R49

}
r.limit.Forget(item)
return r.inner.Reconcile(ctx, req)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@negz these two downsides seem mostly unavoidable to me unless we want to change a fair amount of the machinery we are consuming from controller-runtime. That being said, it may be worth thinking about exposing custom metrics for "crossplane controllers" that are more informative. No action required in this PR, and I know you are already working on some of these things, but just thought I would mention while on my mind :)

@negz negz merged commit 67edf4a into crossplane:master Oct 25, 2021
@negz negz deleted the re-re-re-reconcile branch October 25, 2021 19:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants