[chore][docs/rfc] Add RFC on configuring confmap Providers #10121

Open
wants to merge 7 commits into
base: main

evan-bradley
Contributor

Description

Documents how we want to configure confmap Providers, with the default decision being to use URI fragments.

Per the SIG discussion on 2024-05-08, I have included as many alternatives as I could think of. I think a few of them have merit, but I don't think any of the options are without downsides.

@evan-bradley evan-bradley requested a review from a team as a code owner May 8, 2024 19:51

codecov bot commented May 8, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.48%. Comparing base (c6b70a7) to head (f566140).
Report is 75 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #10121      +/-   ##
==========================================
+ Coverage   91.67%   92.48%   +0.81%     
==========================================
  Files         362      387      +25     
  Lines       16754    18244    +1490     
==========================================
+ Hits        15359    16873    +1514     
+ Misses       1056     1025      -31     
- Partials      339      346       +7     


1. Broadly applicable to all current upstream providers.
2. Can be used as a suggestion for configuring custom providers.
3. Is consistent between Providers.
4. Is configurable per config URI.
Member

Are there things we want to configure per-provider? That is, is there a need for a 'configure this for all invocations of a given provider' option versus a 'configure this for this specific usage of the provider' option? I am thinking about something like the TLS certificate bundle you want to use for the HTTPS provider. To be clear, for this particular case we have ways to 'configure' that today (install your extra certificate at the system level, handle HTTPS outside of the Collector...), but I am wondering if this pattern exists outside of it.

Contributor Author

This is probably something we should at least consider. I think it would be reasonable to allow users to specify options that apply to all URI resolutions instead of having to configure every resolution individually. In addition to the case you called out, I can think of a couple of cases where we may want this:

  • HTTP(S) provider: Configure any client settings: timeouts, certificates, headers, etc.
  • env provider: Behavior for when env vars aren't set
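
One way to picture provider-wide settings alongside per-URI settings is as two layers of options that get merged before a URI is resolved. The following is a minimal sketch only: `mergeOptions` and the option names `timeout` and `ca-file` are hypothetical, and letting the per-URI value win over the provider-wide default is an assumption, not something decided in this thread.

```go
package main

import "fmt"

// mergeOptions layers per-URI options over provider-wide defaults.
// Letting the per-URI value win is an assumption made for this sketch.
func mergeOptions(defaults, perURI map[string]string) map[string]string {
	merged := make(map[string]string, len(defaults)+len(perURI))
	for k, v := range defaults {
		merged[k] = v
	}
	for k, v := range perURI {
		merged[k] = v
	}
	return merged
}

func main() {
	// Hypothetical provider-wide settings for an HTTP(S) provider.
	defaults := map[string]string{"timeout": "30s", "ca-file": "/etc/ssl/custom.pem"}
	// Hypothetical settings attached to one specific config URI.
	perURI := map[string]string{"timeout": "5s"}

	fmt.Println(mergeOptions(defaults, perURI))
}
```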

Contributor Author

I've added a section on this.

Comment on lines 51 to 54
Providers are invoked through passing `--config` flags to the Collector binary
with a scheme and URI to obtain. A single instance of each Provider is then
tasked with retrieving config from all URIs passed to the Collector for the
scheme the provider is registered to handle.
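
To make the quoted paragraph concrete, here is a rough sketch of how `--config` values could be grouped by scheme so that one Provider instance sees every URI for its scheme. The helpers `splitConfigURI` and `groupByScheme` are hypothetical names for illustration and are not the confmap Resolver's actual code; the sketch also ignores edge cases such as Windows drive letters.

```go
package main

import (
	"fmt"
	"strings"
)

// splitConfigURI separates a --config value into its scheme and the
// remainder that the Provider registered for that scheme would receive.
// Hypothetical helper for illustration only.
func splitConfigURI(uri string) (string, string, error) {
	scheme, location, found := strings.Cut(uri, ":")
	if !found || scheme == "" {
		return "", "", fmt.Errorf("config URI %q has no scheme", uri)
	}
	return scheme, location, nil
}

// groupByScheme collects the locations each scheme's single Provider
// instance would be asked to retrieve.
func groupByScheme(uris []string) (map[string][]string, error) {
	grouped := make(map[string][]string)
	for _, uri := range uris {
		scheme, location, err := splitConfigURI(uri)
		if err != nil {
			return nil, err
		}
		grouped[scheme] = append(grouped[scheme], location)
	}
	return grouped, nil
}

func main() {
	grouped, err := groupByScheme([]string{
		"file:base.yaml",
		"file:overrides.yaml",
		"env:EXTRA_CONFIG",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for scheme, locations := range grouped {
		fmt.Println(scheme, locations)
	}
}
```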
Member

We can also use providers within a configuration file, right? We need to consider both cases.

Contributor Author

Good call, we do need to consider both cases. I think all options would work in both contexts right now, though with varying UX quality.

docs/rfcs/confmap-provider-configuration.md (outdated)

- Explicitly intended to specify non-hierarchical data in a URI.
- Often used for this purpose.
- Fits into existing config URIs.
Member

Not sure how well it fits with the env provider; I would expect people to want to use Bash-like or PowerShell-like syntax there.

Contributor Author

Good point, I added a note to make this clear.

Comment on lines +111 to +117
We could likely partially circumvent the key-value pair limitation by
recursively calling confmap Providers to resolve files, env vars, HTTP URLs,
etc. For example:

```text
https://config.com/config#refresh-interval=env:REFRESH_INTERVAL&headers=file:headers.yaml
```
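
As a sketch of how such a URI could be taken apart, the fragment can be read with Go's standard `net/url` package and split into key-value pairs. Treating the fragment like a query string (`&`-separated `key=value` pairs) is an assumption made here; the quoted RFC text does not fix a grammar, and `fragmentOptions` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"net/url"
)

// fragmentOptions parses the fragment of a config URI as "&"-separated
// key=value pairs. Reusing query-string parsing for the fragment is an
// assumption of this sketch, not a format defined by the RFC.
func fragmentOptions(configURI string) (url.Values, error) {
	u, err := url.Parse(configURI)
	if err != nil {
		return nil, err
	}
	return url.ParseQuery(u.Fragment)
}

func main() {
	opts, err := fragmentOptions(
		"https://config.com/config#refresh-interval=env:REFRESH_INTERVAL&headers=file:headers.yaml",
	)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Each value is still a nested URI that a second resolution step
	// would have to hand to the env or file Provider.
	fmt.Println(opts.Get("refresh-interval"))
	fmt.Println(opts.Get("headers"))
}
```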
Member

You can do this since #7504 with the following syntax:

https://config.com/config#refresh-interval=${env:REFRESH_INTERVAL}&headers=${file:headers.yaml}

At least, this works within a configuration file; I am not sure about the CLI.

Contributor Author

I think the braces syntax only applies to URIs in files and doesn't work in the CLI. The reason I simplified the syntax a little was to avoid requiring users to escape characters that a shell may consume like $. On the other hand, we need some obvious way to indicate that these are URIs for extra config and not values in themselves. I'm not sure the way I have it written right now will suffice for that.

We also need to consider the way the substitution works for these options. If we have the confmap Resolver do resolution for embedded config URIs, headers.yaml would need to be marshaled into a single line, such that this:

https://config.com/config#refresh-interval=${env:REFRESH_INTERVAL}&headers=${file:headers.yaml}

would resolve to this:

https://config.com/config#refresh-interval=10s&headers=['Api-Token':'0xdeadbeef' 'Accept-Encoding':'text/yaml']

The Resolver would need to know when to "stringify" files like this vs. when to substitute the direct contents of a file. Or we just require users to write the stringified version, which simplifies things for us at the cost of worse UX since in my opinion this representation would be unpleasant to work with.

My assumption was that we implement custom unmarshaling logic for the URI fragment options that takes all of this into account. So ${file:headers.yaml} is unmarshaled directly into a map that is assigned to a Headers field on a Provider's config struct.
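
The custom unmarshaling idea described above could look roughly like the sketch below. Everything named here is hypothetical: `httpProviderConfig`, `resolveFunc`, `embeddedURI`, and `unmarshalFragment` are illustration-only names, and the nested Provider lookup is replaced by a stub so the example runs on its own. The point it tries to show is that a resolved `${file:headers.yaml}` value is assigned to the `Headers` field as a map and is never flattened into a single-line string.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
	"time"
)

// httpProviderConfig is a hypothetical options struct for an HTTP(S) Provider.
type httpProviderConfig struct {
	RefreshInterval time.Duration
	Headers         map[string]string
}

// resolveFunc stands in for a nested Provider lookup. A real implementation
// would go through the confmap Resolver; here the caller supplies a stub.
type resolveFunc func(uri string) (any, error)

// embeddedURI returns the URI inside "${...}" and whether value had that form.
func embeddedURI(value string) (string, bool) {
	if strings.HasPrefix(value, "${") && strings.HasSuffix(value, "}") {
		return value[2 : len(value)-1], true
	}
	return "", false
}

// unmarshalFragment maps fragment options onto typed config fields,
// resolving embedded URIs to structured values instead of strings.
func unmarshalFragment(fragment string, resolve resolveFunc) (httpProviderConfig, error) {
	var cfg httpProviderConfig
	opts, err := url.ParseQuery(fragment)
	if err != nil {
		return cfg, err
	}
	for key := range opts {
		raw := opts.Get(key)
		var value any = raw
		if uri, ok := embeddedURI(raw); ok {
			resolved, err := resolve(uri)
			if err != nil {
				return cfg, err
			}
			value = resolved
		}
		switch key {
		case "refresh-interval":
			s, ok := value.(string)
			if !ok {
				return cfg, fmt.Errorf("option %q must resolve to a string", key)
			}
			d, err := time.ParseDuration(s)
			if err != nil {
				return cfg, err
			}
			cfg.RefreshInterval = d
		case "headers":
			m, ok := value.(map[string]string)
			if !ok {
				return cfg, fmt.Errorf("option %q must resolve to a map", key)
			}
			cfg.Headers = m
		default:
			return cfg, fmt.Errorf("unknown option %q", key)
		}
	}
	return cfg, nil
}

func main() {
	// Stub resolver: pretends the env var and the file were already read.
	stub := func(uri string) (any, error) {
		switch uri {
		case "env:REFRESH_INTERVAL":
			return "10s", nil
		case "file:headers.yaml":
			return map[string]string{"Api-Token": "0xdeadbeef", "Accept-Encoding": "text/yaml"}, nil
		}
		return nil, fmt.Errorf("no stub for %q", uri)
	}
	cfg, err := unmarshalFragment(
		"refresh-interval=${env:REFRESH_INTERVAL}&headers=${file:headers.yaml}", stub)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cfg.RefreshInterval, cfg.Headers)
}
```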

currently only takes valid URIs, and updating the format to accommodate this
would require we adopt an unconventional format.

### Separate flags to configure Providers per config URI
Member

I am not entirely opposed to allowing configuration via flags, but if we ever do this I would do it with a flag provider that you can call like ${flag:my-flag-name}, and it would give you the (YAML-parsed?) value of --my-flag-name=val (or possibly --some-namespace-here-my-flag-name).

This would have to piggyback onto some other solution that allows passing this ${flag:my-flag-name} value, though.

Contributor Author

If we were to take this approach, how could we determine which flags are valid? It sounds like flag names could be arbitrary, so would passing --my-unsupported-flag cause an error, or just store a value in the flag provider that goes unused? Or am I entirely misunderstanding your suggestion?
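
For readers following this thread, a flag provider along the lines suggested could be sketched as below. This is an illustration under assumptions, not a proposal from either participant: `flagProvider`, `newFlagProvider`, and `retrieve` are hypothetical names, values are returned as plain strings with no YAML parsing, and the sketch accepts arbitrary flag names, so an unused flag such as `--my-unsupported-flag` is stored silently. That is only one possible answer to the validation question raised above.

```go
package main

import (
	"fmt"
	"strings"
)

// flagProvider is a hypothetical Provider for a "flag" scheme. It keeps the
// raw string value of every --name=value argument it was given.
type flagProvider struct {
	values map[string]string
}

// newFlagProvider collects all "--name=value" arguments. It does not
// validate names, so unknown flags are stored rather than rejected.
func newFlagProvider(args []string) *flagProvider {
	p := &flagProvider{values: make(map[string]string)}
	for _, arg := range args {
		if !strings.HasPrefix(arg, "--") {
			continue
		}
		name, value, found := strings.Cut(strings.TrimPrefix(arg, "--"), "=")
		if found {
			p.values[name] = value
		}
	}
	return p
}

// retrieve returns the value for a URI of the form "flag:<name>".
func (p *flagProvider) retrieve(uri string) (string, error) {
	if !strings.HasPrefix(uri, "flag:") {
		return "", fmt.Errorf("%q is not a flag URI", uri)
	}
	name := strings.TrimPrefix(uri, "flag:")
	value, ok := p.values[name]
	if !ok {
		return "", fmt.Errorf("flag --%s was not passed", name)
	}
	return value, nil
}

func main() {
	p := newFlagProvider([]string{"--my-flag-name=val", "--my-unsupported-flag=1"})
	value, err := p.retrieve("flag:my-flag-name")
	fmt.Println(value, err)
}
```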


## Resolution

We will configure providers through URI fragments. These are seldom used in
Member

We may want to explicitly exclude the env provider from here

Contributor Author

I called this out in an "Exceptions" section above, let me know if that covers it.

Member

@TylerHelmuth TylerHelmuth left a comment

Can the RFC include any prior art on this type of configuration? My guess is that it'll be CLI flags that apply to all Providers.

each URI and is suboptimal UX.
- Complicating the flags like this would be suboptimal UX.

### Separate config file
Member

I haven't gotten to read this whole RFC yet, but I very much want to avoid this option as I think it will be pretty confusing.

Contributor Author

I agree, a config file to configure a config file isn't ideal.

@evan-bradley
Contributor Author

Can the RFC include any prior art on this type of configuration?

I spent some time thinking about this, but I'm not aware of anything that matches what we're doing closely enough to make sense. The best I can think of would be infrastructure as code tools like Terraform, Pulumi, or CloudFormation, but I think what they are doing is sufficiently different that it's hard to use them as a basis. If there are other tools or approaches you think would be worth investigating, let me know and I can take a look.

I can also do a writeup on how our configuration fits into the larger landscape of infrastructure configuration approaches if you think it would be beneficial to take a broader look at our whole approach, but I didn't want to detract from the more practical discussion at hand of how to do things like authenticate requests in our HTTP providers.

Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label May 29, 2024