New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Log a warning when an environment variable without a value is expanded #5615
Comments
Another possible "fix" for this is to log the full collector config at startup. IIRC this is what telegraf does. This would allow the user to see that their config is messed up. One major issue with this solution is that you would need to reliably hide secrets while also making it possible to see if an expansion failed. We currently don't have any way to distinguish between expansions that are secret and should be obfuscated in logs and expansions that are not. |
I think we should fail fast instead of logging a warning. This is a breaking change, but I think it's worth it in terms of improved troubleshooting and we can have a feature gate to make the transition smoother. We probably also want to prioritize implementing #5228, which will be useful in this case. For printing the configuration, we have #5223. If you have ideas on how to reliably hide secrets please participate in the discussion there! |
I also think a fail-fast strategy is a good one. I didn't suggest it initially because I didn't know how conservative the collector SIG tends to be about those types of breaks. One possible midway solution is to have a strict mode which would toggle behavior between warn and failure edit: a strict mode may also allow other similar enhancements such as failing to start up when an unknown config key is present which may catch typos in some cases |
We already do this by default, so it makes all the more sense to be consistent with environment variables :) If I find some time this week I will try to address this |
I started making a PoC for raising a warning (and failing if enabled through a feature gate) on #5734. Since the logger is not available when loading the configuration, we have to signal the need to log a warning in some way. On the PR, I did this with a special error type ('NonFatalError') that needs to be handled by all configuration-related functions. There are other alternatives, like changing providers/converters signature to return some sort of 'diagnostics' struct that includes any warnings, or giving up on logging a warning altogether. @open-telemetry/collector-approvers What do you think is the best approach? |
I am adding this to the confmap milestone, we need to discuss if we want to change the behavior or if we want to make any changes on the Go API to allow for non fatal errors before reaching 1.0. |
Not a fan if the idea of failing if environment expansion encounters an unset value. That may or may not be an error as a user may wish to use that to add context to a value rather than be the only content of a value. Logging that a replacement failed makes sense and would be a good addition. |
Would it be possible to support that through support for setting a default value if unset? (Not sure if we should do that, I am just trying to understand your point here) |
Maybe, but in that case I'd argue that the default default value should be an empty string which takes us right back to where we are today. |
…nd providers (#9443) **Description:** For both #5615 and #9162 we need to be able to log during the confmap resolution. My proposed solution is to pass a `*zap.Logger` to converters and providers during initialization. These components can then use this to log any warnings they need. This PR does the first step: being able to pass anything to converters and providers during initialization. The obvious alternative to this is to change the interface of `confmap.Provider` and `confmap.Converter` to pass any warnings in an explicit struct. I think the `*zap.Logger` alternative is more natural for developers of providers and converters: you just use a logger like everywhere else in the Collector. One problem for the Collector usage of `confmap` is: How does one pass a `*zap.Logger` before knowing how a `*zap.Logger` should be configured? I think we can work around this by: 1. Passing a special 'deferred' Logger that just stores the warnings without actually logging them (we can use something like `zaptest/observer` for this) 2. Resolving configuration 3. Building a `*zap.Logger` with said configuration 4. Logging the entries stored in (1) with the logger from (3) (using `zaptest/observer` we can do that by taking the `zapcore.Core` out of the logger and manually writing) **We don't actually need ProviderSettings today, just ConverterSettings, but I think it can still be useful.** **Link to tracking Issue:** Relates to #5615 and #9162 --------- Co-authored-by: Evan Bradley <11745660+evan-bradley@users.noreply.github.com>
This was discussed in SIG meeting 3/13 as part of the open-telemetry/opentelemetry-collector-contrib#9984 issue review. |
…9837) **Description:** Creates a logger in the confmap.ProviderSettings and uses it to log when there is a missing or blank environment variable referenced in config. For now the noop logger is used everywhere except tests. **Link to tracking Issue:** [5615](#5615) **Testing:** I wrote unit tests that ensured 1. logging occurred when an environment variable was unset 2. logging occcured when the env var was empty. 3. there was no log when an env var was used correctly I also started the otel collector with the sample config - and added an env var reference in the sample config. I then inserted a print statement next to each log call to see whether my code paths were hit in the live application. I then went through the 3 cases mentioned above and ensured that logging behavior was accurate.
@TylerHelmuth No, because the logger is noop. We need to pass a functioning logger, which is still a WIP |
#### Description This is an RFC to help us decide how we want `otelcol` to provide a logger before the primary logger is created. As we discuss I will update the doc. Before this is merged we should have decided on a solution and the Accepted Solution section must be updated. Related to #10056 #### Link to tracking issue This unblocks: - #9162 - #5615 --------- Co-authored-by: Pablo Baeyens <pbaeyens31+github@gmail.com> Co-authored-by: Evan Bradley <11745660+evan-bradley@users.noreply.github.com>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> #### Description Provides a logger to confmap that buffers logs in memory until the primary logger can be used. Once the primary logger exists, places that used the original logger are given the updated Core. If an error occurs that would shut down the collector before the primary logger could be created, the logs are written to stdout/err using a fallback logger. Alternative to #10008 I've pushed the testing I did to show how the logger successfully updates. Before config resolution the debug log in confmap is not printed, but afterwards it is. test config: ```yaml receivers: nop: exporters: otlphttp: endpoint: http://0.0.0.0:4317 headers: # Not set x-test: ${env:TEMP3} debug: # set to "detailed" verbosity: $TEMP service: telemetry: logs: level: debug pipelines: traces: receivers: - nop exporters: - debug ``` ![image](https://github.com/open-telemetry/opentelemetry-collector/assets/12352919/6a17993f-1f97-4c54-9165-5c34dd58d108) <!-- Issue number if applicable --> #### Link to tracking issue Related to #9162 Related to #5615 <!--Describe what testing was performed and which tests were added.--> #### Testing If we like this approach I'll add tests <!--Describe the documentation added.--> #### Documentation --------- Co-authored-by: Dan Jaglowski <jaglows3@gmail.com> Co-authored-by: Pablo Baeyens <pbaeyens31+github@gmail.com>
Related to #5614
Is your feature request related to a problem? Please describe.
In open-telemetry/opentelemetry-collector-contrib#11846 the user was trying to use
"$ConnectionString"
as a Username. The collector automatically interprets$
as an environment variable expansion. The environment variable was empty, so theUsername
field was interpreted as empty and failed validation. There was nothing logged in the collector to alert this user that their configuration was being mangled by the failed environment expansion.Describe the solution you'd like
When an environment variable expansion has no value, log a warning.
The text was updated successfully, but these errors were encountered: