New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
log: split streams also for text output #114672
Conversation
92f97f3
to
cf23703
Compare
This PR is now complete and ready for review and merging. @kaisoz: you wanted to learn more about logging. This PR shows off some of the machinery that we have for it in Kubernetes, but admittedly it throws you in at the deep end... @mengjiao-liu : you are better suited to do the actual review. /assign @mengjiao-liu I would have asked @serathius for a review as he originally suggested implementing split output, but he probably has no time. As this PR claims to improve performance, let's run benchmarks. The base is commit a15bda8, without changes to text logging. When comparing against the new implementation with
Comparing The most striking difference is With
There's no mutex locking involved at all. Previously with klog, the path is more convoluted:
In particular, Somewhere I also managed to remove some allocations (only benchmarked for this case, with less performance improvements when using
The The diff --git a/test/integration/logs/benchmark/benchmark_test.go b/test/integration/logs/benchmark/benchmark_test.go
index abc92e55531..da01b404247 100644
--- a/test/integration/logs/benchmark/benchmark_test.go
+++ b/test/integration/logs/benchmark/benchmark_test.go
@@ -77,11 +77,16 @@ func BenchmarkEncoding(b *testing.B) {
// To make the tests a bit more realistic, at
// least do system calls during each write.
output := newBytesWritten(b, "/dev/null")
+ output2 := newBytesWritten(b, "/dev/null")
c := logsapi.NewLoggingConfiguration()
c.Format = format
+ c.Options.Text.SplitStream = true
+ if err := c.Options.Text.InfoBufferSize.Set("1Mi"); err != nil {
+ b.Fatalf("set info buffer size: %v", err)
+ }
o := logsapi.LoggingOptions{
ErrorStream: output,
- InfoStream: output,
+ InfoStream: output2,
}
klog.SetOutput(output)
defer func() { Using
However, here it is a bit suspicious that JSON also runs faster. Perhaps I made a mistake somewhere with my invocations or my machine wasn't fully idle during some run. Overall I think this is a good step forward and worth merging. |
Thanks @pohly ! I'll have a look, this will definitely help me to learn and go deeper 👍🏻 |
Copy that. Let me look into it. |
fs.BoolVar(&c.Options.Text.SplitStream, "log-text-split-stream", false, "[Alpha] In text format, write error messages to stderr and info messages to stdout. The default is to write a single stream to stdout. Enable the LoggingAlphaOptions feature gate to use this.") | ||
fs.Var(&c.Options.Text.InfoBufferSize, "log-text-info-buffer-size", "[Alpha] In text format with split output streams, the info messages can be buffered for a while to increase performance. The default value of zero bytes disables buffering. The size can be specified as number of bytes (512), multiples of 1000 (1K), multiples of 1024 (2Ki), or powers of those (3M, 4G, 5Mi, 6Gi). Enable the LoggingAlphaOptions feature gate to use this.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the code implementation point of view, I do not see the feature gate LoggingAlphaOptions
restrictions on these fields c.Options.Text.SplitStream
and c.Options.Text.InfoBufferSize
, are these missing? Or did I miss something?
I only found validation for the JSON
format, not Text
format.
kubernetes/staging/src/k8s.io/component-base/logs/api/v1/options.go
Lines 195 to 203 in a4a5670
func validateJSONOptions(c *LoggingConfiguration, featureGate featuregate.FeatureGate, fldPath *field.Path) field.ErrorList { | |
errs := field.ErrorList{} | |
if gate := LoggingAlphaOptions; c.Options.JSON.SplitStream && !featureEnabled(featureGate, gate) { | |
errs = append(errs, field.Forbidden(fldPath.Child("splitStream"), fmt.Sprintf("Feature %s is disabled", gate))) | |
} | |
if gate := LoggingAlphaOptions; c.Options.JSON.InfoBufferSize.Value() != 0 && !featureEnabled(featureGate, gate) { | |
errs = append(errs, field.Forbidden(fldPath.Child("infoBufferSize"), fmt.Sprintf("Feature %s is disabled", gate))) | |
} | |
return errs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, that was missing for text options. Fixed.
I copied the code although it's the same because (theoretically) the options might progress towards beta at a different pace and/or other options might be added to the structs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ready for LGTM now?
There is a sentence in the release note: "Performance is better also without split streams." Is this sentence wrong? Judging from the overall description and the performance test results you posted, the performance is better after splitting the streams. |
This replaces the klog formatting and message routing with a simpler implementation that uses less code. The main difference is that we skip the entire unused message routing. Instead, the same split output streams as for JSON gets implemented in the io.Writer implementation that gets passed to the textlogger.
a15bda8
to
8f4c9c7
Compare
Tests cases like |
That makes sense. |
This PR may require API review. If so, when the changes are ready, complete the pre-review checklist and request an API review. Status of requested reviews is tracked in the API Review project. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
LGTM label has been added. Git tree hash: 960fb3f875bada4474eedb79814bb260ac7f95b9
|
/assign @thockin For component-base/logging API + implementation approval. |
Can you explain the motivation for this? I, perhaps naively, thought that our goal was to replace klog as an API with logr or slog, and klog as an implementation with some other log library which is actively maintained. I am reticent to make klog better and better - it's not our "core competency".
How much better, and if so - why are we doing this? |
This is not a goal of any KEP at the moment. I keep thinking about it from time to time, but I'm always coming to the conclusion that it's going to cause breakage that isn't justified by the benefits. If we were to replace klog as we know it today, then this PR is a step in that direction. That's because the actual code that then gets used from klog gets reduced to just textlogger and the shared output formatting under internal. We no longer need most of klog.go. There are other parts in klog that we would need to keep (like The idea that I keep coming back to is a klog v3 with most of
klog is actively maintained. It has to be, because it is so widely used. I also find the text output more useful than JSON and the slog text output, so I think Kubernetes should keep supporting both JSON and klog text output.
See #114672 (comment) above. We are doing this because the text output is being used in Kubernetes. |
Thanks! /lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: pohly, thockin The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What type of PR is this?
/kind feature
What this PR does / why we need it:
Buffering info messages and letting error messages be written without first flushing info messages increases performance. This was previously implemented only for JSON, but some users may also want it for text.
The other aspect of the PR is to completely replace legacy code in klog with a cleaner, simpler implementation. The k8s.io/klog/v2/textlogger does that. When it gets installed as logger in klog, message routing is as follows:
Without it, legacy klog code is involved:
Special notes for your reviewer:
All opens were resolved:
Does this PR introduce a user-facing change?
/wg structured-logging
/sig instrumentation