[consumer] Allow annotating consumer errors with metadata #9041

evan-bradley · 2023-12-05T03:04:02Z

Description:

Revival of #7439

This explores one possible way to add metadata to errors returned from consumers. The goal is to transmit more data back up the pipeline when a stage fails so that an upstream component can use it, e.g. a component that retries data, or a receiver that propagates an error code back to the sender.

The current design eliminates the permanent/retryable error types in favor of a single error type that supports adding signal data to be retried. If no data is added to be retried, the error is considered permanent. For the sake of simplicity, no distinction is currently made between the signals; the caller should know which signal is in use when retrieving the retryable items from the error. Any settings for retrying the data (e.g. a delay) are offered as options when adding data to retry.

The error type currently supports a few general metadata fields that are copied when a downstream error is wrapped:

- Partial successes can be expressed by setting the number of rejected items.
- gRPC and HTTP status codes can be set and translated between if necessary.

Link to tracking Issue:

Resolves #7047

cc @dmitryax

consumer/consumererror/README.md

consumer/consumererror/consumererror.go

jmacd · 2024-01-10T19:11:16Z

Happy to see this added. As discussed in #9260, there is a potential to propagate backwards the information contained in PartialSuccess responses from OTLP exports.

I worry about the code complexity introduced by having "success error" responses, meaning error != nil but the interpretation being success. However, this is what it will take to back-propagate partial successes: we want callers to see success with metadata about the number of rejected points if possible. Great to see this, thanks @evan-bradley.

jmacd · 2024-01-10T19:12:19Z

As discussed in open-telemetry/oteps#238, it would be useful for callers to have access to the underlying gRPC and/or HTTP error code so they can set the correct otel.outcome label. Thanks!

consumer/consumererror/consumererror.go

consumer/consumererror/partial.go

github-actions · 2024-02-22T03:15:22Z

This PR was marked stale due to lack of activity. It will be closed in 14 days.

github-actions · 2024-03-21T03:15:44Z

This PR was marked stale due to lack of activity. It will be closed in 14 days.

…adic arguments (#10041) Call out that unnamed types, e.g. the function signature of an exported function, should not be relied upon by API consumers. In particular, updating a function to be variadic will break users who were depending on that function's signature. #### Link to tracking issue Helps #9041 Co-authored-by: Evan Bradley <evan-bradley@users.noreply.github.com> Co-authored-by: Pablo Baeyens <pablo.baeyens@datadoghq.com> Co-authored-by: Alex Boten <223565+codeboten@users.noreply.github.com>

consumer/consumererror/statuserrors.go

TylerHelmuth · 2024-05-15T21:01:25Z

consumer/consumererror/networkerrors.go

+}
+
+// NewHTTPError wraps an error with a given HTTP status code.
+func NewHTTPError(err error, code int) error {


Should all of the new New* functions for the new error types take options so that we don't have to break the function signature in the future?

I'm split on this. On one hand, that seems like a good idea. On the other, without a clear use-case in mind right now, I'm worried it will end up being dead code. Can you think of any potential options we could add to these errors?

maybe a WithLogger option for network errors if we ever wanted to add debug statements or something. I don't really have a good idea in mind.

With our updated policy on variadic param additions are we covered if we don't do it now?

> maybe a WithLogger option for network errors if we ever wanted to add debug statements or something. I don't really have a good idea in mind.

For most cases I could see a logger possibly being helpful, but I think we should avoid putting a logger inside our error structs. Thanks for brainstorming on it.

> With our updated policy on variadic param additions are we covered if we don't do it now?

The policy only allows us to avoid the deprecation process, but either way we're covered until 1.0.

Based on @dmitryax's source idea, I'm back to thinking Options would be a good idea.

consumer/consumererror/README.md

consumer/consumererror/networkerrors_test.go

consumer/consumererror/networkerrors.go

linux-foundation-easycla · 2024-05-17T14:39:38Z

The committers listed above are authorized under a signed CLA.

✅ login: evan-bradley / name: Evan Bradley (b0184d5, 803a660, 1cf3f8c, 17ee677, f0bbc7d, 85306c0, fd211f5, 917dab8, 1c74780, 70c81ba, 47d0e5b, 5ce13e8, eeeca28, d309cd9, ceed22d, 2082dfe, 1407649, b308a4b, 8d73200, 55a0835, 5cd4b66, 752e522, 94a7950, ba28301, ecabe12, 107ab02, 35eae2f, 738f162)
✅ login: codeboten / name: Alex Boten (226cb97, 1cb2bf1)

evan-bradley · 2024-05-17T14:53:38Z

@codeboten Any idea what's up with the CLA? I used the "batch suggestions" feature to group the suggestions together. I can rebase the PR and redo those changes if I messed something up.

consumer/consumererror/networkerrors.go

Co-authored-by: Pablo Baeyens <pbaeyens31+github@gmail.com>

bogdandrutu · 2024-05-23T13:56:51Z

consumer/consumererror/networkerrors.go

+// between the status codes supported by each transport if necessary.
+//
+// It should be created with NewHTTPStatus or NewGRPCStatus.
+type NetworkError struct {


Why have one struct for both instead of two? If you want two structs, you can use generics to avoid duplicating code.

I used one struct so users can call errors.As(err, &NetworkError{}) and get the error back. We want only a single type because receivers want status codes for their particular transport regardless of what transport the exporter uses. I think putting methods on an error type will be better for usability than having functions in the consumererror package, since we can do a single check to determine whether we have a NetworkError type rather than having to check each time we call a consumererror function.

I agree with sticking with one error type to represent these types of errors, as it works best with the errors package and simplifies what components consuming the errors need to do.

If we had separate error types for gRPC and HTTP, then any component would have to check both types to see what underlying transport error is being conveyed. Encapsulating the underlying source in a single struct simplifies that.

Multiple structs that implement an interface also wouldn't work because the interface wouldn't work with errors.As.

consumer/consumererror/partial.go

consumer/consumererror/networkerrors.go

bogdandrutu · 2024-05-23T14:10:34Z

consumer/consumererror/signalerrors.go

+// e.g. the duration before sending should be retried.
+type RetryOption func(err *retryableCommon)
+
+func WithRetryDelay(delay time.Duration) RetryOption {


Should we instead have a standalone RetryDelay error?

We should probably keep a single error for a few reasons:

- We agreed that upstream components should use the copy of data returned by the downstream component, so delays without data would be inconsistent with that.
- Upstream components can get both the data and retry information out of the error a bit more easily.
- These two concepts are pretty closely related; if you want to retry data, you likely also care about how long to wait before retrying it.

mx-psi · 2024-06-05T08:25:46Z

~~I will merge this by end of week unless there are further comments~~ (edit: no, see #9041 (comment))

evan-bradley · 2024-06-07T02:50:25Z

> I will merge this by end of week unless there are further comments

@mx-psi Thanks for keeping an eye on this. I need to make one more change to this to add "error source" metadata to the transport errors, and I also told @dmitryax I would wait for his review. I'd like to hold off until next week if that's okay.

mx-psi · 2024-06-07T08:18:58Z

Sure!

djaglowski reviewed Dec 5, 2023

consumer/consumererror/README.md

bogdandrutu reviewed Dec 6, 2023

consumer/consumererror/consumererror.go Outdated

consumer/consumererror/consumererror.go Outdated

consumer/consumererror/consumererror.go Outdated

atoulme mentioned this pull request Dec 19, 2023

Stabilize module consumer #9046

Open

8 tasks

mx-psi added this to the `go.opentelemetry.io/collector/component` 1.0 milestone Dec 20, 2023

evan-bradley force-pushed the issue-7047 branch 2 times, most recently from 41e95d2 to d417f44 Compare January 8, 2024 20:40

evan-bradley mentioned this pull request Jan 10, 2024

OTLP PartialSuccess responses should not be interpreted as errors, items should count as "rejected" in pipeline metrics #9243

Open

jmacd mentioned this pull request Jan 10, 2024

Treat PartialSuccess as Success #9260

Merged

mx-psi modified the milestones: `go.opentelemetry.io/collector/component` 1.0, `go.opentelemetry.io/collector/consumer` 1.0 Jan 24, 2024

mx-psi reviewed Jan 29, 2024

consumer/consumererror/consumererror.go Outdated

consumer/consumererror/consumererror.go Outdated

consumer/consumererror/partial.go

mx-psi added the needed-for-1.0 and release:required-for-ga (Must be resolved before GA release) labels and removed the needed-for-1.0 label Feb 7, 2024

github-actions bot added the Stale label Feb 22, 2024

evan-bradley added 3 commits March 5, 2024 23:32

Add extensible consumererror.Error type

fd211f5

Improve memory allocations

85306c0

Experiment with an errors.Join-based implementation

2082dfe

evan-bradley force-pushed the issue-7047 branch from d417f44 to 2082dfe Compare March 6, 2024 04:32

Address PR feedback

917dab8

evan-bradley removed the Stale label Mar 6, 2024

github-actions bot added the Stale label Mar 21, 2024

evan-bradley removed the Stale label Apr 2, 2024

codeboten and others added 3 commits April 3, 2024 08:17

Merge branch 'main' into issue-7047

226cb97

Merge branch 'main' into issue-7047

1cb2bf1

Updates

8d73200

evan-bradley mentioned this pull request Apr 29, 2024

[chore] Allow sometimes skipping deprecation process when adding variadic arguments #10041

Merged

evan-bradley added 2 commits May 13, 2024 11:28

Merge remote-tracking branch 'upstream/main' into issue-7047

5cd4b66

Add breaking changelog entry for retryable errors

70c81ba

TylerHelmuth reviewed May 13, 2024

consumer/consumererror/statuserrors.go Outdated

Rename status errors to network errors

eeeca28

TylerHelmuth reviewed May 15, 2024

codeboten reviewed May 15, 2024

codeboten reviewed May 17, 2024

consumer/consumererror/networkerrors.go Outdated

consumer/consumererror/networkerrors.go Outdated

consumer/consumererror/networkerrors.go Outdated

TylerHelmuth approved these changes May 17, 2024

Apply suggestions from code review

1c74780

evan-bradley force-pushed the issue-7047 branch from 1dab1b3 to 1c74780 Compare May 17, 2024 20:03

mx-psi approved these changes May 17, 2024

consumer/consumererror/networkerrors.go Outdated

evan-bradley and others added 2 commits May 17, 2024 16:06

Fix naming

752e522

Update consumer/consumererror/networkerrors.go

738f162

Co-authored-by: Pablo Baeyens <pbaeyens31+github@gmail.com>

codeboten approved these changes May 21, 2024

Merge branch 'main' into issue-7047

ba28301

bogdandrutu reviewed May 23, 2024

evan-bradley added 3 commits May 29, 2024 10:12

Merge remote-tracking branch 'upstream/main' into issue-7047

ecabe12

Address PR feedback

35eae2f

Format godoc comments

107ab02

mx-psi requested a review from bogdandrutu June 5, 2024 08:25

atoulme approved these changes Jun 6, 2024

evan-bradley requested a review from dmitryax June 7, 2024 02:48
