Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix] Fix Infinite Loop in Reader's HasNext Function #1182

Merged
merged 8 commits into from Feb 28, 2024

Conversation

RobertIndie
Copy link
Member

@RobertIndie RobertIndie commented Feb 23, 2024

Fixes #1171

Motivation

If getLastMessageId continually fails, the reader.HasNext can get stuck in an infinite loop. Without any backoff, the reader would keep trying forever.

Modifications

  • Implemented a backoff policy for getLastMessageID.
  • If HasNext fails, it now returns false.

Should the reader.HasNext returned false in case of failure?

Currently, the HasNext method doesn't report errors. However, failure is still possible. For instance, if getLastMessageID repeatedly fails and hits the retry limit. An option is to keep trying forever, but this would stall all user code. This isn't user-friendly, so I rejected this solution.

Couldn't utilize the BackOffPolicy in the Reader Options

The HasNext retry mechanism requires to use of IsMaxBackoffReached for the backoff. But it isn't exposed in the BackOffPolicy interface. Introducing a new method to the BackOffPolicy would introduce breaking changes for the user backoff implementation. So, I choose not to implement it. Before we do it, we need to refine the BackOffPolicy.

Verifying this ch

This change added tests.

Does this pull request potentially affect one of the following parts:

If yes was chosen, please highlight the changes

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API: (yes / no)
  • The schema: (yes / no / don't know)
  • The default values of configurations: (yes / no)
  • The wire protocol: (yes / no)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / GoDocs / not documented)
  • If a feature is not applicable for documentation, explain why?
  • If a feature is not documented yet in this PR, please create a followup issue for adding the documentation

@RobertIndie RobertIndie self-assigned this Feb 23, 2024
@RobertIndie RobertIndie marked this pull request as ready for review February 23, 2024 10:22
Copy link
Contributor

@BewareMyPower BewareMyPower left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A better idea might be modifying the readNext API.

     HasNext() (bool, error)

According to the module version numbering,

Automatic pseudo-version number
v0.x.x: Signals that the module is still in development and unstable. This release carries no backward compatibility or stability guarantees.

@RobertIndie
Copy link
Member Author

A better idea might be modifying the readNext API.

     HasNext() (bool, error)

This would require changes in users' code. I'm considering cherry-picking this PR to branch-0.12. Although the 'Module version numbering' doesn't clearly state the guarantee for the in-development minor version like 0.X.Y, I think It's better that we don't introduce any user code changes required in 0.12.1.

Both these options introduce the behavior change. While the current PR's approach isn't ideal, it solves the problem without requiring users to change their code.

I could start a discussion to design a better API for the next major version and for the 1.0.0.

pulsar/reader_test.go Outdated Show resolved Hide resolved
pulsar/reader_test.go Outdated Show resolved Hide resolved
pulsar/reader_test.go Outdated Show resolved Hide resolved
pulsar/consumer_partition.go Outdated Show resolved Hide resolved
@RobertIndie
Copy link
Member Author

@BewareMyPower After reevaluating, I've concluded that we shouldn't check for maxBackoff in the backoff policy. Similar to the Java client, the maximum timeout should be regulated by the ClientOptions.OperationTimeout.

I have also chosen a new way to simulate the failure of getting the last message ID. It would make the test more stable.

And also, I found that there are some issues with the BackoffPolicy. I tracked it in another issue: #1187.

@BewareMyPower
Copy link
Contributor

the maximum timeout should be regulated by the ClientOptions.OperationTimeout.

Yes, that's right.

@BewareMyPower BewareMyPower merged commit 88a8d85 into apache:master Feb 28, 2024
8 checks passed
RobertIndie added a commit that referenced this pull request Feb 29, 2024
Fixes #1171

### Motivation

If `getLastMessageId` continually fails, the reader.HasNext can get stuck in an infinite loop. Without any backoff, the reader would keep trying forever.

### Modifications

- Implemented a backoff policy for `getLastMessageID`.
- If HasNext fails, it now returns false.

#### Should the reader.HasNext returned `false` in case of failure?

Currently, the `HasNext` method doesn't report errors. However, failure is still possible. For instance, if `getLastMessageID` repeatedly fails and hits the retry limit. An option is to keep trying forever, but this would stall all user code. This isn't user-friendly, so I rejected this solution.

#### Couldn't utilize the BackOffPolicy in the Reader Options

The `HasNext` retry mechanism requires to use of `IsMaxBackoffReached` for the backoff. But it isn't exposed in the `BackOffPolicy` interface. Introducing a new method to the `BackOffPolicy` would introduce breaking changes for the user backoff implementation. So, I choose not to implement it. Before we do it, we need to refine the BackOffPolicy.

(cherry picked from commit 88a8d85)
@shadygrove
Copy link

Thank you for the fix!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reader HasNext() enters infinite loop
3 participants