Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enforces ordering on concurrent subscription to FluxRefCount #3517

Merged
merged 4 commits into from
Jul 4, 2023

Conversation

OlegDokuka
Copy link
Contributor

@OlegDokuka OlegDokuka commented Jun 26, 2023

Late subscribe may observe zero events when multiple inners subscribe to FluxRefCount simultaneously so the one which does not invoke the connect method subscribes to the FluxPublish too late (e.g. the one which calls the connect subscribe faster and invoke the connect method earlier than the others):

 subscribe 1           subscriber 2
     |                      |
count + 1 (1)               |
     |                 count + 1 (2)
     |                      |
     |                  subscribe
     |                      |
     |                   connect
     |                  onComplete
  subscribe
  (hanging)

to solve the problem subscribe should happen before the count is incremented so we have strong ordering guaranteed by the synchronize block

fixes #3487

Signed-off-by: OlegDokuka <odokuka@vmware.com>
Late subscribe may observe zero events when multiple inners subscribes to FluxRefCount simultaneously so the one which does not invoke connect method subscribes to the FluxPublish too late (e.g. the one which call connect subscribe faster and invoke connect method earlier)
```
 subscribe 1           subscriber 2
     |                      |
   count + 1 (1)            |
     |                 count + 1 (2)
     |                      |
     |                  subscribe
     |                      |
     |                   connect
     |                  onComplete
  subscribe
  (hanging)
```

to solve the problem subscribe should happen before count is incremented so we have strong ordering
Signed-off-by: OlegDokuka <odokuka@vmware.com>
@OlegDokuka OlegDokuka requested a review from a team as a code owner June 26, 2023 17:35
@OlegDokuka OlegDokuka added the type/bug A general bug label Jun 26, 2023
@OlegDokuka OlegDokuka added this to the 3.4.31 milestone Jun 26, 2023
Signed-off-by: OlegDokuka <odokuka@vmware.com>
@OlegDokuka OlegDokuka changed the title adds stress test and initial minimal fix enforces ordering on concurrent subscription to FluxRefCount Jun 26, 2023
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps adding corresponding cases for Flux.error(Throwable) in the fusion/hide/delay variants to also validate how error is handled could be worthwhile?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed


RefCountInner(CoreSubscriber<? super T> actual, RefCountMonitor<T> connection) {
volatile int state; //used to guard against doubly terminating subscribers (e.g. double cancel)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose the comment is no longer relevant, as the state represents more situations and before it was just about termination/cancellation, while now it also represents the readiness (monitor set). Perhaps we can remove it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@gowa
Copy link

gowa commented Jul 3, 2023

@OlegDokuka , if you wish to know whether the tests I wrote pass with your fix, I can confirm they do! :)

Thanks!

Signed-off-by: OlegDokuka <odokuka@vmware.com>
@OlegDokuka OlegDokuka requested a review from chemicL July 4, 2023 06:13
@OlegDokuka OlegDokuka merged commit fdc801e into 3.4.x Jul 4, 2023
7 checks passed
@reactorbot
Copy link

@OlegDokuka this PR seems to have been merged on a maintenance branch, please ensure the change is merge-forwarded to intermediate maintenance branches and up to main 🙇

@OlegDokuka OlegDokuka deleted the 3.4.x-3487-2 branch July 4, 2023 10:52
OlegDokuka added a commit that referenced this pull request Jul 4, 2023
Signed-off-by: OlegDokuka <odokuka@vmware.com>
OlegDokuka added a commit that referenced this pull request Jul 4, 2023
Signed-off-by: OlegDokuka <odokuka@vmware.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/bug A general bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants