-
Notifications
You must be signed in to change notification settings - Fork 25.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix double-pausing shard snapshot #109148
Fix double-pausing shard snapshot #109148
Conversation
Closes elastic#109143
Hi @DaveCTurner, I've created a changelog YAML for you. |
Pinging @elastic/es-distributed (Team:Distributed) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I think this fixes ES-8563 as well. Could you please cross link between this PR and ES-8563? Thanks!
if (updatedState.state() == ShardState.PAUSED_FOR_NODE_REMOVAL) { | ||
// leave subsequent entries for this shard alone until this one is unpaused | ||
iterator.remove(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It still feels a bit odd that tryStartNextTaskAfterSnapshotUpdated
can init shard snapshot without checking whether this shard is already active in snapshots that come before it. I think it relied on the fact that the possible updated states sent by data nodes are all "non-active" which is true till we add PAUSED_FOR_NODE_REMOVAL
. Theoretically this could happen again in future if we add another similar ShardState. That said, I can see it would be quite a lot work to change things in tryStartNextTaskAfterSnapshotUpdated
. So I am good with this fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++ valid point, I think we can at least add an assertion to catch a future bug in this area - see 615cebf.
💚 Backport successful
|
Closes #109143