Akka.Cluster.Sharding: potential wire format problem when upgrading from v1.4 to v1.5 with state-store-mode=ddata and remember-entities=on #6704

Closed
Aaronontheweb opened this issue May 2, 2023 · 6 comments · Fixed by #6775

Comments

@Aaronontheweb
Member

Version Information
Version of Akka.NET? v1.4 --> v1.5
Which Akka.NET Modules? Akka.Cluster.Sharding

Describe the bug

When upgrading an existing Akka.NET cluster that has remember-entities=on and state-store-mode=ddata, the following error can appear on the v1.4 nodes:

(screenshot of the error message; image not preserved)

This appears to be a wire format issue - the user who reported this issue hadn't materially changed any of their code at all.

To Reproduce

We need to reproduce this to see if this is an oversight in our management of the wire format for v1.5 for this specific group of users.

Steps to reproduce the behavior:

  1. Clone https://github.com/petabridge/akkadotnet-code-samples/tree/master/src/clustering/sharding-sqlserver
  2. Configure to run in DData mode using v1.4 binaries with remember-entities=on - populate some data (https://github.com/petabridge/akkadotnet-code-samples/blob/8ff65149ee4cbdbcadd6f5c39ff2716a343d9f61/src/clustering/sharding-sqlserver/SqlSharding.Host/Program.cs#L74-L76)
  3. Run at least 3 nodes
  4. Upgrade to v1.5 - only deploy a single v1.5 node with the other two remaining at v1.4
  5. See if the error occurs.
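
For reference, the sharding settings named in step 2 would look roughly like this in HOCON. This is a minimal sketch that assumes the settings are supplied as raw HOCON; the linked sample may configure them in code instead, and the rest of the cluster configuration is omitted.

```hocon
akka.cluster.sharding {
  # keep shard coordinator state in Akka.DistributedData rather than Akka.Persistence
  state-store-mode = ddata

  # remember which entities were alive so they are restarted automatically
  remember-entities = on
}
```

The same values need to be in effect on the v1.4 nodes and on the single v1.5 node, so that the only variable in the test is the Akka.NET version.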

Expected behavior

Upgrade should occur without any errors.

Actual behavior

To be determined.

@Arkatufus
Contributor

Arkatufus commented May 2, 2023

There is a big wire format incompatibility between 1.4 and 1.5 that was introduced in #5857

Akka.Cluster.Sharding.PersistentShardCoordinator+State class was removed and replaced with Akka.Cluster.Sharding.Internal.EventSourcedRememberEntitiesShardStore+State. What is worse is that the 2 classes are not compatible with each other, so it is impossible to adapt one message to another.

Akka.Cluster.Sharding.PersistentShardCoordinator+State class was removed and replaced with Akka.Cluster.Sharding.ShardCoordinator+CoordinatorState. We might be able to adapt the old messages to the new one, maybe introduce a backward compatible flag setting?

This class is serialized inside the Gossip DData envelope and fails to deserialize.

@Aaronontheweb
Member Author

@Arkatufus if we were going to try to patch this for users upgrading, without also breaking it again for users who are happily running on v1.5, is there a viable path for doing so?

@Arkatufus
Contributor

We need to add a new State class inside PersistentShardCoordinator whose properties match those of the old State class. The State class inside EventSourcedRememberEntitiesShardStore has to stay for backward compatibility with 1.5.0 - 1.5.4.

@Arkatufus
Contributor

We would use this new State class from now on to preserve backward compatibility with 1.4

@Aaronontheweb
Member Author

> We would use this new State class from now on to preserve backward compatibility with 1.4

I think that's worth considering - although we haven't had that many reports of this issue in the wild just yet (aside from the original one that prompted this issue.)

@Arkatufus
Contributor

I'll do a local build test and see if this fixes things
