[grouping] Consider storing metadata on GroupHash records #70454

Open
lobsterkatie opened this issue May 7, 2024 · 0 comments

lobsterkatie commented May 7, 2024

It's come up a couple of times that it'd be great to have more information about how a particular GroupHash record came to be and/or came to be in its current state. Some candidates for what we could store:

  • Grouping config version (we'd have to decide whether it's the first version to calculate that hash value or the latest version)
  • If Seer is involved, the fact that the hash is represented in the Seer DB, Seer model version, parent hash (if any) - useful for display in the Grouping Info section of the issue details page, and for debugging the grouping of a given event.
  • If the issue's been merged, maybe the id of the Activity record of the merge, so we can get more data from that if desired
  • Possibly the grouping enhancements currently stored on each event (so that they don't have to be), maybe along with a dump of the stacktrace which was enhanced, and/or the full grouping info variants data (so we can know which hashes come from message vs stacktrace vs whatever)
  • Other things?
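
To make the list above more concrete, here is a rough sketch of what such a metadata record could hold. Everything in it is hypothetical: the class name, the field names, and the types are assumptions made for illustration, not an existing Sentry model or a proposed schema, and a plain dataclass stands in for whatever storage would actually be used.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class GroupHashMetadataSketch:
    """Hypothetical container for the per-hash metadata listed above."""

    # Grouping config version (first vs. latest is an open question).
    grouping_config: Optional[str] = None

    # Seer-related data, only meaningful if Seer was involved.
    in_seer_db: bool = False
    seer_model_version: Optional[str] = None
    seer_parent_hash: Optional[str] = None

    # Id of the Activity record for a merge, if the issue has been merged.
    merge_activity_id: Optional[int] = None

    # Grouping enhancements, which are currently stored on each event.
    enhancements: Optional[str] = None

    # Grouping info variants, e.g. which hashes come from message vs stacktrace.
    variants: dict[str, Any] = field(default_factory=dict)


# Example: a hash that was never sent to Seer and never merged.
# "example-config-v1" is a placeholder, not a real config name.
metadata = GroupHashMetadataSketch(grouping_config="example-config-v1")
```

Making every field optional reflects that most of these values only exist in some situations (Seer involvement, a merge having happened), so a record can be created with whatever is known at the time.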
lobsterkatie added a commit that referenced this issue May 17, 2024
Depending on the state of the `projects:similarity-embeddings-metadata` and `projects:similarity-embeddings-grouping` flags, this uses the helpers added in #70999 to decide whether we should call Seer before creating a new group, make the API call if so, and then store the results and/or use them to prevent new group creation in favor of using an existing similar issue. The behavior is as follows:


| metadata  | grouping | call  | metadata in | metadata in | use Seer-matched |
|   flag    |  flag    | Seer? |   event?    |   group?    |  group, if any?  |
|-----------|----------|-------|-------------|-------------|------------------|
| off       | off      | no    | -           | -           | -                |
| on        | off      | yes   | yes *       | yes         | no               |
| on or off | on       | yes   | yes *       | only if new | yes              |

* For now, the only event with the data will be the event which triggers the Seer 
call, not subsequent events with that hash. In the long run we will probably need
to store the data on the `GroupHash` record itself. 
See #70454.


This should be enough for us to run a POC on S4S and measure the effect on grouping.
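
For readers who prefer code to tables, the flag table above can be restated as a small pure function. This is only a sketch of the decision logic as described in the commit message: the names `SeerBehavior` and `seer_behavior` and the string values for the group-metadata column are made up for illustration and are not the helpers from #70999.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeerBehavior:
    call_seer: bool
    metadata_in_event: bool       # only the event which triggers the Seer call
    metadata_in_group: str        # "never", "always", or "only_if_new"
    use_seer_matched_group: bool


def seer_behavior(metadata_flag: bool, grouping_flag: bool) -> SeerBehavior:
    """Map the two feature flags to the behavior in the table (hypothetical helper)."""
    if grouping_flag:
        # Third row: the grouping flag wins regardless of the metadata flag.
        return SeerBehavior(True, True, "only_if_new", True)
    if metadata_flag:
        # Second row: call Seer and record the results, but don't act on them.
        return SeerBehavior(True, True, "always", False)
    # First row: both flags off, so Seer is never called. The table's "-"
    # entries (not applicable) are represented here as False / "never".
    return SeerBehavior(False, False, "never", False)
```

Checking the grouping flag first mirrors the "on or off" entry in the third row: once grouping is enabled, the metadata flag no longer changes the outcome.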
cmanallen pushed a commit that referenced this issue May 21, 2024