MetaFieldRanker: allow different options for what to do with missing metadata field #7691

Open
robpasternak opened this issue May 13, 2024 · 0 comments · May be fixed by #7700
Labels: 2.x (Related to Haystack v2.0), topic:metadata, type:feature (New feature or request)

Comments

@robpasternak (Member)

Is your feature request related to a problem? Please describe.
This is not a problem I'm facing right now, but it seems likely to be undesirable in the long run. Currently, when MetaFieldRanker sorts by the assigned metadata field, it places any documents that are missing that field at the bottom of the ranking (assuming at least one document does have the field). For example, when sorting by a date metadata field in descending order, documents without that field are treated as chronologically prior to all documents that have it. This may not be the desired behavior in every use case.

Describe the solution you'd like
I think it'd make sense to add a parameter like missing_meta: Literal["bottom", "top", "drop"] = "bottom" that determines what to do with documents that are missing the sort metadata field.

  • "bottom" (default): documents missing the metadata field are placed at the bottom when sorting by that field (the current behavior).
  • "top": documents missing the metadata field are placed at the top when sorting by that field.
  • "drop": documents missing the metadata field are dropped entirely.

I'm certainly open to alternatives, but these seemed to me to be the three most obvious options.
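
To make the three proposed options concrete, here is a minimal sketch in plain Python. It is not the MetaFieldRanker implementation: `Doc` is a hypothetical stand-in for a Haystack document, and `sort_by_meta` and its `descending` flag are illustrative names. Only the `missing_meta` parameter and its three values come from the proposal above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Literal


@dataclass
class Doc:
    """Hypothetical stand-in for a document; only `meta` matters here."""
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


def sort_by_meta(
    documents: List[Doc],
    meta_field: str,
    descending: bool = True,
    missing_meta: Literal["bottom", "top", "drop"] = "bottom",
) -> List[Doc]:
    """Sort by `meta_field`, handling documents that lack it per `missing_meta`."""
    if missing_meta not in ("bottom", "top", "drop"):
        raise ValueError(f"Unknown missing_meta value: {missing_meta!r}")

    # Split the input so that only documents with the field are compared.
    with_field = [d for d in documents if meta_field in d.meta]
    without_field = [d for d in documents if meta_field not in d.meta]

    ranked = sorted(with_field, key=lambda d: d.meta[meta_field], reverse=descending)

    if missing_meta == "drop":
        return ranked
    if missing_meta == "top":
        return without_field + ranked
    return ranked + without_field  # "bottom"


docs = [
    Doc("a", {"date": "2024-01-01"}),
    Doc("b"),  # no "date" field
    Doc("c", {"date": "2024-05-13"}),
]

for option in ("bottom", "top", "drop"):
    print(option, [d.content for d in sort_by_meta(docs, "date", missing_meta=option)])
```

In this sketch the documents without the field keep their original relative order, which is one possible choice rather than something the proposal specifies.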

Describe alternatives you've considered
I haven't seriously considered other options. As far as I can tell, the next best thing would be to introduce an additional component that does the filtering or re-ranking, so that MetaFieldRanker stays intact while these alternative behaviors remain achievable. That seems to me to be a worse option, however.
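
For comparison, the separate-step alternative could look roughly like the sketch below. It uses plain dicts instead of real Haystack objects, and `split_on_meta` is a hypothetical helper, not an existing component. The caller would rank only the documents that have the field and then decide where the rest go.

```python
from typing import Any, Dict, List, Tuple

Document = Dict[str, Any]  # hypothetical shape: {"content": ..., "meta": {...}}


def split_on_meta(
    documents: List[Document], meta_field: str
) -> Tuple[List[Document], List[Document]]:
    """Return (documents that have `meta_field`, documents that lack it)."""
    present = [d for d in documents if meta_field in d.get("meta", {})]
    missing = [d for d in documents if meta_field not in d.get("meta", {})]
    return present, missing


docs = [
    {"content": "a", "meta": {"date": "2024-01-01"}},
    {"content": "b", "meta": {}},
    {"content": "c", "meta": {"date": "2024-05-13"}},
]

present, missing = split_on_meta(docs, "date")
# `present` would be handed to the unchanged ranker; `missing` can then be
# prepended, appended, or discarded by the caller.
```

This keeps the ranker untouched, but every pipeline that wants non-default handling has to wire in the extra step itself, which is the drawback noted above.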

Additional context
I'd love to take this on myself.

robpasternak added the type:feature, topic:metadata, and 2.x labels on May 13, 2024