feat: Migrate from CRF++ to Ingredient Parser (a Python package) #5061
Merged
Kuchenpirat
merged 11 commits into
mealie-recipes:mealie-next
from
michael-genson:feat/upgrade-nlp-parser
Feb 28, 2025
+368
−593
Kuchenpirat
reviewed
Feb 28, 2025
Looks good, just the one comment.
Kuchenpirat
approved these changes
Feb 28, 2025
LGTM 👍
Might have to see if I can add some German to the package :)
What this PR does / why we need it:
Replaces the CRF++ dependency with ingredient-parser (a Python library). Largely inspired by @hay-kot's recipes-api, which is used for recipinned.
While the results are pretty comparable to the old CRF++ library, this gives several added benefits:
Which issue(s) this PR fixes:
Fixes #4960
Fixes #4993
Fixes #5121
Special notes for your reviewer:
Not everything is 1:1 (e.g. "ground black pepper" is now parsed entirely as the food, rather than "black pepper" as the food and "ground" as the note), but improvements and regressions seem to be split about 50/50, and both are minor. At least now we can easily add more tests, since the parser works in the dev container without a hassle.
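To make the "ground black pepper" difference concrete, here is a minimal sketch of the two outcomes. `ParsedIngredient` is a hypothetical stand-in dataclass, not Mealie's or ingredient-parser's actual result type, and the field names `food` and `note` are assumptions for illustration only. The values are hard-coded from the description above rather than produced by either parser.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParsedIngredient:
    """Hypothetical stand-in for a parser result (not a real Mealie type)."""

    food: str
    note: str = ""


def describe_difference(old: ParsedIngredient, new: ParsedIngredient) -> list[str]:
    """Return the names of the fields whose values differ between two results."""
    return [
        name
        for name in ("food", "note")
        if getattr(old, name) != getattr(new, name)
    ]


# Input text: "ground black pepper"
# Old CRF++ behaviour as described here: "ground" ends up in the note.
old_result = ParsedIngredient(food="black pepper", note="ground")
# New behaviour as described here: the whole phrase is treated as the food.
new_result = ParsedIngredient(food="ground black pepper")

changed = describe_difference(old_result, new_result)
print(changed)
```

A field-by-field comparison like this is one way the improvements and regressions could be tallied across a set of example inputs.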
This package theoretically supports additional languages, but that doesn't work out of the box. It's something we can look into in the future. For now it really only works with English (same as the old parser).
Also had to fix a totally unrelated stale-data issue because it was making tests fail. No clue why this is happening now and wasn't happening before.
Testing
Threw lots of examples at it and got comparable outputs vs the old parser.