-
Notifications
You must be signed in to change notification settings - Fork 2.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix WebDatasets KeyError for user-defined Features when a field is missing in an example #7004
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great ! I added a small suggestion to handle the case where the missing field is an image or an audio sample.
Could you also add a small test in test_webdataset.py
?
I believe you can reuse the same code as in test_image_webdataset
and simply pass features=
to Webdataset()
and check if the examples have the extra features as None. Then we'll be good to merge !
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great ! I updated the test :) the rest of the CI failures are unrelated to this PR
Show benchmarksPyArrow==8.0.0 Show updated benchmarks!Benchmark: benchmark_array_xd.json
Benchmark: benchmark_getitem_100B.json
Benchmark: benchmark_indices_mapping.json
Benchmark: benchmark_iterating.json
Benchmark: benchmark_map_filter.json
Show updated benchmarks!Benchmark: benchmark_array_xd.json
Benchmark: benchmark_getitem_100B.json
Benchmark: benchmark_indices_mapping.json
Benchmark: benchmark_iterating.json
Benchmark: benchmark_map_filter.json
|
@lhoestq Thank you! |
…ssing in an example (#7004) * Fix KeyError bug * Add additional check Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com> * Add test for missing key handling * update test --------- Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com> Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com>
…ssing in an example (#7004) * Fix KeyError bug * Add additional check Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com> * Add test for missing key handling * update test --------- Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com> Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com>
Fixes: #6900
Not sure if this needs any addition stuff before merging