-
Notifications
You must be signed in to change notification settings - Fork 515
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fs.gs.inputstream.fadvise defaults to SEQUENTIAL #5132
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #5132 +/- ##
=======================================
Coverage 63.34% 63.35%
=======================================
Files 291 291
Lines 10840 10840
Branches 753 753
=======================================
+ Hits 6867 6868 +1
+ Misses 3973 3972 -1 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cool! IIRC, at the time our understanding was that random access was optimal for Parquet specifically (rather than for MapReduce jobs generally) -- in order to construct a multi-field projection from columnar storage you'd be jumping around quite a lot in the row group to fetch all the required fields.
Out of curiosity, I did a quick Dataflow benchmark using different projection %s/fadvise settings:
Projection Size | Fadvise | VCPU Hours |
---|---|---|
2% | Random | 45.846 vCPU hr |
2% | Sequential | 28.45 vCPU hr |
100% | Random | 47.814 vCPU hr |
100% | Sequential | 40.909 vCPU hr |
So it looks like Sequential helps a LOT with partial projection (which require less random seeking required to construct record classes), and slightly less with full schema projections (but still better than Random!). Nice!
(note: I didn't test SplittableDoFn reads where the file has multiple row groups -- that might be a case where Random wins -- maybe we can note that in the docs?)
Thanks for the tests. FYI all my tests were based on SplittableDoFn. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks build is failing making the site because the PR is triggered from fork. Ok to merge
@clairemcginty @RustedBones what do you guys think about |
I tested out AUTO in my quick benchmark yesterday and it seemed to be slightly worse for partial projections (31.3 VCPU hours compared to SEQUENTIAL's 28.4) and much worse for full projections (it got to 160.9 vCPU hr compared to SEQUENTIAL's 40.9 before I gave up and killed the job...) |
wow that doesn't sound good at all! Was it worst than |
AUTO outperformed RANDOM for partial projection, but was much worse for full projection |
Based on this article linked in our docs, Scio jobs should have
fs.gs.inputstream.fadvise
set toSEQUENTIAL
:This makes sense as the batch jobs tend to read the records sequentially as opposed to random look ups, and Scio jobs fit that category very well. So to me seems like the documentation has been misinterpreted when the default was set to
RANDOM
.Furthermore in my benchmarks I always saw a slight better performance for fadvise=SEQUENTIAL and rarely the opposite.