docs: fixup docs of `tree_fold1` #787

RobWalt · 2023-10-20T19:14:14Z

I was just reading through the docs when I noticed something odd. It's probably just a typo:

There is a recommendation to use reduce instead of tree_fold1 when f is associative. Since tree_fold1 won't really work as expected (unless you know what you're doing) in non-associative cases, this would imply that we don't recommend the use of tree_fold1 at all?

I just took a guess and adjusted that part of the docs to what it most likely was supposed to be. Feel free to close this PR if I'm completely wrong here 🙈

Philippe-Cholet · 2023-10-20T19:24:25Z

@scottmcm I believe you wrote Itertools::tree_fold1, can you shed some light here?

scottmcm · 2023-10-20T19:43:20Z

> Since tree_fold1 won't really work as expected (unless you know what you're doing) in non-associative cases

I would say the different behaviour for non-associative cases is entirely the point of tree_fold1 (and of rfold, for that matter).

If you have something truly associative, like u32::wrapping_add, then there's no point in running the materially-more-complicated tree_fold1 since .reduce(u32::wrapping_add) will do the same thing and probably do it much faster.

But even with something only slightly non-associative like f32::add, that's where the good of the non-linear folding structure comes in. Simple demo: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=b6a1c419176ec7b02fb6af3539708da0
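For readers without access to the playground link, here is a minimal sketch of the same idea in plain Rust. `tree_reduce` is a hypothetical helper written for this illustration, not the itertools implementation: it combines adjacent pairs level by level, which gives a balanced tree for power-of-two lengths and may group differently from `tree_fold1` otherwise. The input values are also chosen for illustration.

```rust
/// Hypothetical stand-in for `Itertools::tree_fold1` (an assumption for this
/// sketch): combines adjacent pairs level by level until one value is left.
fn tree_reduce<T>(mut items: Vec<T>, mut f: impl FnMut(T, T) -> T) -> Option<T> {
    while items.len() > 1 {
        let mut next = Vec::with_capacity((items.len() + 1) / 2);
        let mut iter = items.into_iter();
        while let Some(a) = iter.next() {
            match iter.next() {
                Some(b) => next.push(f(a, b)),
                None => next.push(a), // an odd element carries over unchanged
            }
        }
        items = next;
    }
    items.pop()
}

/// Returns (left-to-right sum, tree-shaped sum) of the same four values.
fn sums() -> (f32, f32) {
    // Above 2^24 = 16_777_216, f32 can only represent even integers.
    let values = vec![16_777_216.0_f32, 1.0, 1.0, 1.0];
    // ((2^24 + 1) + 1) + 1: every single `+ 1` is rounded away.
    let linear = values.iter().copied().reduce(|a, b| a + b).unwrap();
    // (2^24 + 1) + (1 + 1): the two small values are combined first.
    let tree = tree_reduce(values, |a, b| a + b).unwrap();
    (linear, tree)
}
```

Here the left-to-right sum stays at 16777216 while the tree-shaped sum reaches 16777218. The exact sum, 16777219, is not representable in f32, so the tree result is closer but still not exact.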

See also the string format bench (itertools/benches/tree_fold1.rs, line 57 in 7a6c1ef: `pub fn string_format(c: &mut Criterion) {`) for a place where the non-associativity of performance of the operation really matters.

RobWalt · 2023-10-21T08:37:58Z

Ok yeah, I see what you mean. Thanks for the explanation. I think I missed some things before. Let's take a closer look at all the possible cases:

associative:
- f is simple: e.g. cases like u32::wrapping_add. I guess you wanted to say that this case is so simple that it's better to use the more simplistic reduce since it's probably better for optimizations, code gen etc, right? Some evidence that this is true would be cool
  - I can try to benchmark this a little to shine some light on it
- f is complex, e.g. the format case. Here we recommend the use of tree_fold1 since it reduces the number of required operations from n to log2(n). I thought this was the main use case of tree_fold1, and this is where my confusion and "improvements" of the docs come from
  - As mentioned, this is imo the main use case and should be highlighted. Writing
    
    If f is associative, prefer the normal [Iterator::reduce] instead.
    
    seems misleading.
non-associative:
- in general: e.g. the i32::sub case from the docs. tree_fold1 will most likely produce different results than reduce in an "unintuitive" way and this is why it shouldn't be recommended in this case
- in special cases: e.g. your example with f32::add. If you really know what you're doing and if you can estimate the data you're dealing with then it might make sense that you use it in this case. I think your example is a bit constructed in a way since the result of the f32::add operations also depends on the order of the elements in the array and not only on the folding function used. So yeah, in a way tree_fold1 can produce a more exact result, but you need to ensure that the data is ordered so that it works out. Take this example where we shuffle the array first https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=31ea48fe708bf3778358c48a930963b5

Both of the non-associative cases should be neglected in my opinion.
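To make the general non-associative case concrete, here is a minimal sketch with subtraction on four integers. `tree_reduce` is a hypothetical helper written for this illustration and is not the itertools implementation; it pairs adjacent elements level by level, which matches a balanced tree for power-of-two lengths such as this one.

```rust
/// Hypothetical stand-in for `Itertools::tree_fold1` (an assumption for this
/// sketch): combines adjacent pairs level by level until one value is left.
fn tree_reduce<T>(mut items: Vec<T>, mut f: impl FnMut(T, T) -> T) -> Option<T> {
    while items.len() > 1 {
        let mut next = Vec::with_capacity((items.len() + 1) / 2);
        let mut iter = items.into_iter();
        while let Some(a) = iter.next() {
            match iter.next() {
                Some(b) => next.push(f(a, b)),
                None => next.push(a), // an odd element carries over unchanged
            }
        }
        items = next;
    }
    items.pop()
}

/// Returns (left-to-right result, tree-shaped result) for subtraction.
fn both_folds() -> (i32, i32) {
    let values = vec![1_i32, 2, 3, 4];
    // ((1 - 2) - 3) - 4
    let linear = values.iter().copied().reduce(|a, b| a - b).unwrap();
    // (1 - 2) - (3 - 4)
    let tree = tree_reduce(values, |a, b| a - b).unwrap();
    (linear, tree)
}
```

The left-to-right grouping gives -8 and the tree grouping gives 0, so the two methods are not interchangeable once f stops being associative.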

scottmcm · 2023-10-22T05:57:42Z

One particular place where this can be really useful is in building binary trees. Compare the outputs in https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=f6a580a37e0a01b1c94767680dd444ca -- using tree_fold1 gives an O(log n)-depth tree vs the O(n)-depth tree from reduce, and the lower depth tends to be better, since it means things like a recursive DFS will be much happier.

I don't know a great way to fit that example into a doc comment, though :/
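A compact version of that comparison might look like the sketch below. The `Tree` type, `join`, `depth`, and `tree_reduce` are all hypothetical names made up for this illustration; `tree_reduce` stands in for `tree_fold1` and pairs adjacent elements level by level, so its tree shape is only guaranteed to match a balanced tree for power-of-two lengths.

```rust
/// Hypothetical stand-in for `Itertools::tree_fold1` (an assumption for this
/// sketch): combines adjacent pairs level by level until one value is left.
fn tree_reduce<T>(mut items: Vec<T>, mut f: impl FnMut(T, T) -> T) -> Option<T> {
    while items.len() > 1 {
        let mut next = Vec::with_capacity((items.len() + 1) / 2);
        let mut iter = items.into_iter();
        while let Some(a) = iter.next() {
            match iter.next() {
                Some(b) => next.push(f(a, b)),
                None => next.push(a), // an odd element carries over unchanged
            }
        }
        items = next;
    }
    items.pop()
}

/// Illustrative binary tree type (not from the playground link).
enum Tree {
    Leaf(u32),
    Node(Box<Tree>, Box<Tree>),
}

/// The folding function: joins two subtrees under a new parent node.
fn join(left: Tree, right: Tree) -> Tree {
    Tree::Node(Box::new(left), Box::new(right))
}

/// Depth of the tree, counting a single leaf as depth 0.
fn depth(tree: &Tree) -> usize {
    match tree {
        Tree::Leaf(_) => 0,
        Tree::Node(l, r) => 1 + depth(l).max(depth(r)),
    }
}

/// Returns (depth via `reduce`, depth via `tree_reduce`) for n >= 1 leaves.
fn depths(n: u32) -> (usize, usize) {
    let linear = (0..n).map(Tree::Leaf).reduce(join).unwrap();
    let balanced = tree_reduce((0..n).map(Tree::Leaf).collect(), join).unwrap();
    (depth(&linear), depth(&balanced))
}
```

With 8 leaves the `reduce` tree has depth 7 and the tree-shaped fold has depth 3; with 1024 leaves the depths are 1023 and 10.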

RobWalt · 2023-10-23T06:23:06Z

Just did a very basic test on compiler explorer and both reduce and tree_fold1 generate the exact same machine code for the simple case of u32::add as can be seen here, which implies that there is no benefit of using one over the other.

To see this you need to scroll down all the way on the right hand side window. The generated assembly code is color coded the same way as the related rust code.

https://rust.godbolt.org/z/b7K5EenEE

scottmcm · 2023-10-23T07:33:48Z

Never use + over a Range as your test, because when LLVM notices that's what you're doing it optimizes it to the closed-form solution.

I suggest comparing summing up a slice, where you'll see, for example, that the normal reduce gets vectorized but the tree_fold does not: https://rust.godbolt.org/z/6hb13ae4T

They're very, very different things once you're doing something that's not just a constant.

RobWalt · 2023-10-23T07:52:16Z

I guess my understanding of all of this is just lacking then. I'll just close the issue. Thanks a lot for the extra explanation, that was helpful! And sorry for the needless discussion

scottmcm · 2023-10-23T07:53:50Z

No, the discussion is good! It'd be great if you could find a nice way to reflect this difference into the documentation, to help the next person who's wondering why the method exists.

RobWalt · 2023-10-24T06:26:12Z

Thanks for the positive feedback :) Ok, then let me know which parts I should put the highlights on! The different use cases of tree_fold1 and reduce based on the complexity of f?

RobWalt · 2023-10-25T19:03:38Z

@scottmcm I adjusted the docs once again. Does this come closer to an acceptable explanation? 😅

RobWalt · 2023-11-01T15:41:43Z

bump to not forget this issue as it's also not that big of an issue 😅

TomFryersMidsummer · 2024-01-12T15:00:49Z

Some of this doesn't seem right to me, and the docs now seem a bit confusing, if not inaccurate. Specifically:

> if f is non-trivial like format!, you should use tree_fold1 since it reduces the number of operations from O(n) to O(ln(n))

Both reduce and tree_fold1 call f n − 1 times, which is O(n).

I think what this is trying to get at is that if f is O(n), the iterator elements are of similar sizes*, and size(f(x, y)) = size(x) + size(y), then reduce(f) is O(n²) and tree_fold1 is O(n log n).

format! does satisfy these conditions, but it may not be the best example. format!("{x}{y}") and format!("{x}foo{y}") are the only obvious non-trivial associative cases, and these are concat and join.

* Not necessary, but sufficient. reduce gets faster if you put the largest items at the end.
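The cost argument above can be checked with a small counting sketch. It concatenates n one-byte strings and records both how often f is called and how many bytes f writes in total, using output length as a rough proxy for the work format! does. `tree_reduce` and `concat_cost` are hypothetical helpers written for this illustration; `tree_reduce` stands in for `tree_fold1` and is not the itertools implementation.

```rust
/// Hypothetical stand-in for `Itertools::tree_fold1` (an assumption for this
/// sketch): combines adjacent pairs level by level until one value is left.
fn tree_reduce<T>(mut items: Vec<T>, mut f: impl FnMut(T, T) -> T) -> Option<T> {
    while items.len() > 1 {
        let mut next = Vec::with_capacity((items.len() + 1) / 2);
        let mut iter = items.into_iter();
        while let Some(a) = iter.next() {
            match iter.next() {
                Some(b) => next.push(f(a, b)),
                None => next.push(a), // an odd element carries over unchanged
            }
        }
        items = next;
    }
    items.pop()
}

/// Concatenates `n` one-byte strings and returns
/// (number of calls to f, total bytes written by all calls to f).
fn concat_cost(n: usize, use_tree: bool) -> (usize, usize) {
    let items: Vec<String> = (0..n).map(|_| "x".to_string()).collect();
    let mut calls = 0;
    let mut bytes = 0;
    let mut f = |a: String, b: String| {
        calls += 1;
        let joined = format!("{}{}", a, b);
        bytes += joined.len(); // each call copies both inputs into a new String
        joined
    };
    let _total = if use_tree {
        tree_reduce(items, &mut f)
    } else {
        items.into_iter().reduce(&mut f)
    };
    (calls, bytes)
}
```

For n = 8, both strategies call f 7 times, but the left-to-right fold writes 35 bytes and the tree-shaped fold writes 24. For n = 1024 the totals are 524799 and 10240 bytes, in line with the O(n²) versus O(n log n) estimate.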

RobWalt · 2024-01-16T08:37:06Z

I'm sorry if I made the docs less correct. I can look into this when I find the time again. Otherwise feel free to open a PR yourself

chore: fixup docs of tree_fold1

6f8986c

RobWalt closed this Oct 23, 2023

RobWalt reopened this Oct 24, 2023

chore: adjust docs to reflect discussion in the PR

ad590eb

jswrenn approved these changes Nov 13, 2023

jswrenn added this pull request to the merge queue Nov 13, 2023

jswrenn added this to the next milestone Nov 13, 2023

Merged via the queue into rust-itertools:master with commit b76172b Nov 13, 2023
8 checks passed

RobWalt deleted the chore/improve-docs branch November 13, 2023 21:16

jswrenn mentioned this pull request Nov 14, 2023

prepare v0.12.0 release #802

Merged
