-
Notifications
You must be signed in to change notification settings - Fork 319
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add {min,max}_set(_by{_key)?)? functions #323
Conversation
0eb8bf3
to
65444a9
Compare
Just wondering if there is any interest in this feature? |
65444a9
to
feea621
Compare
I would find this feature useful, although wrapping in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The functionality sounds great! I appreciate that you already considered min/max and by/by_key.
Overall, I think we could try to be a bit simpler and generic in some places.
|
||
for element in it { | ||
let key = key_for(&element); | ||
if lt(&element, &result[0], &key, ¤t_key) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to replace lt
(returning bool
) by cmp
(returning Ordering
) so that we could avoid the second call to lt
in line 23?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That should work nicely.
As it is, I essentially copied the structure from the minimax
set of functions, where I see now that multiple comparisons are also made.
I'm happy to make the change, but will probably not have the time to do so in the next couple of weeks (currently on vacation).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see. That didn't occur to me when I looked at minmax
last time. I could, however, imagine that things are a bit more complicated there because it needs to look for minimum and maximum simultaneously.
Maybe I should go back and have a look at minmax
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a look at minmax
and think that it needs to do more stuff since it searches for both the minimum and the maximum. Here, I think we can use less calls to compare elements.
/// Implementation guts for `min_set`, `min_set_by`, and `min_set_by_key`. | ||
pub fn min_set_impl<I, K, F, L>(mut it: I, | ||
mut key_for: F, | ||
mut lt: L) -> Option<Vec<I::Item>> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about returning an empty Vec
instead of None
? I think it would make things a bit more clear, especially since the types do not indicate that a Some
always contains a non-empty vector.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think we could/should be a bit more generic than Vec
here? I.e. should we try to be like collect
(which returns something FromIterator<Self::Item>
and lets the caller choose)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally think that returning an Option is a good choice. In application code that I write, when using max or min functionality of sequences, an empty sequence is often an indication that something has gone awry and should be forced to be handled in another way. I imagine that in most cases, the Option-ness can be handled by the ?
-operator. Also, it matches the signature for max/min from the standard library, which returns an Option
of an Item
. I can naturally change it, but I do think that Option makes sense here.
As for why Vec and not some kind of iterator. The number of results is unknown before computing the result, and the result is unknown before going through the whole input iterator, which implies that some kind of temporary storage area is needed. Since the data is already in a Vec, I thought that the least convoluted thing was to just give the user back that. We can of course wrap the already constructed Vec
in something that exposes it like a FromIterator
, I have no real opinion here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the Vec
: I naively thought about substituting the whole Vec
even within the implementation (basically allowing to customize the kind of temporary storage), so that no conversion would be required upon returning it.
However, I see that FromIterator
alone is possibly a bit too narrow to acchieve that. Maybe any container satisfying Default + Extend
or FromIterator + Extend
could be used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the code by neccesity (unless double iteration is used) provisionally accumulates an unbounded number of result items, and discards these when a new provisionally best results is found, something more than FromIterator + Extend
would be needed AFAIK.
My guess would be that anything that also has the possibility to clear the results is no better than the interal Vec
, and would also be less clear. I'd be glad to learn I'm wrong though!
/// assert_eq!(a.iter().min_set(), Some(vec![&1, &1, &1, &1])); | ||
/// ``` | ||
/// | ||
/// The elements can be floats but no particular result is guaranteed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest, I often run into situations where I like to simply compare floats. But most of the time, I am happy that Rust tells me that f64
is not Ord
.
Iterator::min
also requires Ord
. Do you think we should stick to that requirement and require the caller to resolve the float issues?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, floats not being Ord
is annoying, but also great.
In this case, I again copied the style and wording from minmax
.
See my reply here: #323 (comment) |
The function min_set returns a Vec of all the minimum values. All variants for max and with key extraction and comparison functions are added also. Since the functions need to return an unknown number of values and the values are not known until the iterator is finished, the function needs to allocate memory. Therefore Vec is used for returning the values.
feea621
to
7a7d8c1
Compare
@zayenz Do you want to drive this to completion? (No worries if your answer is "no".) If so, could you:
|
@phimuemue I'd forgotten about this one. Yes, I would like to fix this as I think the methods are quite useful. I will take a look at updating the code, and then seeing about the changes mid next week. |
613: Minmax set (prev "Add {min,max}_set(_by{_key)?)? functions") r=jswrenn a=phimuemue This PR supersedes #323 (I did not know how I could amend to the original PR, so sorry for the "duplicate" PR here.) The original PR by `@zayenz` looked rather good. I adjusted the stuff that came up during discussion back then: * Teturn type `Vec` instead of `Option` - emptiness is sufficiently well represented by `Vec`. * Functions require `Ord` instead of `PartialOrd` - just as `Iterator::min`. * Avoid duplicate calls to `lt` by accepting a `FnMut(...)->Ordering` - seems canonical compared to the `bool`-solution. * Use internal iteration instead of a manual `for`-loop. Moreover, I simplified some bits. Co-authored-by: Mikael Zayenz Lagerkvist <zayenz@gmail.com> Co-authored-by: philipp <descpl@yahoo.de>
The function min_set returns a Vec of all the minimum values. All variants for max and with key extraction and comparison functions are added also.
Since the functions need to return an unknown number of values and the values are not known until the iterator is finished, the function needs to allocate memory. Therefore Vec is used for returning the values. From the guidelines for itertools, this also makes it unlikely to be included in the standard library.
The reason for adding these functions is that I often find situations where I want to know all the minimum values.
The naming is not optimal;
min_set
indicates that a set is returned while aVec
is returned. Another name isall_min
, but I think that is less explanatory and not as discoverable. YMMV.While an empty
Vec
could be used to handle the case of an empty iterator, I think that it is more safe to push that check into the type-system.