Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: New range types for Edition 2024 #3550

Merged
merged 10 commits into from May 20, 2024
Merged

Conversation

pitaj
Copy link
Contributor

@pitaj pitaj commented Dec 28, 2023

Change the range operators a..b, a.., and a..=b to resolve to new types ops::range::Range, ops::range::RangeFrom, and ops::range::RangeInclusive in Edition 2024. These new types will not implement Iterator, instead implementing Copy and IntoIterator.

Closes #2848

Pre-RFC discussion on IRLO

🖥️ Rendered

Analysis of impact on crates in the wild

Tracking:

@clarfonthey
Copy link
Contributor

I personally wouldn't call the old range types "legacy" ranges because, if the new ranges impl IntoIterator, then it makes sense for their IntoIter to be the old range types.

@VitWW
Copy link

VitWW commented Dec 28, 2023

I'm huge fan of backward compatibility.
As alternative I've propose to add additional types of ranges and suffixes to ranges.

Now: Range, RangeFrom, RangeTo, RangeFull, RangeInclusive, RangeToInclusive (all with Iterator).
We add also: IntoRange, IntoRangeFrom, IntoRangeTo, IntoRangeFull, IntoRangeInclusive, IntoRangeToInclusive (all with IntoIterator)

let range1 = 0..5;
//  range1 : Range<usize>

let range2 = 0..#rit 5;
//  range2 : Range<usize>

let range3 = 0..#rii 5;
//  range3 : IntoRange<usize>

let range4 : Range<usize> = 0..5;

let range5 : IntoRange<usize> = 0..5;

UPD: where #rit and #rii are "type" suffixes - Range with ITerator (#rit) and Range with IntoIterator (#rii). And default type is #rit

@SOF3

This comment was marked as resolved.

text/3550-new-range.md Outdated Show resolved Hide resolved
@ehuss ehuss added T-lang Relevant to the language team, which will review and decide on the RFC. A-edition-2024 Area: The 2024 edition labels Dec 28, 2023
@pitaj
Copy link
Contributor Author

pitaj commented Dec 29, 2023

I personally wouldn't call the old range types "legacy" ranges because, if the new ranges impl IntoIterator, then it makes sense for their IntoIter to be the old range types.

This is addressed in the RFC under Use legacy range types as the iterators for the new range types.

@pitaj pitaj changed the title New range types for Edition 2024 RFC: New range types for Edition 2024 Dec 29, 2023
@ahicks92
Copy link

I like this. I see one problem, however. Any crate which provides a container that can be sliced which is not updated will require &container[(a..b).into()].

This could be solved with impl<T: Index<LegacyRange>> Index for T, or by adding an implicit conversion. I think such is necessary in this case because, even if the user knows what's going on, .into() on indices is really ugly.

Maybe it's also worth a mention that--without some sort of blanket impl--all of stdlib is going to need to implement Index/IndexMut for both.

@programmerjake
Copy link
Member

I like this. I see one problem, however. Any crate which provides a container that can be sliced which is not updated will require &container[(a..b).into()].

This could be solved with impl<T: Index<LegacyRange>> Index for T, or by adding an implicit conversion.

I think it would have to be an implicit conversion since impl<T: ?Sized + Index<LegacyRange>> Index<Range> for T overlaps with existing impl<T: Index<I>, I> Index<I> for Wrapper<T>

@ahicks92
Copy link

What is Wrapper<T> in this context? Is that a reference to all containers, a specific internal type, or a public type I somehow don't know about?

@programmerjake
Copy link
Member

programmerjake commented Dec 29, 2023

What is Wrapper<T> in this context? Is that a reference to all containers, a specific internal type, or a public type I somehow don't know about?

just some user-defined type with a type parameter T, e.g.:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=3b971b6ac64759fe8c48d874d5442579

pub struct Wrapper<T>(T);

impl<T: Index<I>, I> Index<I> for Wrapper<T> {
    type Output = T::Output;
    fn index(&self, idx: I) -> &Self::Output {
        self.0.index(idx)
    }
}

@ahicks92
Copy link

@programmerjake
Lol yeah. I think the best part of this discussion is that I wrote one of those this week. You're right though.

Could the compiler use the fact that it is the compiler to use specialization here? I dunno what the state of that is these days but it might be better than an implicit conversion. I hate blanket impls in stdlib, but it's a close race with how much I hate implicit conversions.

@pitaj
Copy link
Contributor Author

pitaj commented Dec 30, 2023

@ahicks92 thanks for bringing this up, I hadn't considered the indexing case.

I think many common cases will be covered by SliceIndex, but there will be collections that will need updates to support the new range types.

I will update the migration and drawbacks sections to cover this.

I would not support adding a blanket impl, specialization, or implicit conversion even if they were possible. I think we should take a different approach:

  • stabilize the new types as soon as possible
  • allowing crates to add Index impls before the syntax switch

@ahicks92
Copy link

Well, I do a lot of mathematical code as it is, and even as Rust is today Rust is very verbose a lot of the time for such cases. That's not really a complaint about Rust as such, But I don't want to deal with:

c[(a..b).into()].copy_from_slice(c2[(a..b).into()])

Where c is some collection. I can't find a crate right now but ironically the example that came to mind is ndarray (which sadly uses a different type for slicing and thus isn't applicable) which made me realize that this is actually more general for the case wherein one might be trying to perform operations with ranges where the thing going on is a method call.

The worst practical case of slicing with this proposal would be &c[(a as usize..b as usize).into()], because often one must cast the endpoints of the range.

crates.io is pretty big and pretty full of unmaintained stuff. I think "let crates start migrating ASAP" is a good point but it doesn't answer what to do about anything older that's not maintained or anything that doesn't notice this is coming in time. I don't know how much of a problem that would be, and I'm not sure how one would quantify it either. But that might be a good thing to try to do if it can be done.

@pitaj
Copy link
Contributor Author

pitaj commented Dec 30, 2023

I will just mention that (a..b).into() probably won't work unless the only index type is Range - otherwise it will probably fail with a type inference error. We may want to consider adding an inherent method (to_legacy or something) to avoid that issue.

@pitaj
Copy link
Contributor Author

pitaj commented Dec 30, 2023

I can't find a crate right now but ironically the example that came to mind is ndarray (which sadly uses a different type for slicing and thus isn't applicable)

In std, the only affected impls would be:

  • impl ops::Index<ops::Range<usize>> for String (and the respective IndexMut)
  • impl ops::Index<ops::RangeFrom<usize>> for String (and the respective IndexMut)
  • impl ops::Index<ops::RangeInclusive<usize>> for String (and the respective IndexMut)
  • impl ops::Index<ops::RangeFrom<usize>> for CStr

The Index impls for String should probably transition to using SliceIndex like str does. BTreeMap, HashMap, and VecDeque don't support indexing by range at all, and CString and OsString only support RangeFull. BTreeMap::range uses RangeBounds, so no issue there.

So I'd say the only serious one there is CStr, which doesn't seem too bad. I am curious to see how many 3rd-party collections will be affected. I'm not sure how to go about researching that.

@ahicks92
Copy link

I am thinking about it like this:

  • This RFC wants to replace range types in large part because the new ones can be Copy, thus avoiding .clone().
  • But to do so, it instead exchanges .clone() for .into().

I understand that the other motivation is about having ranges inside other types, but leaving that aside the unstated premise here seems to me to be that we are claiming that working around lack of Copy today is more ubiquitous than needing to convert to legacy types in Rust 2024. I could believe that is true, my use cases aren't even close to all possible use cases, but not without proof.

I don't consider stdlib to be interesting. From a language perspective solutions up to and including hacking the compiler for stdlib backward compatibility are feasible. Solving it is important and interesting but the solution space is large enough there that there is certainly a solution and the question isn't "if", it's "which".

@pitaj
Copy link
Contributor Author

pitaj commented Dec 30, 2023

My thoughts:

  • We need actual data about this. Right now we don't know if this will be an actual issue in practice. The evaluation of std was only an example.
  • If the issue is widespread, then I agree we should consider doing something more involved.
  • On the other hand, updating to the new edition is entirely optional. You and others can choose to stay on 2021 if you feel conversion is overly burdensome.

@ahicks92
Copy link

I don't think we can. One either picks 2024 and converts for 2021 crates, or picks 2021 and converts for 2024 crates. Right? So if this is an issue you lose out either way. Also stay on the old edition doesn't work because eventually the new edition gets something you need.

But as you've already stated data is a good idea. I'm willing to argue contingent on this being widespread all day long but the entire thing depends on if it's widespread or not.

If we don't have one someone should really figure out how to run a query over all of crates.io, probably via docs.rs, to be like select * where a_range_type in signature.

@pitaj
Copy link
Contributor Author

pitaj commented Dec 31, 2023

I don't think we can. One either picks 2024 and converts for 2021 crates, or picks 2021 and converts for 2024 crates. Right?

Not really. For the indexing issue, they already support legacy range types and they can just add impls for the new types. Functions that take a specific range type can use impl Into<Range<_>> to support all editions, with no need for conversion on usage.

The only case where conversion is unavoidable is if you're using the new range types on code expecting old range types. All new code can and should be written to accept both.

Essentially, the issue is forwards compatibility, not backwards compatibility.

@ahicks92
Copy link

I doubt anyone is going to cover both. That's not a big ask but you then have to teach newbies to the language that due to legacy reasons they'd better.

As far as I know, every other edition has not required supporting the older edition in order to depend on crates supporting the newer one. Perhaps it has and I've missed it. But if not, this is establishing the precedent.

@traviscross
Copy link
Contributor

We discussed this in the lang triage call today, including with @m-ou-se who was present. The discussion probably defies short summary, but the general feeling was that we want this RFC to add an open item for whether the migration cost makes this doable (in this edition or in Rust 2027), and then, as long as we take seriously that open question, we could accept the RFC and take the next steps required to better understand that feasibility. Niko expressed this is a difficult question to answer in the abstract, that we may just need to see how bad it is, and that:

I'd like libs-api to stabilize these types because they're part of a long term plan on the lang side. Otherwise there's a chicken-and-egg problem here. So even if we're not sure this is Rust 2024 or Rust 2027, it should be reasonable for us to stabilize these types.

There was concern expressed about rushing in these types and not having time to get the details right before stabilization. Everyone seemed to agree these should not be rushed in that way.

@the8472
Copy link
Member

the8472 commented Apr 10, 2024

The Migrating proposes to use impl Into<...> as a fix for a function that takes Ranges by ownership. But I'm not seeing anything that deals with methods that take or return &mut Range.

@programmerjake
Copy link
Member

The Migrating proposes to use impl Into<...> as a fix for a function that takes Ranges by ownership. But I'm not seeing anything that deals with methods that take or return &mut Range.

for RangeFrom and Range, but not RangeInclusive, we can use repr shenanigans to guarantee the old and new ranges have the same layout so you can convert &OldRange to &NewRange and vis versa losslessly.

@m-ou-se
Copy link
Member

m-ou-se commented Apr 16, 2024

I think a way forward could be to add the new range types as unstable to the standard library, so we can start on the detailed design of those types and their API. I don't think those types should be stabilized before we know what will happen with the related lang changes, but adding an unstable std::range module for now to work on something more concrete seems like it would be useful at this point.

(And perhaps there are good arguments for adding these types regardless of what happens with .. syntax. If there are, that might be useful to put in a libs-api ACP.)

For this RFC, I'd be happy to resolve my rfcbot concern if an unresolved question is added on whether the proposed migration is not too disruptive, and how to properly document and execute the ecosystem migration, to make sure that is taken into consideration before stabilization.

@m-ou-se
Copy link
Member

m-ou-se commented Apr 16, 2024

We briefly discussed this in the libs-api meeting. We're happy to see a PR for adding std::range with the new range types as unstable. Feel free to open a libs-api tracking issue for the new range types.

@vallentin
Copy link
Contributor

implement IntoIterator directly (when Idx: Step)

If ranges are overhauled, should Step also be stabilized? Currently Ranges using a custom type is limited, given that Step is unstable. So Range<Foo> always require implementing a separate Iterator, e.g. struct FooRangeIter(Range<Foo>) with impl Iterator for FooRangeIter.

For instance chrono's iter_days() / NaiveDateDaysIterator could effectively simply be impl Step for NaiveDate.

Related: rust-lang/rust#42168

@Amanieu Amanieu removed the I-libs-api-nominated Indicates that an issue has been nominated for prioritizing at the next libs-api team meeting. label Apr 23, 2024
@nikomatsakis
Copy link
Contributor

For this RFC, I'd be happy to resolve my rfcbot concern if an unresolved question is added on whether the proposed migration is not too disruptive, and how to properly document and execute the ecosystem migration, to make sure that is taken into consideration before stabilization.

Let's do this then =)

@BurntSushi
Copy link
Member

BurntSushi commented Apr 24, 2024

A minor thing the RFC doesn't cover is a migration path for APIs that return legacy ranges. For example, regex::Match::range. It's probably pretty uncommon, so I expect the answer is either "issue a breaking change release moving to the new range type" or "introduce a new API returning the new range type, and eventually deprecate the legacy range API." But since both legacy and new ranges are named Range, coming up with a name for a new API to exist concurrently with the legacy API is awkward. I like Match::span personally, but it's non-ideal.

Anyway, I agree we should move forward with a caveat about backing this out if the disruption is too great.

@pitaj
Copy link
Contributor Author

pitaj commented Apr 29, 2024

Status update:

I was hoping to get back to this over the weekend but unfortunately didn't have time. I will try to get to it this week.

@m-ou-se
Copy link
Member

m-ou-se commented Apr 30, 2024

@rfcbot resolve No clean migration path.

@rfcbot rfcbot added final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. and removed proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. labels Apr 30, 2024
@rfcbot
Copy link
Collaborator

rfcbot commented Apr 30, 2024

🔔 This is now entering its final comment period, as per the review above. 🔔

@traviscross traviscross removed the I-lang-nominated Indicates that an issue has been nominated for prioritizing at the next lang team meeting. label May 1, 2024
@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this RFC. to-announce and removed final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. labels May 10, 2024
@rfcbot
Copy link
Collaborator

rfcbot commented May 10, 2024

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This will be merged soon.

The FCP for RFC 3550 completed.  Let's prepare it to be merged.
@traviscross traviscross merged commit 73dde93 into rust-lang:master May 20, 2024
@traviscross
Copy link
Contributor

The teams have accepted this RFC and it has now been merged.

Thanks to @pitaj for pushing this work forward, and thanks to all those who reviewed and contributed helpful feedback.

For further updates, follow the tracking issue:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-edition-2024 Area: The 2024 edition disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this RFC. I-types-nominated Indicates that an issue has been nominated for prioritizing at the next types team meeting. T-lang Relevant to the language team, which will review and decide on the RFC. T-libs-api Relevant to the library API team, which will review and decide on the RFC. to-announce
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Impl Copy for Range