ctor/dtor to be made always unsafe in 1.0 #159

mmastrac · 2021-09-01T21:10:54Z

This library requires you to know what you're doing, and making ctor/dtor unsafe is the right way to go. Most users are probably already using unsafe anyway, since the crate is often used to interface with C code.


asomers · 2022-12-29T20:31:59Z

I disagree. unsafe doesn't mean "this could have bugs". It only means a few specific things, like "this could access invalid memory" or "this is prone to data races". Unless running before main inherently creates a risk of accessing invalid global variables or something like that, it shouldn't require unsafe.

notgull · 2023-03-18T03:27:31Z

> Unless running before main inherently creates a risk of accessing invalid global variables or something like that, it shouldn't require unsafe.

Can't it cause this, in some environments? My knowledge of libstd is rusty at best, but I think that there are at least some global variables that can't be accessed early like this. In addition, I don't think libstd should accommodate the use case where stuff happens before main().

asomers · 2023-03-20T15:49:55Z

> Can't it cause this, in some environments? My knowledge of libstd is rusty at best, but I think that there are at least some global variables that can't be accessed early like this.

Maybe? If so that would be a good argument for being unsafe. But I don't think you should assume such a thing without finding any specific examples.

mmastrac · 2023-03-26T16:59:30Z

The big issue is that something as simple and fundamental as println! can cause UB, because there is no guarantee that Rust has correctly initialized any part of std by the time a ctor function runs.

I've been pondering whether it is possible to allow a reduced subset of code that can run without unsafe, but most uses of ctor are just calling extern "C" functions.

asomers · 2023-03-26T17:32:10Z

So what would be the safety advice to the user? "Don't use anything from the standard library?" I notice that some of the examples in the README do access the standard library.

oscartbeaumont · 2023-09-10T16:41:00Z

It's also worth noting that if you overflow the stack in a ctor, you will get a segmentation fault. I feel like being able to cause a segmentation fault in purely safe Rust is expressly against what I understand Rust's unsafety rules to say.

#[ctor::ctor]
fn foo() {
    demo();
}

fn demo() {
    demo();
}

Kixunil · 2023-10-25T08:40:57Z

@oscartbeaumont a segmentation fault is not the same thing as UB. Programs deliberately use segfaults to avoid UB (a stack guard page is one example). So your demo could be the protection working as intended, but I have no idea if it actually is.

Anyway, I think assuming unsafe is better since there really aren't any guarantees.

SteveLauC · 2023-12-08T12:48:32Z

> So what would be the safety advice to the user?

It would be really nice to have safety advice. I plan to replace Lazy/lazy_static with this crate to avoid the runtime check, but I got a memory leak (though a memory leak is not a memory-safety violation).
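For context, the runtime check mentioned above can be sketched with the standard library's OnceLock, which plays the same role as Lazy/lazy_static. This is only an illustration: the GREETING global and the greeting() accessor are hypothetical names, not anything from the ctor crate or from the commenter's code.

```rust
use std::sync::OnceLock;

// Hypothetical global used purely for illustration.
static GREETING: OnceLock<String> = OnceLock::new();

/// Returns the global, initializing it on first use.
/// Every call goes through an "is it initialized yet?" check,
/// which is the cost a pre-main initializer would try to avoid.
fn greeting() -> &'static str {
    GREETING.get_or_init(|| String::from("hello")).as_str()
}
```

Calling greeting() repeatedly returns the same string each time; only the first call runs the closure.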

simonask · 2024-03-06T11:17:00Z

I second the desire to add the requirement that functions annotated with #[ctor] must be unsafe.

The point is not that the function itself is inherently unsafe, but that a library may want to perform initialization within #[ctor] functions that must run for other safe abstractions inside main() to be sound. But because the order of ctors cannot be globally guaranteed, such abstractions would be unsafe to use in other ctors.

Hence the soundness invariant of any ctor function is at minimum that it doesn't rely on safe abstractions that require another ctor to have run. This is in line with the philosophy that "nothing happens before and after main()", in the sense that it would be nice to be able to say that anything that does happen before main() may be a prerequisite for the soundness of code inside main(). The standard library seems to be making at least somewhat similar assumptions.

This invariant would be very, very useful in conjunction with crates such as linkme.

Use case

My use case is a string interning library, where interned string "literals" are frequently present in the code. I need to guarantee that all identical strings in the program are unified before the user sees them. Without the above invariant, this is not possible to achieve without some runtime check or indirection at the point-of-use.

Ideally, I would like runtime use (i.e. within main()) to be a single load of a particular location in a linkme distributed slice, without any branches at all, or even atomics.

Example to illustrate the general idea, with many details omitted:

use std::cell::UnsafeCell;
use ctor::ctor;

// UnsafeCell is not Sync, so a static needs a wrapper that asserts it.
struct Slot(UnsafeCell<&'static str>);
unsafe impl Sync for Slot {}

#[linkme::distributed_slice]
static LOCATIONS: [Slot] = [..];

#[ctor]
unsafe fn unify() {
    // MUST RUN BEFORE ANY CALL TO sym!() IS REACHED!
    for location in LOCATIONS {
        // Unify duplicate strings in-place.
    }
}

macro_rules! sym {
    // Double braces make the expansion a block expression.
    ($string:literal) => {{
        #[linkme::distributed_slice(LOCATIONS)]
        static LOCATION: Slot = Slot(UnsafeCell::new($string));
        unsafe {
            // MUST RUN AFTER unify()!
            *LOCATION.0.get()
        }
    }};
}
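The "Unify duplicate strings in-place" step is left out of the example above. One way it could work, sketched here over a plain mutable slice rather than a linkme distributed slice, is to rewrite every slot to point at the first string with the same contents. The function name unify_slots and its signature are assumptions made for this sketch, not the commenter's implementation.

```rust
use std::collections::HashMap;

/// Rewrites every slot so that strings with equal contents share one address.
/// After this runs, comparing pointers is enough to compare strings.
fn unify_slots(slots: &mut [&'static str]) {
    // Maps string contents to the first reference seen with those contents.
    let mut canonical: HashMap<&'static str, &'static str> = HashMap::new();
    for slot in slots.iter_mut() {
        let s: &'static str = *slot;
        *slot = *canonical.entry(s).or_insert(s);
    }
}
```

The point of doing this before main() is that the pass runs exactly once, so later readers of a slot need no check of their own.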

Currently I'm solving the problem without requiring a #[ctor], and the fastest possible solution requires an indirect function call with an initial trampoline at the point of use. This is more than fast enough, but it isn't the theoretically fastest possible solution, because there is no way to introduce the invariant that calls to sym!() must not occur in other ctors.
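The "indirect function call with an initial trampoline" idea can be sketched with only the standard library. The details below are assumptions for illustration (the names GETTER, trampoline, fast, and get are hypothetical, and the real library surely differs): a global function pointer starts out aimed at a one-time setup function, which then swaps in the fast path.

```rust
use std::mem;
use std::sync::atomic::{AtomicPtr, AtomicUsize, Ordering};

type Getter = fn() -> &'static str;

// Counts how often the slow path runs (for demonstration only).
static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);

// Starts out pointing at the trampoline; after the first call it points at `fast`.
static GETTER: AtomicPtr<()> = AtomicPtr::new(trampoline as *mut ());

/// Slow path: performs one-time setup, then redirects later calls to `fast`.
fn trampoline() -> &'static str {
    INIT_CALLS.fetch_add(1, Ordering::Relaxed);
    // One-time initialization would go here.
    GETTER.store(fast as *mut (), Ordering::Release);
    fast()
}

/// Fast path: assumes setup has already happened.
fn fast() -> &'static str {
    "interned"
}

/// Point of use: one atomic load plus one indirect call, with no branch.
fn get() -> &'static str {
    let raw = GETTER.load(Ordering::Acquire);
    // SAFETY: GETTER only ever holds pointers created from `Getter` functions.
    let f: Getter = unsafe { mem::transmute::<*mut (), Getter>(raw) };
    f()
}
```

In this sketch two threads racing on the first call may both run the trampoline, so the setup must tolerate being repeated. A constructor that is guaranteed to run before any use would let get() drop the indirection entirely, which is the gap the comment describes.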

I realize that the ctor crate is not able to guarantee that static constructors installed by other means (like linking to a C++ library) uphold the same unsafety requirement, but I would think the above argument applies to any solution that adds static constructors to Rust. They become much more useful if we're allowed to rely on them for soundness in main(), at the cost of not having that soundness in other ctors.

mmastrac mentioned this issue Sep 1, 2021: #[ctor] on static unsound in presence of threads #95 (Closed)

mmastrac changed the title from "ctor/dtor to be made always unsafe in a 1.0" to "ctor/dtor to be made always unsafe in 1.0" Sep 1, 2021

mmastrac mentioned this issue Sep 1, 2021: 1.0 release plan #160 (Open, 3 tasks)

mmastrac added the 1.0 label Sep 1, 2021

mmastrac mentioned this issue Sep 1, 2021: Using atexit is unsound when loaded via dlopen/LoadLibrary #94 (Closed)

mitsuhiko mentioned this issue Jan 22, 2022: Remove syn dependency #124 (Open)
