Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UB in the batch_semaphore linked list #3399

Open
Darksonn opened this issue Jan 10, 2021 · 10 comments
Open

UB in the batch_semaphore linked list #3399

Darksonn opened this issue Jan 10, 2021 · 10 comments
Labels
A-tokio Area: The main tokio crate C-bug Category: This is a bug. I-unsound 💥 A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness M-sync Module: tokio/sync

Comments

@Darksonn
Copy link
Contributor

The following code has UB:

use futures::future::FutureExt;

fn main() {
    let rt = tokio::runtime::Builder::new_current_thread()
        .build()
        .unwrap();

    let m = tokio::sync::Mutex::new("foo".to_string());

    let lock = rt.block_on(m.lock());
    let l2 = m.lock();
    tokio::pin!(l2);

    (&mut l2).now_or_never();

    drop(lock);
    rt.block_on(m.lock());
}

To see this, run it with:

MIRIFLAGS="-Zmiri-track-raw-pointers" cargo +nightly miri run

The problem was discovered by @kprotty on the discord server.

thread 1:

  • poll() creates a Pin<&mut Waiter> with the intrusive node
  • it sees theres no permits, grabs the queue lock, and pushes its intrusive node into the queue
  • poll() returns
    thread 1:
  • poll() is getting called again (maybe from select!()) and creates a Pin<&mut Waiter>
  • thread 1 gets preempted
    thread 2:
  • another thread releases some permits and wants to wake up a Waiter in the queue
  • it first grabs the queue lock, then pops thread 1's Waiter node from the queue
  • in order to pop the Node, it creates a &Node reference to it to deref some linked-list fields: https://docs.rs/tokio/1.0.1/src/tokio/util/linked_list.rs.html#114

there now exists at the same time:

  • thread 1: a Pin<&mut Waiter> which transitively implies a &mut Node in the Waiter
  • thread 2: a &Node when trying to pop it

The solution probably looks something like putting the Waiter in Acquire inside an UnsafeCell and only using raw pointers to access it, instead of mutable references.

Click to see miri failure
error: Undefined Behavior: trying to reborrow for SharedReadWrite at alloc1987, but parent tag <14056> does not have an appropriate item in the borrow stack
   --> /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/non_null.rs:212:18
    |
212 |         unsafe { &*self.as_ptr() }
    |                  ^^^^^^^^^^^^^^^ trying to reborrow for SharedReadWrite at alloc1987, but parent tag <14056> does not have an appropriate item in the borrow stack
    |
    = help: this indicates a potential bug in the program: it performed an invalid operation, but the rules it violated are still experimental
    = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
            
    = note: inside `std::ptr::NonNull::<alloc::sync::ArcInner<tokio::runtime::basic_scheduler::Shared>>::as_ref` at /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/non_null.rs:212:18
    = note: inside `std::sync::Arc::<tokio::runtime::basic_scheduler::Shared>::inner` at /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/sync.rs:1034:18
    = note: inside `<std::sync::Arc<tokio::runtime::basic_scheduler::Shared> as std::clone::Clone>::clone` at /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/sync.rs:1295:24
    = note: inside `<std::mem::ManuallyDrop<std::sync::Arc<tokio::runtime::basic_scheduler::Shared>> as std::clone::Clone>::clone` at /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/mem/manually_drop.rs:50:5
    = note: inside `tokio::util::wake::inc_ref_count::<tokio::runtime::basic_scheduler::Shared>` at /home/alice/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.0.1/src/util/wake.rs:57:38
    = note: inside `tokio::util::wake::clone_arc_raw::<tokio::runtime::basic_scheduler::Shared>` at /home/alice/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.0.1/src/util/wake.rs:65:5
    = note: inside `<std::task::Waker as std::clone::Clone>::clone` at /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/task/wake.rs:268:29
    = note: inside closure at /home/alice/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.0.1/src/sync/batch_semaphore.rs:379:31
    = note: inside `tokio::loom::std::unsafe_cell::UnsafeCell::<std::option::Option<std::task::Waker>>::with_mut::<(), [closure@tokio::sync::batch_semaphore::Semaphore::poll_acquire::{closure#0}]>` at /home/alice/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.0.1/src/loom/std/unsafe_cell.rs:14:9
    = note: inside `tokio::sync::batch_semaphore::Semaphore::poll_acquire` at /home/alice/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.0.1/src/sync/batch_semaphore.rs:370:9
    = note: inside `<tokio::sync::batch_semaphore::Acquire as futures::Future>::poll` at /home/alice/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.0.1/src/sync/batch_semaphore.rs:443:15
    = note: inside closure at /home/alice/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.0.1/src/sync/mutex.rs:299:9
    = note: inside `<std::future::from_generator::GenFuture<[static generator@tokio::sync::Mutex<std::string::String>::acquire::{closure#0} for<'r, 's, 't0> {std::future::ResumeTy, &'r tokio::sync::Mutex<std::string::String>, tokio::sync::Mutex<std::string::String>, &'s tokio::sync::batch_semaphore::Semaphore, tokio::sync::batch_semaphore::Semaphore, u32, tokio::sync::batch_semaphore::Acquire<'t0>, ()}]> as futures::Future>::poll` at /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/mod.rs:80:19
    = note: inside closure at /home/alice/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.0.1/src/sync/mutex.rs:263:9
    = note: inside `<std::future::from_generator::GenFuture<[static generator@tokio::sync::Mutex<std::string::String>::lock::{closure#0} for<'r, 's> {std::future::ResumeTy, &'r tokio::sync::Mutex<std::string::String>, impl futures::Future, ()}]> as futures::Future>::poll` at /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/mod.rs:80:19
    = note: inside `<std::pin::Pin<&mut std::future::from_generator::GenFuture<[static generator@tokio::sync::Mutex<std::string::String>::lock::{closure#0} for<'r, 's> {std::future::ResumeTy, &'r tokio::sync::Mutex<std::string::String>, impl futures::Future, ()}]>> as futures::Future>::poll` at /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/future.rs:119:9
    = note: inside closure at /home/alice/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.0.1/src/runtime/basic_scheduler.rs:184:58
    = note: inside closure at /home/alice/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.0.1/src/coop.rs:121:9
    = note: inside `std::thread::LocalKey::<std::cell::Cell<tokio::coop::Budget>>::try_with::<[closure@tokio::coop::with_budget<std::task::Poll<tokio::sync::MutexGuard<std::string::String>>, [closure@tokio::runtime::basic_scheduler::Inner<tokio::runtime::driver::Driver>::block_on<std::pin::Pin<&mut std::future::from_generator::GenFuture<[static generator@tokio::sync::Mutex<std::string::String>::lock::{closure#0} for<'r, 's> {std::future::ResumeTy, &'r tokio::sync::Mutex<std::string::String>, impl futures::Future, ()}]>>>::{closure#0}::{closure#0}]>::{closure#0}], std::task::Poll<tokio::sync::MutexGuard<std::string::String>>>` at /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/local.rs:272:16
    = note: inside `std::thread::LocalKey::<std::cell::Cell<tokio::coop::Budget>>::with::<[closure@tokio::coop::with_budget<std::task::Poll<tokio::sync::MutexGuard<std::string::String>>, [closure@tokio::runtime::basic_scheduler::Inner<tokio::runtime::driver::Driver>::block_on<std::pin::Pin<&mut std::future::from_generator::GenFuture<[static generator@tokio::sync::Mutex<std::string::String>::lock::{closure#0} for<'r, 's> {std::future::ResumeTy, &'r tokio::sync::Mutex<std::string::String>, impl futures::Future, ()}]>>>::{closure#0}::{closure#0}]>::{closure#0}], std::task::Poll<tokio::sync::MutexGuard<std::string::String>>>` at /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/local.rs:248:9
    = note: inside `tokio::coop::with_budget::<std::task::Poll<tokio::sync::MutexGuard<std::string::String>>, [closure@tokio::runtime::basic_scheduler::Inner<tokio::runtime::driver::Driver>::block_on<std::pin::Pin<&mut std::future::from_generator::GenFuture<[static generator@tokio::sync::Mutex<std::string::String>::lock::{closure#0} for<'r, 's> {std::future::ResumeTy, &'r tokio::sync::Mutex<std::string::String>, impl futures::Future, ()}]>>>::{closure#0}::{closure#0}]>` at /home/alice/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.0.1/src/coop.rs:114:5
    = note: inside `tokio::coop::budget::<std::task::Poll<tokio::sync::MutexGuard<std::string::String>>, [closure@tokio::runtime::basic_scheduler::Inner<tokio::runtime::driver::Driver>::block_on<std::pin::Pin<&mut std::future::from_generator::GenFuture<[static generator@tokio::sync::Mutex<std::string::String>::lock::{closure#0} for<'r, 's> {std::future::ResumeTy, &'r tokio::sync::Mutex<std::string::String>, impl futures::Future, ()}]>>>::{closure#0}::{closure#0}]>` at /home/alice/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.0.1/src/coop.rs:98:5
    = note: inside closure at /home/alice/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.0.1/src/runtime/basic_scheduler.rs:184:35
    = note: inside closure at /home/alice/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.0.1/src/runtime/basic_scheduler.rs:266:29
    = note: inside `tokio::macros::scoped_tls::ScopedKey::<tokio::runtime::basic_scheduler::Context>::set::<[closure@tokio::runtime::basic_scheduler::enter<[closure@tokio::runtime::basic_scheduler::Inner<tokio::runtime::driver::Driver>::block_on<std::pin::Pin<&mut std::future::from_generator::GenFuture<[static generator@tokio::sync::Mutex<std::string::String>::lock::{closure#0} for<'r, 's> {std::future::ResumeTy, &'r tokio::sync::Mutex<std::string::String>, impl futures::Future, ()}]>>>::{closure#0}], tokio::sync::MutexGuard<std::string::String>, tokio::runtime::driver::Driver>::{closure#0}], tokio::sync::MutexGuard<std::string::String>>` at /home/alice/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.0.1/src/macros/scoped_tls.rs:61:9
    = note: inside `tokio::runtime::basic_scheduler::enter::<[closure@tokio::runtime::basic_scheduler::Inner<tokio::runtime::driver::Driver>::block_on<std::pin::Pin<&mut std::future::from_generator::GenFuture<[static generator@tokio::sync::Mutex<std::string::String>::lock::{closure#0} for<'r, 's> {std::future::ResumeTy, &'r tokio::sync::Mutex<std::string::String>, impl futures::Future, ()}]>>>::{closure#0}], tokio::sync::MutexGuard<std::string::String>, tokio::runtime::driver::Driver>` at /home/alice/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.0.1/src/runtime/basic_scheduler.rs:266:5
    = note: inside `tokio::runtime::basic_scheduler::Inner::<tokio::runtime::driver::Driver>::block_on::<std::pin::Pin<&mut std::future::from_generator::GenFuture<[static generator@tokio::sync::Mutex<std::string::String>::lock::{closure#0} for<'r, 's> {std::future::ResumeTy, &'r tokio::sync::Mutex<std::string::String>, impl futures::Future, ()}]>>>` at /home/alice/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.0.1/src/runtime/basic_scheduler.rs:176:9
    = note: inside `tokio::runtime::basic_scheduler::InnerGuard::<tokio::runtime::driver::Driver>::block_on::<std::pin::Pin<&mut std::future::from_generator::GenFuture<[static generator@tokio::sync::Mutex<std::string::String>::lock::{closure#0} for<'r, 's> {std::future::ResumeTy, &'r tokio::sync::Mutex<std::string::String>, impl futures::Future, ()}]>>>` at /home/alice/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.0.1/src/runtime/basic_scheduler.rs:405:9
    = note: inside `tokio::runtime::basic_scheduler::BasicScheduler::<tokio::runtime::driver::Driver>::block_on::<std::future::from_generator::GenFuture<[static generator@tokio::sync::Mutex<std::string::String>::lock::{closure#0} for<'r, 's> {std::future::ResumeTy, &'r tokio::sync::Mutex<std::string::String>, impl futures::Future, ()}]>>` at /home/alice/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.0.1/src/runtime/basic_scheduler.rs:136:24
    = note: inside `tokio::runtime::Runtime::block_on::<std::future::from_generator::GenFuture<[static generator@tokio::sync::Mutex<std::string::String>::lock::{closure#0} for<'r, 's> {std::future::ResumeTy, &'r tokio::sync::Mutex<std::string::String>, impl futures::Future, ()}]>>` at /home/alice/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.0.1/src/runtime/mod.rs:450:46
note: inside `main` at src/main.rs:17:5
   --> src/main.rs:17:5
    |
17  |     rt.block_on(m.lock());
    |     ^^^^^^^^^^^^^^^^^^^^^
    = note: inside `<fn() as std::ops::FnOnce<()>>::call_once - shim(fn())` at /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:227:5
    = note: inside `std::sys_common::backtrace::__rust_begin_short_backtrace::<fn(), ()>` at /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys_common/backtrace.rs:125:18
    = note: inside closure at /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:66:18
    = note: inside `std::ops::function::impls::<impl std::ops::FnOnce<()> for &dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe>::call_once` at /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:259:13
    = note: inside `std::panicking::r#try::do_call::<&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe, i32>` at /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:379:40
    = note: inside `std::panicking::r#try::<i32, &dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe>` at /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:343:19
    = note: inside `std::panic::catch_unwind::<&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe, i32>` at /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:396:14
    = note: inside `std::rt::lang_start_internal` at /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:51:25
    = note: inside `std::rt::lang_start::<()>` at /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:65:5

error: aborting due to previous error
@Darksonn Darksonn added C-bug Category: This is a bug. A-tokio Area: The main tokio crate M-sync Module: tokio/sync I-unsound 💥 A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness labels Jan 10, 2021
@hawkw
Copy link
Member

hawkw commented Jan 10, 2021

yeah, i think you're right that the waiter should be in an UnsafeCell. this should actually be pretty straightforward to fix, i think. good catch, @kprotty!

@blasrodri
Copy link
Contributor

yeah, i think you're right that the waiter should be in an UnsafeCell. this should actually be pretty straightforward to fix, i think. good catch, @kprotty!

Tried following your guide, and still fails. What am I missing?

diff --git a/tokio/src/sync/batch_semaphore.rs b/tokio/src/sync/batch_semaphore.rs
index 803f2a18..49d562bc 100644
--- a/tokio/src/sync/batch_semaphore.rs
+++ b/tokio/src/sync/batch_semaphore.rs
@@ -20,6 +20,7 @@ use crate::loom::sync::atomic::AtomicUsize;
 use crate::loom::sync::{Mutex, MutexGuard};
 use crate::util::linked_list::{self, LinkedList};
 
+use std::cell::UnsafeCell as UnsafeCellSTD;
 use std::future::Future;
 use std::marker::PhantomPinned;
 use std::pin::Pin;
@@ -65,7 +66,7 @@ pub enum TryAcquireError {
 pub struct AcquireError(());
 
 pub(crate) struct Acquire<'a> {
-    node: Waiter,
+    node: UnsafeCellSTD<Waiter>,
     semaphore: &'a Semaphore,
     num_permits: u32,
     queued: bool,
@@ -458,7 +459,7 @@ impl Future for Acquire<'_> {
 impl<'a> Acquire<'a> {
     fn new(semaphore: &'a Semaphore, num_permits: u32) -> Self {
         Self {
-            node: Waiter::new(num_permits),
+            node: UnsafeCellSTD::new(Waiter::new(num_permits)),
             semaphore,
             num_permits,
             queued: false,
@@ -476,7 +477,7 @@ impl<'a> Acquire<'a> {
 
             let this = self.get_unchecked_mut();
             (
-                Pin::new_unchecked(&mut this.node),
+                Pin::new_unchecked(this.node.get_mut()),
                 &this.semaphore,
                 this.num_permits,
                 &mut this.queued,
@@ -499,11 +500,11 @@ impl Drop for Acquire<'_> {
         let mut waiters = self.semaphore.waiters.lock();
 
         // remove the entry from the list
-        let node = NonNull::from(&mut self.node);
+        let node = NonNull::from(self.node.get_mut());
         // Safety: we have locked the wait list.
         unsafe { waiters.queue.remove(node) };
 
-        let acquired_permits = self.num_permits as usize - self.node.state.load(Acquire);
+        let acquired_permits = self.num_permits as usize - self.node.get_mut().state.load(Acquire);
         if acquired_permits > 0 {
             self.semaphore.add_permits_locked(acquired_permits, waiters);
         }

@Darksonn
Copy link
Contributor Author

Further analysis has showed that it is not so simple.

@kprotty
Copy link

kprotty commented Jan 15, 2021

@blasrodri The usage of get_mut on the UnsafeCell there looks like an issue. It becomes more vivid where an atomic load, which implies shared references, is being done on a mutable reference.

Worst case (just speculation) this might be more fundamental of a change. With the existence of Pin<&mut Self> from the Future trait, and the need to have other threads performing updates concurrently on the memory of something in that Pin<&mut Self>, you would need to have that thread-shared memory say to the compiler that "a mutable reference and immutable references to this is not UB" which UnsafeCell doesn't seem to currently denote. This would also imply that things like futures-intrusive are unsound as well.

Fwiw, this pattern also appears in sync::Notify iirc so it may not be only this Future type to address. One way to handle this is by heap allocating the shared state in order to keep the "only shared references" property in the face of Pin<&mut Self>. While this addresses it, its a poor compromise to introduce runtime overhead simply due to not enough language support.

@Darksonn
Copy link
Contributor Author

Relevant thread: Are sound self-referential structs possible without boxing?

The answer appears to be "No, it's not possible".

@Darksonn
Copy link
Contributor Author

Darksonn commented Feb 4, 2022

Miri should stop complaining about this once #4397 is merged.

@kprotty
Copy link

kprotty commented Feb 5, 2022

FWIW, It's possible to fix this soundness issue while still using intrusive Futures. You just need to store the intrusive Node outside the Future that's being polled, then have said Future reference it. This avoids the Pin<&mut Self> transitively creating a mutable reference to the intrusive Node. In reality however, the Node would still be stored in the async fn compiler's generated Future which would still have this issue of transitive pin mut ref when polled but at least its outside user written code and now a fundamental issue with Rust async.

@Darksonn
Copy link
Contributor Author

Darksonn commented Feb 6, 2022

In general, we are waiting for a resolution to the problem where async Rust itself is unsound. I don't think what you suggested changes anything regarding what miri will accept.

@udoprog

This comment was marked as resolved.

@Darksonn

This comment was marked as resolved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tokio Area: The main tokio crate C-bug Category: This is a bug. I-unsound 💥 A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness M-sync Module: tokio/sync
Projects
None yet
Development

No branches or pull requests

5 participants