
Feature Proposal: Task Dumps #5457

Open
jswrenn opened this issue Feb 14, 2023 · 5 comments
Labels
A-tokio Area: The main tokio crate C-feature-request Category: A feature request. M-taskdump --cfg tokio_taskdump

Comments


jswrenn commented Feb 14, 2023

Major Changes:

  • 2023-04-18 Significantly revised implementation details. Added "Progress" section.
  • 2023-04-20 Replaced "Progress" with link to dedicated tracking issue.

In Java, thread dumps are an invaluable tool for debugging a stuck application in production. When a user requests a thread dump from a JVM, they receive text including the following information about each thread managed by the JVM:

  • The thread name.
  • The thread ID.
  • The thread's priority.
  • The thread's execution status: e.g., RUNNABLE, WAITING, or BLOCKED.
    This information is useful for debugging deadlocks.
  • The thread's call stack.

Thread dumps provide a starting point for debugging deadlocks. A programmer can capture a thread dump of a deadlocked program, identify the BLOCKED threads, and inspect their call stacks to determine which resources those threads are blocked on.

I propose bringing an analogue to Tokio: "task dumps". A task dump is a comprehensive snapshot of the state of a Tokio application. This proposal will allow Tokio services to briefly pause their work, reflect upon their internal runtime state, and comprehensively report that state to their operator.

Guide-Level Explanation

The task dump API is gated behind --cfg tokio_unstable --cfg tokio_taskdump. A task dump is captured by invoking Handle::dump. For example, executing this program:

use std::hint::black_box;

#[inline(never)]
async fn a() {
    black_box(b()).await
}

#[inline(never)]
async fn b() {
    black_box(c()).await
}

#[inline(never)]
async fn c() {
    black_box(tokio::task::yield_now()).await
}

#[tokio::main(flavor = "current_thread")]
async fn main() {
    tokio::spawn(a());
    tokio::spawn(b());
    tokio::spawn(c());

    let handle = tokio::runtime::Handle::current();
    let dump = handle.dump();

    for (i, task) in dump.tasks().iter().enumerate() {
        let trace = task.trace();
        println!("task {i} trace:");
        println!("{trace}");
    }
}

will produce output like:

task 0 trace:
╼ dump::a::{{closure}} at /home/ubuntu/projects/tokio/examples/dump.rs:7:19
  └╼ dump::b::{{closure}} at /home/ubuntu/projects/tokio/examples/dump.rs:12:19
     └╼ dump::c::{{closure}} at /home/ubuntu/projects/tokio/examples/dump.rs:17:40
        └╼ tokio::task::yield_now::yield_now::{{closure}} at /home/ubuntu/projects/tokio/tokio/src/task/yield_now.rs:72:32
           └╼ <tokio::task::yield_now::yield_now::{{closure}}::YieldNow as core::future::future::Future>::poll at /home/ubuntu/projects/tokio/tokio/src/task/yield_now.rs:50:13
task 1 trace:
╼ dump::b::{{closure}} at /home/ubuntu/projects/tokio/examples/dump.rs:12:19
  └╼ dump::c::{{closure}} at /home/ubuntu/projects/tokio/examples/dump.rs:17:40
     └╼ tokio::task::yield_now::yield_now::{{closure}} at /home/ubuntu/projects/tokio/tokio/src/task/yield_now.rs:72:32
        └╼ <tokio::task::yield_now::yield_now::{{closure}}::YieldNow as core::future::future::Future>::poll at /home/ubuntu/projects/tokio/tokio/src/task/yield_now.rs:50:13
task 2 trace:
╼ dump::c::{{closure}} at /home/ubuntu/projects/tokio/examples/dump.rs:17:40
  └╼ tokio::task::yield_now::yield_now::{{closure}} at /home/ubuntu/projects/tokio/tokio/src/task/yield_now.rs:72:32
     └╼ <tokio::task::yield_now::yield_now::{{closure}}::YieldNow as core::future::future::Future>::poll at /home/ubuntu/projects/tokio/tokio/src/task/yield_now.rs:50:13

Progress

See #5638

Related Reading

jswrenn added the A-tokio (Area: The main tokio crate) and C-feature-request (Category: A feature request) labels on Feb 14, 2023
@sfackler (Contributor)

This is very exciting!

I'd personally find a Display implementation that forks a child, pauses threads, etc. to be pretty strange. I thought the code example at the top of the issue just had a typo until I got down to the appendix! Having a pub fn dump_tasks(&self) -> TaskDump method on Handle seems much more readable while still allowing sufficient future flexibility.

It may be worth considering whether the ptrace side of things should be in scope, at least initially - the dump process could e.g. alternatively wait for a bit for well-behaved running tasks to yield and users could rely on a standard thread dumper (e.g. rstack-self or minidumper) to handle thread dumps of blocking code or badly behaved tasks.
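Sketched under illustrative names (TaskInfo, TaskDump, and a free-standing dump_tasks stand-in for the proposed Handle method — none of these are real tokio APIs), the method-based design suggested above might look roughly like:

```rust
use std::fmt;

/// Illustrative snapshot of one task; hypothetical, not a real tokio type.
pub struct TaskInfo {
    pub id: u64,
    pub trace: String,
}

/// An owned, immutable snapshot: it holds no locks on runtime internals.
pub struct TaskDump {
    tasks: Vec<TaskInfo>,
}

impl TaskDump {
    /// Programmatic access to the dumped tasks, leaving room for richer
    /// methods later.
    pub fn tasks(&self) -> &[TaskInfo] {
        &self.tasks
    }
}

impl fmt::Display for TaskDump {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for task in &self.tasks {
            writeln!(f, "task {} trace:", task.id)?;
            writeln!(f, "{}", task.trace)?;
        }
        Ok(())
    }
}

/// Stand-in for the proposed `Handle::dump_tasks`: copies the relevant
/// state out of the runtime, then releases it before returning.
pub fn dump_tasks() -> TaskDump {
    TaskDump {
        tasks: vec![
            TaskInfo { id: 0, trace: "╼ a::{{closure}}".to_string() },
            TaskInfo { id: 1, trace: "╼ b::{{closure}}".to_string() },
        ],
    }
}
```

The point of the sketch is that the returned value is a plain owned snapshot, so displaying it and inspecting it programmatically are both cheap and lock-free.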


jswrenn commented Feb 15, 2023

Having a pub fn dump_tasks(&self) -> TaskDump method on Handle seems much more readable while still allowing sufficient future flexibility.

This is the route I originally took, but I encountered some challenges that led me to the Display design. I'd appreciate your thoughts on these issues!

A task dump is meant to be an atomic snapshot of runtime state. In the Display approach, the steps of taking that snapshot and serializing it are combined and inseparable. In the approach you suggest, these steps are separated: the user must first call dump_tasks, then serialize TaskDump. At which of these two points is the snapshot taken?

If the snapshot is taken at the invocation of dump_tasks(), then TaskDump must either:

  1. lock the runtime's internals for as long as TaskDump is alive, or
  2. make and contain a copy of the runtime's internals

Both of these choices are expensive, and the first potentially introduces a deadlock hazard, too.

Alternatively, the snapshot could be taken when TaskDump is serialized. This approach is a little less problematic, but the UX is weird: it means that calling dump_tasks() doesn't actually create a task dump.

These various complications led me to consider whether I could simply fuse the snapshot and serialization steps together, hence my proposal for a Display implementation on Handle.
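A minimal sketch of that fused design, using illustrative stand-ins (FakeRuntime, TaskDumper — not real tokio types): the snapshot is taken inside fmt, so runtime internals are only borrowed for the duration of serialization.

```rust
use std::fmt;
use std::sync::Mutex;

/// Illustrative stand-in for the runtime's internal task state.
struct FakeRuntime {
    task_traces: Mutex<Vec<String>>,
}

/// Borrowing wrapper whose `Display` impl fuses snapshotting and
/// serialization into one step.
struct TaskDumper<'a> {
    runtime: &'a FakeRuntime,
}

impl fmt::Display for TaskDumper<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // The lock is held only for the duration of this method, so no
        // long-lived value ever pins the runtime's internals.
        let traces = self.runtime.task_traces.lock().unwrap();
        for (i, trace) in traces.iter().enumerate() {
            writeln!(f, "task {i} trace:")?;
            writeln!(f, "{trace}")?;
        }
        Ok(())
    }
}
```

Under this shape there is no intermediate TaskDump value to keep consistent: the text output is the snapshot.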

@sfackler (Contributor)

I would expect that calling dump_tasks() would take the snapshot since in the future we'd presumably want e.g. methods on TaskDump to programmatically interact with the dump.

I don't see why a task dump would inherently need to hold locks or store runtime internals - it could be as simple as just storing the JSON directly in a string or whatever for the MVP.

A Display implementation is supposed to be the canonical textual representation of a type. It seems very strange to me that the canonical textual representation of a Handle would be a JSON-formatted dump of the tasks running on the handle's associated runtime.


hds commented Feb 17, 2023

I think that it's worth stepping back and considering the possible uses of this information and how much dependency they have on the functionality that must be in tokio.

Providing an already-serialized JSON string assumes that the primary use for dumping the state of a runtime is outside of that application. This makes it harder for an application to introspect the state of its own runtime and act upon it, because the state would have to be deserialized again to be usable. Additionally, this requires adding significant dependencies (serde, serde_json) to tokio.

If instead, tokio were to provide its state as an object, that information could be used within the same application, and could also (albeit with more boilerplate) be serialized into a format suitable for export (e.g. JSON). This option has the advantage that it doesn't introduce additional dependencies.
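A rough sketch of that layering, with a hypothetical TaskState object and hand-rolled caller-side JSON (no serde anywhere; none of these names are real tokio APIs):

```rust
/// Illustrative per-task state object that tokio could hand back directly.
pub struct TaskState {
    pub id: u64,
    pub spawn_location: String,
}

/// Caller-side serialization layered on top of the object API. A real
/// application might reach for serde here, but tokio itself would carry
/// no serde/serde_json dependency under this design.
pub fn to_json(tasks: &[TaskState]) -> String {
    let entries: Vec<String> = tasks
        .iter()
        .map(|t| format!(r#"{{"id":{},"spawn_location":"{}"}}"#, t.id, t.spawn_location))
        .collect();
    format!("[{}]", entries.join(","))
}
```

The same TaskState values could equally be inspected in-process (e.g. to alarm on stuck tasks) without any serialization round-trip.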

I don't want to argue that adding new dependencies is a no-go, especially behind a feature flag. But considering the trade-offs, I believe a first approach that provides an API giving access to the important information (the runtime snapshot state) in a more flexible manner, while introducing fewer dependencies, would be advisable. Additions (such as providing a serialized state) could then be considered on top of that API.

Regarding the concern that some TaskDump would need to keep a copy of the runtime state (in my opinion, locking the runtime internals isn't a viable option): if we look at this piecemeal, an initial output (such as the metrics in #5466) would certainly occupy less memory than a serialized version. Even as more data is added, the write buffer is likely to occupy memory on the same order as an in-memory representation; if it occupies more, then the (wall clock) time to write the data out will become prohibitively expensive anyway. In a context where blocking a single task for more than 100 microseconds is generally considered too long, pausing the runtime to serialize large amounts of data (even if only sporadically) is likely to affect the behaviour of the runtime as much as what we would like to measure.


Darksonn commented Mar 3, 2023

Here are my initial thoughts on this:

  • Task Iteration: sounds reasonable.
  • Task Spawn Location: sounds reasonable.
  • Task Traces: sounds pretty intrusive. The details would need to be thought out properly to avoid the complexity spiralling out of control.
  • Worker pausing: sounds unnecessarily complicated to me. How about telling the worker to pause once the current future yields, then waiting for that to happen? Add a timeout to handle blocking tasks.
  • Display impl: I would prefer a dedicated method for this.
  • JSON: we aren't going to add a dependency on serde. We can return a proper object with the relevant information.
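The pause-once-the-current-future-yields idea, with a timeout for blocking tasks, could be sketched roughly like this. The flags and function names are illustrative; nothing here is a real tokio API.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::time::{Duration, Instant};

/// Called by a worker at each yield point: acknowledge a pending pause
/// request. (Illustrative; a real worker would then park until resumed.)
pub fn check_pause(pause_requested: &AtomicBool, paused: &AtomicBool) {
    if pause_requested.load(Ordering::Acquire) {
        paused.store(true, Ordering::Release);
    }
}

/// Called by the dumper: wait for the worker to reach a yield point, but
/// give up after `timeout` so a blocking task can't wedge the dump forever.
pub fn wait_for_pause(paused: &AtomicBool, timeout: Duration) -> bool {
    let start = Instant::now();
    while !paused.load(Ordering::Acquire) {
        if start.elapsed() > timeout {
            return false; // worker is likely stuck in blocking code; skip it
        }
        std::thread::yield_now();
    }
    true
}
```

The timeout is what keeps a badly behaved task (e.g. one stuck in synchronous I/O) from turning a dump request into a hang.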
