chore(dist/features): ship `tracing` and friends by default #3803

rami3l · 2024-05-02T11:24:27Z

Part of #3790.

Rationale

Currently, helping out the Rustup team by enabling local tracing is quite a tedious process (esp. for community contributors), requiring rebuilding Rustup from the exact commit with an extra feature, otel:

rustup/doc/dev-guide/src/tracing.md

Lines 13 to 36 in 54dd3d0

    
    ## Usage

    The normal [OTLP environment
    variables](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/protocol/exporter.md)
    can be used to customise its behaviour, but often the simplest thing is to just
    run a Jaeger docker container on the same host:

    ```sh
    docker run -d --name jaeger -e COLLECTOR_ZIPKIN_HOST_PORT=:9411 -e COLLECTOR_OTLP_ENABLED=true -p 6831:6831/udp -p 6832:6832/udp -p 5778:5778 -p 16686:16686 -p 4317:4317 -p 4318:4318 -p 14250:14250 -p 14268:14268 -p 14269:14269 -p 9411:9411 jaegertracing/all-in-one:latest
    ```

    Then build rustup-init with tracing:

    ```sh
    cargo build --features=otel
    ```

    Run the operation you want to analyze:

    ```sh
    RUSTUP_FORCE_ARG0="rustup" ./target/debug/rustup-init show
    ```

    And [look in Jaeger for a trace](http://localhost:16686/search?service=rustup).

After some experimentation, it turns out that we can actually ship the tracing features by default without forcing the user to face OTEL connection errors on a daily basis.

To clarify, this does not mean Rustup is setting up a central (a.k.a. phone-home-style) telemetry mechanism: tracing will stay disabled by default unless RUST_LOG has been explicitly set.

Concerns

Should we eliminate RUSTUP_DEBUG in favor of RUST_LOG=trace? (Yes.)
~~Should we remove opentelemetry while keeping tracing (see #3788 (comment))? If we should, the otel feature should be renamed.~~ (Not yet, see #3803 (comment).)

djc · 2024-05-02T12:04:56Z

I think we should use a basic console-based tracing-subscriber setup:

use std::io::{self, Write};
use std::str;

// Installs a console subscriber for the current thread; it stays active
// until the returned guard is dropped.
pub(super) fn subscribe() -> tracing::subscriber::DefaultGuard {
    let sub = tracing_subscriber::FmtSubscriber::builder()
        .with_max_level(tracing::Level::TRACE)
        .with_writer(|| TestWriter)
        .finish();
    tracing::subscriber::set_default(sub)
}

// Writer that forwards formatted log lines to `print!`.
struct TestWriter;

impl Write for TestWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        print!(
            "{}",
            str::from_utf8(buf).expect("tried to log invalid UTF-8")
        );
        Ok(buf.len())
    }
    fn flush(&mut self) -> io::Result<()> {
        io::stdout().flush()
    }
}

This is what I've been using in Quinn for many years. Every test just starts by calling let _guard = subscribe();, which has been a highly effective tool for test observability.

If we do this we can massively simplify the scaffolding for traces/logging:

Get rid of all the opentelemetry dependencies, which I don't think we need
Remove the use of the otel feature
Remove the custom test macro
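
One detail of the snippet above is worth spelling out: `TestWriter` goes through `print!` rather than writing to `io::stdout()` directly, presumably because the default Rust test harness only captures output produced through the print macros, so log lines stay attached to the test that emitted them. The following is a minimal, std-only sketch of that writer with no `tracing` involved; the `emit` helper is hypothetical and exists only for illustration.

```rust
use std::io::{self, Write};
use std::str;

/// Writer that forwards bytes to `print!`, mirroring `TestWriter` above.
struct TestWriter;

impl Write for TestWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Output of `print!` is captured per test by the default test harness.
        print!("{}", str::from_utf8(buf).expect("tried to log invalid UTF-8"));
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        io::stdout().flush()
    }
}

/// Hypothetical helper: writes one line through any `Write` implementation
/// and returns the number of bytes written (message plus trailing newline).
fn emit<W: Write>(mut out: W, msg: &str) -> io::Result<usize> {
    out.write_all(msg.as_bytes())?;
    out.write_all(b"\n")?;
    out.flush()?;
    Ok(msg.len() + 1)
}
```

Calling `emit(TestWriter, "hello")` inside a test would print the line only if that test fails or the tests are run with `--nocapture`.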

rbtcollins · 2024-05-02T19:05:04Z

> Currently, helping out the Rustup team by enabling local tracing is quite a tedious process (esp. for community contributors), requiring rebuilding Rustup from the exact commit with an extra feature, otel:

I don't think this is true. We've not asked people to build with otel enabled that I'm remembering. The OS level traces we use to debug fundamental problems are from strace / truss and other similar tools. Otel / tracing! is not deployed widely enough within rustup to be a replacement for such things.

rbtcollins · 2024-05-02T19:07:05Z

> Get rid of all the opentelemetry dependencies, which I don't think we need
> Remove the use of the otel feature
> Remove the custom test macro

Please don't - while the OS level debugging is vital, for doing investigations on performance, having a nice report with spans and the detailed call tree is very useful, and since we configure it off by default it has very little overhead to maintain or build with. Really only the all-features-build test matters.

djc · 2024-05-02T19:33:09Z

> > Get rid of all the opentelemetry dependencies, which I don't think we need
> > Remove the use of the otel feature
> > Remove the custom test macro
>
> Please don't - while the OS level debugging is vital, for doing investigations on performance, having a nice report with spans and the detailed call tree is very useful, and since we configure it off by default it has very little overhead to maintain or build with. Really only the all-features-build test matters.

How many times have you used it in the past year? IMO, while the custom test macro and the opentelemetry dependencies may not impose run-time overhead on downstream users, they do impose significant maintenance overhead, which may not be warranted by the additional insight they give over tracing-subscriber's built-in output (and maybe tokio-console level output).

djc · 2024-05-02T19:34:24Z

> > Currently, helping out the Rustup team by enabling local tracing is quite a tedious process (esp. for community contributors), requiring rebuilding Rustup from the exact commit with an extra feature, otel:
>
> I don't think this is true. We've not asked people to build with otel enabled that I'm remembering. The OS level traces we use to debug fundamental problems are from strace / truss and other similar tools. Otel / tracing! is not deployed widely enough within rustup to be a replacement for such things.

IMO the important point is that we should have a user-facing solution like "enable RUST_LOG=trace and give us the output of that". That seems like a decent way to get better insight into problems that happen only in specific environments, which appear to be an important source of issues for rustup.

rami3l · 2024-05-03T05:49:50Z

I just checked and it looks like tokio-console doesn’t currently have a timeline view (tokio-rs/console#129), so I imagine opentelemetry and jaeger are here to stay for longer...

For now I plan to:

Ship tracing by default with a console-based subscriber.
If possible, reimplement our current logging system using that subscriber (with a fmt::layer() that mimics the original output style).
Keep the opentelemetry related stuff behind the otel feature.

More specifically, I imagine having multiple subscribers (tokio-rs/tracing#971) based on env vars and features:

A "classic" subscriber that targets process().stderr(), has a classic output format, will only print rustup log lines up to a certain level (info or verbose, depending on the input flags), and will be disabled if RUST_LOG is set.
A "tracing" subscriber that also targets process().stderr() and is not limited to rustup (so we could have tonic log lines as well, for example). Its activation is mutually exclusive with the "classic" subscriber, and the precise logging level will be controlled by RUST_LOG.
An OpenTelemetry subscriber (could be replaced by tokio-console in the future, but not just yet) available behind the otel feature, enabled simultaneously with the "tracing" subscriber.
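
To make the intended interplay concrete, here is a rough, std-only sketch of the selection logic described above. It is an illustration under assumptions, not the actual implementation: the `Subscriber` enum and the `active_subscribers` function are hypothetical names, the `otel` feature is modelled as a plain `bool`, and the OpenTelemetry subscriber is assumed to be active only alongside the "tracing" subscriber.

```rust
/// The three subscribers described in the plan above (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Subscriber {
    /// Classic rustup output format, rustup log lines only.
    Classic,
    /// `RUST_LOG`-controlled output, not limited to rustup's own log lines.
    Tracing,
    /// OpenTelemetry exporter, only available with the `otel` feature.
    OpenTelemetry,
}

/// Hypothetical selection logic.
///
/// `rust_log` is the value of `RUST_LOG` if it is set; `otel_feature` says
/// whether the binary was built with the `otel` feature.
fn active_subscribers(rust_log: Option<&str>, otel_feature: bool) -> Vec<Subscriber> {
    match rust_log {
        // `RUST_LOG` unset: only the classic subscriber is active.
        None => vec![Subscriber::Classic],
        // `RUST_LOG` set: the "tracing" subscriber replaces the classic one,
        // and the OpenTelemetry subscriber joins it if it was compiled in.
        Some(_) => {
            let mut subs = vec![Subscriber::Tracing];
            if otel_feature {
                subs.push(Subscriber::OpenTelemetry);
            }
            subs
        }
    }
}
```

In a real build the `otel_feature` argument would more likely come from something like `cfg!(feature = "otel")`, and each variant would map to a layer or subscriber being installed rather than to an enum value.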

Finally:

With the consistent use of RUST_LOG, RUSTUP_DEBUG should be retired accordingly.
We need to make sure this subscriber's use of CLI colors is coherent with the current system (incl. env variable controls via RUSTUP_TERM_COLOR, etc).
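
As an illustration of the color concern, the sketch below shows one way a subscriber's ANSI setting could be derived from RUSTUP_TERM_COLOR. It rests on assumptions: the accepted values are taken to be `always`, `never` and `auto`, anything else (including an unset variable) is treated as `auto`, and `ColorChoice`, `parse_term_color` and `use_color` are hypothetical names rather than existing rustup items.

```rust
/// Hypothetical representation of the RUSTUP_TERM_COLOR setting.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ColorChoice {
    Always,
    Never,
    Auto,
}

/// Parses the value of RUSTUP_TERM_COLOR.
/// Assumption: unset or unrecognised values fall back to `Auto`.
fn parse_term_color(value: Option<&str>) -> ColorChoice {
    match value {
        Some("always") => ColorChoice::Always,
        Some("never") => ColorChoice::Never,
        _ => ColorChoice::Auto,
    }
}

/// Decides whether the subscriber should emit ANSI colors, given the parsed
/// choice and whether stderr is attached to a terminal.
fn use_color(choice: ColorChoice, stderr_is_tty: bool) -> bool {
    match choice {
        ColorChoice::Always => true,
        ColorChoice::Never => false,
        ColorChoice::Auto => stderr_is_tty,
    }
}
```

The resulting `bool` is the kind of value that could be handed to a formatter's ANSI switch, so that the new subscriber and the existing output agree on when to use colors.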

Waiting for #3367 might be worthwhile, since this will change the startup process.

djc · 2024-05-30T13:23:31Z

#3367 has been merged, would be good to rebase this!


rami3l added this to the 1.28.0 milestone May 2, 2024

rami3l mentioned this pull request May 2, 2024

Simplify download and/or TLS backends #3790

Open

7 tasks


rami3l force-pushed the refactor/tracing branch from 7fb8f99 to 078795a Compare May 3, 2024 12:43

rami3l mentioned this pull request May 5, 2024

error: toolchain 'stable-x86_64-pc-windows-msvc' is not installable with rustup 1.27.0 on wine #3807

Closed

2 tasks

rami3l force-pushed the refactor/tracing branch 3 times, most recently from f50ac2f to 2ff50f8 Compare May 5, 2024 12:13

rami3l mentioned this pull request May 7, 2024

fix(filesource): make some constructs only available via the test feature #3811

Merged

This was referenced May 15, 2024

#3827 has broken the ETA format when downloading/installing components #3828

Closed

refactor(filesource): replace repetitive #[cfg()] usages with cfg_if!{} #3832

Closed

rami3l added 3 commits May 30, 2024 21:32

refactor(ci): use more target_cargo() in run.bash

83a7a66

chore(deps): make tracing* hard requirements

af02471

feat(rustup-init): (wip) use process().stderr() as a tracing output

07d4166

rami3l force-pushed the refactor/tracing branch from 2ff50f8 to 66501b8 Compare May 30, 2024 14:30

rami3l added 2 commits May 31, 2024 08:54

feat(rustup-init): (wip) use process().stderr() as a tracing output, pt. 2

1f0c7bd

refactor(log): reimplement log using tracing

1696372

rami3l force-pushed the refactor/tracing branch from 66501b8 to 1696372 Compare May 31, 2024 00:54

rami3l added 2 commits May 31, 2024 09:26

test: (wip) adapt unit tests to use tracing output

d297401

test: (wip) adapt unit tests to use tracing output, pt. 2

7fa691a


