Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce the use of unsafe in the ecosystem #19

Open
Shnatsel opened this issue Jan 14, 2019 · 6 comments
Open

Reduce the use of unsafe in the ecosystem #19

Shnatsel opened this issue Jan 14, 2019 · 6 comments

Comments

@Shnatsel
Copy link
Member

Many widely used libraries use unsafe code where it's not strictly necessary. Typically this is done for performance reasons, i.e. there are currently no safe abstractions to achieve the goal safely and efficiently. The goal here is to reduce or eliminate the use of unsafe code throughout the ecosystem where it is not strictly necessary without regressing correctness or performance.

The per-crate process for this looks roughly like this:

  1. Investigate why unsafe is used in the first place. git blame usually helps with that by identifying a commit where a specific line is introduced.
  2. If it's because of performance, rig up a benchmarking harness to evaluate changes if it's not already present. This should take ~15 minutes, see criterion user guide. It conveniently supports comparison against a baseline.
  3. Try to rewrite unsafe code into safe
  4. Document your findings - what worked, what didn't, what additional safe abstractions could solve this. This can be used as an example, but you don't have to go into that much detail.

We want to run a lot of crates through this, so we also have some coordination tasks:

  1. Select high-value crates for analysis based on some criteria, like download count or some such.
  2. Set up a task tracker so that people can claim certain crates to avoid duplication of effort. A github repo owned by WG would do.
  3. Write a more clear guide and perhaps some samples so that the effort can be more widely advertised.
@DevQps
Copy link
Contributor

DevQps commented Mar 15, 2019

I think this is a nice goal! I guess it would be nice if we had some way to validate our results at the end of this year.

I saw Cargo Geiger being mentioned somewhere and I think it is a very nice tool to use to verify our results.

Maybe we could automatize running Cargo Geiger on the top X most downloaded crates on crates.io and store the results? This way we could see by how much unsafe statements would be reduced or increased.

Personally I also think that if we focus on creating good (safe) abstractions of unsafe operations or provide safe alternatives that are just as performant the amount of unsafe usage will automatically go down. If we could succeed in creating such things, we would only have to point them out to crate authors.

@joshlf
Copy link
Contributor

joshlf commented Mar 25, 2019

I don't have time to work on this right now, but here it is in case somebody else does:

I recently came across some code in the image crate which uses unsafe to extend the length of a vector, exposing uninitialized elements, and then initialize those elements using some fairly complex, difficult-to-reason-about code. This is done for performance reasons. Here's an example of where they do this.

I wonder if it would be possible to create a utility that allows doing this safely. I'm not sure exactly how it would work, but the idea would be to architect it so that you only modify the length once, and then allow read/write access to the initialized part of the vector while only allowing an initialize operation on elements in the uninitialized vector, and having that operation also have the side-effect of growing the initialized part of the vector. As a bonus, you could probably have a method that would initialize them all at once by calling a callback the right number of times or something like that, and such a method could probably be written to elide even more bounds checking.

@nico-abram
Copy link

@Shnatsel
Copy link
Member Author

We're starting a project to address this, initial results are very promising: https://github.com/rust-secure-code/safety-dance

@pinkforest
Copy link

pinkforest commented Nov 3, 2021

I am working on geiger.rs (yeah I bought the domain...) that geigers everything in crates.io

I also created a proposal to add hookpoints for metadata to link unsafe related blocks to Issues that can be checked against whether these are Closed status which could indicate someone has validated them

It also allows better organisation and visibility at geiger.rs I can track the blobs and see if the issues are updated if the unsafe blobs are changed somehow and then bot-pester for the crate repo owner to re-check whether the linked issue needs to be re-opened

geiger-rs/cargo-geiger#213

@Shnatsel
Copy link
Member Author

Shnatsel commented Nov 4, 2021

I am working on geiger.rs (yeah I bought the domain...) that geigers everything in crates.io

That is very impressive, but also going to end up quite expensive to run, because there is a lot of code on crates.io and you will need to compile all of it.

Also, there are security concerns because Cargo can run arbitrary code (e.g. proc macros, build scripts) during compilation.

Something like https://github.com/avadacatavra/unsafe-unicorn would be a cheaper but less precise alternative. This completes for the latest version of every crate on crates.io within hours on a regular desktop, but performs only rather basic textual analysis.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants