Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test marker traits more rigorously on public API types #756

Open
BurntSushi opened this issue Apr 8, 2021 · 17 comments
Open

test marker traits more rigorously on public API types #756

BurntSushi opened this issue Apr 8, 2021 · 17 comments

Comments

@BurntSushi
Copy link
Member

Since the regex crate uses interior mutability and a bit of synchronization, it follows that marker traits like Send, Sync, UnwindSafe and RefUnwindSafe are quite relevant and can change based on internal implementation details. #749 for example, removed the thread_local dependency and got a bit more rigorous about these marker traits on the main Regex/RegexSet types, but stopped there.

It turns out other types are impacted too. @kangalioo noted that, for example, Send was not implemented on the CaptureMatches iterator in regex 1.3.9, but it is now implemented in regex 1.4.5. I'm actually not sure why it wasn't implemented before. The regex crate doesn't use any non-Send stuff, so I'm not sure where that was coming from. I haven't investigated yet.

Either way, this is a compatibility hazard. Going from !Send to Send is fine, but it seems just as likely that we could regress too. The only way I can think to solve this is to assert expected marker traits for all public API types. It seems kind of absurd and tedious, but I don't have any better ideas.

@alexfertel
Copy link
Contributor

Can I take a jab at implementing these tests? If it's ok, can you give me some pointers on where I can find similar tests?

@BurntSushi
Copy link
Member Author

I think tests/test_default.rs is the place. You'll want to look for oibits, which is legacy terminology for marker traits that meant "Opt-In Built-In Traits."

Ideally there would be some way to write a test (maybe using shell script or Python, not sure) that would fail in the case of a public API type that doesn't have an oibit test. That way, if a new public API type is added, we're forced to add an oibit test for it. But maybe this is too much machinery.

@alexfertel
Copy link
Contributor

Thank you for the quick response!

So, if I understood correctly, we want to avoid having to repeat groups of assertions like:

    assert_send::<Regex>();
    assert_sync::<Regex>();
    assert_unwind_safe::<Regex>();
    assert_ref_unwind_safe::<Regex>();
    assert_send::<RegexBuilder>();
    assert_sync::<RegexBuilder>();
    assert_unwind_safe::<RegexBuilder>();
    assert_ref_unwind_safe::<RegexBuilder>();

by writing a script that takes a list (or somehow knows about every public API type) and generates the list of assertions?

Sounds good to me, and it is very interesting. Let me know if this is the case.

@BurntSushi
Copy link
Member Author

While avoiding repetition is a good idea, that should probably be done with a macro.

The purpose of the script is not about repetition. It's about automation. Consider the case of when a new public API type is added. How does that person know to go add it to the oibit traits? They don't. It would be good to have a test that fails if a public API type exists and is not in the oibit test.

With that said, I do not want anything that is terribly complex, difficult to maintain or brings in new dependencies. If there isn't a simple solution for it, then it might not be worth it. If you do want to pursue this extra automation step, then please discuss your plan with me before you invest too much effort.

But the main objective is to get oibit tests for each public API type.

@alexfertel
Copy link
Contributor

alexfertel commented Jun 30, 2021

Yeap, I agree. I also thought about a macro but wasn't sure. 😅

Ok, so, then this is divided into two tasks:

Tasks

  1. Tedious repetition of the group of assertions for each public API type 👍🏻 (Baseline)
  2. Some way of scanning public API types so we can generate tests that fail when types aren't in the oibit test. (Needs research)
  3. Add a macro to avoid repetition. (I think we don't need that, see Ideas)

Constraints for the automation:

  • Reasonably simple
  • Maintainable
  • Zero dependencies

Ideas

My idea for task 2 right now is to check how does documentation for types gets generated, and maybe somehow extract the types from there? I'm pretty new to Rust so I don't know what introspection abilities it has.

I think task 3 might be avoided by doing something like:

fn assert_oibits<T: Send + Sync + UnwindSafe + RefUnwindSafe>() {}

assert_oibits::<Regex>();

Let me know if all of this makes sense.

@alexfertel
Copy link
Contributor

What I got for task 2 so far is:

  1. Writing a Bash or Python script that generates docs and parses the Auto Trait Implementations section of each type to see if it implements what we want.
  2. Adding a separate crate that depends on some parser, parses the lib and generates assertions.
  3. static_assertions

Point 2 seems way overcomplicated. Points 2 and 3 add dependencies. So, I think the best one, right now, is point 1.

@BurntSushi
Copy link
Member Author

Combining the marker traits into one routine is a good approach. I think all public API types in this crate should satisfy all four of those? Hmm, maybe not, the iterators might not.

Ideally, rustdoc (perhaps through cargo doc) could be convinced to just give us a simple list of every public API type in a crate. @GuillaumeGomez or @jyn514, is that possible?

@jyn514
Copy link
Member

jyn514 commented Jul 1, 2021

@BurntSushi you might be able to get something to work with cargo rustdoc -- -Z unstable-options --output-format json. I'm not intimately familiar with the JSON schema so you'll need to play around with it a bit.

@BurntSushi
Copy link
Member Author

@jyn514 Thank you! @alexfertel perhaps give that a whirl?

@jyn514
Copy link
Member

jyn514 commented Jul 1, 2021

See also the ugly hack tokio uses (and the discussion about showing this in rustdoc): rust-lang/rfcs#2963 (comment)

@alexfertel
Copy link
Contributor

Yeap, I'll give it a try!

@alexfertel
Copy link
Contributor

@BurntSushi Yeap, I managed to get the public types from the generated JSON. 😁 🎉

I'll try to have a Python script working in the next couple of days, and I'll submit a PR. Would that be ok?

@BurntSushi
Copy link
Member Author

SGTM! Thanks. :)

@alexfertel
Copy link
Contributor

@BurntSushi I will use the function assert_oibits in the generated tests. The reason being that while inspecting the generated docs, I found that, as of today, they are implemented by all of the public API types (I am not sure if this is the correct terminology).

fn assert_oibits<T: Send + Sync + UnwindSafe + RefUnwindSafe>() {}

@BurntSushi
Copy link
Member Author

Ah that's great. Makes it easy.

I suppose it would be good to switch from oibits to marker_traits. So, assert_marker_traits instead of assert_oibits.

@alexfertel
Copy link
Contributor

Well, scratch that last comment, I generated the tests and some of the types aren't Sync, as you thought. The issue is that I was checking for the existence of the trait as a child node in the JSON. However, there is an is_negative flag that I was ignoring.

My idea was to check all the traits for every type, but now it's not so easy. We're not solving the problem of "Having a test that fails if a public API type exists and is not in the oibit test", because each time we run the script, the tests have to be generated based on the traits implemented by the types, not just the types.

The solution I see is the following: The first time we generate the tests, we maintain a map of the traits implemented by the types. We can only insert to this map if there are new types exposed by the crate.

Do you see another way of solving this?

@BurntSushi
Copy link
Member Author

@alexfertel I think the key simplification here is that we don't need a robust script to generate the tests. Technically, writing them by hand would be just fine. What we need is way to check for the existence of a test. Since we control what the tests look like, that seems solvable with grep. If a test doesn't exist for a particular public API type, then CI should fail. That's it. We don't need a script that re-generates the tests. A human can add them incrementally. In particular, the regex public API experiences very few changes. For example, I don't think there have been any new public API types added in the last couple years, so there's no need to automate that part. Maybe we automate the initial creation of it, and it would be nice to at least document a key shell command you used to do that piece. But for example, you could assume every type implements all four marker traits, and then just hand massage the types that don't. That's totally fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants