Parse '-' in numeric values used in test case #19

frondeus · 2019-10-01T11:09:05Z

For:

#[test_case(15, 15)]
#[test_case(-15, 15)]

fn test_abs(value: f64, expected: f64) {
   assert_eq!(value.abs(), expected)
}

We get error:

error[E0428]: the name _15_15 is defined multiple times
--> src/test.rs
|
| #[test_case(15, 15)]
| ^^^^^^^^^^^^^^^^^^^^
| |
| previous definition of the value _15_15 here
| _15_15 redefined here
|
= note: _15_15 must be defined only once in the value namespace of this mod

The text was updated successfully, but these errors were encountered:

frondeus · 2019-10-01T11:09:30Z

synek317/test-case-derive#1

AugustoFKL · 2024-01-04T15:51:01Z

@luke-biel @frondeus I'm working on this issue, FYI.

AugustoFKL · 2024-01-20T18:51:15Z

Sorry the delay

@frondeus, @luke-biel, after some research, I figured that the main problem is how test names are handled by the escape_test_name method:

pub fn escape_test_name(input: impl AsRef<str>) -> Ident {
    if input.as_ref().is_empty() {
        return Ident::new("_empty", Span::call_site());
    }

    let mut last_under = false;
    let mut ident: String = input
        .as_ref()
        .to_ascii_lowercase()
        .chars()
        .filter_map(|c| match c {
            c if c.is_alphanumeric() => {
                last_under = false;
                Some(c.to_ascii_lowercase())
            }
            _ if !last_under => {
                last_under = true;
                Some('_')
            }
            _ => None,
        })
        .collect();

    if !ident.starts_with(|c: char| c == '_' || c.is_ascii_alphabetic()) {
        ident = format!("_{ident}");
    }

    Ident::new(&ident, Span::call_site())
}

There are two things that can cause problems, IMHO: The "None" of the match inside the filter_map and the if at before returning.

I imagine that the _ is added before the test name to avoid test names starting with invalid characters for test names, is that correct?

To be honest I only found two possible solutions to the general problem of conflicting names:

The library start using additional context information to generate test names.
We add an prefix/postfix to all tests to guarantee the uniqueness of names.

My proposal is to start adding an UUID to the end of all test names. This would be the less disruptive change when it comes to the current codebase. Adding contextual information would take a lot more effort and time, so the UUID would work as a bandaid to the problem in the meantime.

luke-biel · 2024-01-25T06:20:30Z

Hmm, I'm not sold on idea of adding an UUID, mainly because it's not stable, it'd change for every compilation and tools would not understand it (cargo would report to rerun a test with --test test_name_my_old_uuid, but next run would require "new_uuid"). We need something stable. May I suggest span or argument hashing (eg.: SHA1)?

AugustoFKL · 2024-01-25T14:51:11Z

@luke-biel I agree.

Actually, instead of hashing, I thought about using enconding:

fn encode_test_name(name: &str) -> String {
    // Encode the normalized string to match the allowed characters in the test name
    let mut encoded = BASE64_URL_SAFE_NO_PAD.encode(name.as_bytes());

    // Replace not acceptable characters with an identifiable string
    encoded = encoded.replace('-', "__dash__");

    // Add an underscore to the beginning of the string to make it a valid Rust
    // identifier when starting with a digit.
    encoded.insert(0, '_');

    encoded
}

fn decode_test_name(name: &str) -> Result<String> {
    // Remove the underscore from the beginning of the string
    let name = name.trim_start_matches('_');

    // Reverse the replacement of not acceptable characters
    let name = name.replace("__dash__", "-");

    // Decode the test name
    let decoded = BASE64_URL_SAFE_NO_PAD.decode(name.as_bytes())?;

    // Convert the decoded bytes to a string
    let decoded_str = String::from_utf8(decoded)?;

    Ok(decoded_str)
}

This code generates names based on the BASE64 encoding. I was able to verify and it supports the whole unicode specification (from 0x0000 to 0xFFFF), besides surrogates. Additionally, the encoded values are guaranteed to be unique, which means that will be no more problems with 'A' or 'a' or anything like that.

My only concern is that the names being encoded makes the tracing from the test results to actual tests pretty difficult. Therefore, I am looking into having a way to inform the 'name' (the one given by the user, if any, or the arguments that were used).

Does that sound reasonable?

Does that sound acceptable?

luke-biel · 2024-01-25T23:22:56Z

I'm a bit worried with ergonomics of this. How will this look like in the test output? If we were dealing with custom test harness we could write there anything, but in our scenario we have to rely on rust's mechanisms - and these display module path. Appending something at the end (or beginning) of the name would be a bit less confusing, even if it was that encoded string (though probably a bit long), test_case line number or hashed "something". Basically, out of all these options we have to choose something which will give most information to the user while avoiding name conflicts.

olson-sean-k · 2024-02-16T02:44:54Z

Just wanted to mention that I've also encountered this issue, though with &str arguments:

#[test_case("a*")]
#[test_case("*a")]
fn test(expression: &str) { ... }

It's very easy for this to occur with text, since it isn't too unlikely that it won't constitute a valid Rust identifier. A minimal tag derived from parameters sounds like a great idea, especially if it's stable. Maybe a truncated hash could provide enough disambiguation. From the example above, I think names like this could work well:

tests::test::_f8c61a7__a_expects
tests::test::_0b548fa__a_expects

That isn't too noisy in my opinion and the original format remains intact. Of course, it may be difficult to differentiate tests like this at a glance, but I think it's reasonable to lean on meaningful output when such tests fail to make the distinction sufficiently clear.

Thanks for looking into this!

IohannRabeson · 2024-02-21T04:27:51Z

Why not just adding an arbitrary index after the name?

tests::test::__a_0
tests::test::__a_1
tests::test::__b_2
tests::test::__b_3

It seems enough to make a distinction. I don't think there is a need for a hash.

luke-biel · 2024-02-21T13:01:43Z

Hmm, thinking more about it, encoding as per your suggestion @AugustoFKL would work, one caveat as @olson-sean-k mentioned, we'd have to cover a lot more than just dash. Most likely all non-ascii characters. Maybe use urlencode then? But we'd need to do something with % characters then.
Number appending works fine too, we can just add index in test_case vector to the function name.

To be honest, I'm fine with any solution as long as it's consistent between runs. Be it hashing, truncated hashes, indexes...

frondeus · 2024-02-21T13:23:35Z

Silly idea - what about line numbers :)?

IohannRabeson · 2024-02-21T19:30:41Z

Silly idea - what about line numbers :)?

Line numbers are not very stable, any change in the same file above the test can change them.

olson-sean-k · 2024-02-22T00:41:34Z

Why not just adding an arbitrary index after the name?

Assuming the cases can be stably enumerated in the procedural macro implementation, I think a case index is the best suggestion. By default, a very simple but reliable scheme is probably best. Anything else should probably be opt-in.

Sometimes the identifiers constructed from case parameters work pretty well... but often they aren't very legible and sometimes they break! It doesn't generalize and so doesn't make for a good default, I think.

GabrielSimonetto mentioned this issue Oct 31, 2021

Bad custom test names should not fail silently #72

Open

luke-biel added enhancement New feature or request spike Need to research topic more to come up with a solution labels Jan 28, 2022

luke-biel added this to To do in 2.1.0 Apr 22, 2022

luke-biel moved this from To do to Backlog in 2.1.0 May 12, 2022

This was referenced Jan 24, 2024

Uppercase and lowercase generated test function collide #108

Open

Conditional flags for ignore and inconclusive #141

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Parse '-' in numeric values used in test case #19

Parse '-' in numeric values used in test case #19

frondeus commented Oct 1, 2019

frondeus commented Oct 1, 2019

AugustoFKL commented Jan 4, 2024

AugustoFKL commented Jan 20, 2024

luke-biel commented Jan 25, 2024 •

edited

AugustoFKL commented Jan 25, 2024

luke-biel commented Jan 25, 2024

olson-sean-k commented Feb 16, 2024

IohannRabeson commented Feb 21, 2024

luke-biel commented Feb 21, 2024 •

edited

frondeus commented Feb 21, 2024

IohannRabeson commented Feb 21, 2024

olson-sean-k commented Feb 22, 2024

Parse '-' in numeric values used in test case #19

Parse '-' in numeric values used in test case #19

Comments

frondeus commented Oct 1, 2019

frondeus commented Oct 1, 2019

AugustoFKL commented Jan 4, 2024

AugustoFKL commented Jan 20, 2024

luke-biel commented Jan 25, 2024 • edited

AugustoFKL commented Jan 25, 2024

luke-biel commented Jan 25, 2024

olson-sean-k commented Feb 16, 2024

IohannRabeson commented Feb 21, 2024

luke-biel commented Feb 21, 2024 • edited

frondeus commented Feb 21, 2024

IohannRabeson commented Feb 21, 2024

olson-sean-k commented Feb 22, 2024

luke-biel commented Jan 25, 2024 •

edited

luke-biel commented Feb 21, 2024 •

edited