Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle extended path length syntax in automatic [CWD] detection #153

Open
kleinesfilmroellchen opened this issue Oct 15, 2022 · 5 comments
Labels
A-snapbox Area: snapbox package A-trycmd Area: trycmd package

Comments

@kleinesfilmroellchen
Copy link

Windows has the extended path length syntax to work around legacy file path restrictions. These paths start with the string \\?\ followed by a normal path; more details can be found here. The problem is that Rust's std::fs::canonicalize creates these kinds of paths on Windows (which makes sense given that paths over 260 characters are otherwise disallowed), but trycmd can't handle them. If my binary under test's output, on Windows, includes something like:

\\?\C:\Path\To\Working\Directory\Some\File.txt

trycmd will transform this into:

//?/[CWD]/Some/File.txt

However, on Posix-like systems, my binary's output looks like this:

/Path/To/Working/Directory/Some/File.txt

as the behaviour of fs::canonicalize is different (on all systems, canonicalization is handled by the operating system, in the case of Posix, via realpath). Therefore, trycmd "compresses" that into

[CWD]/Some/File.txt

Depending on whether the test was blessed on Windows or a Posix-like system, the tests will therefore fail on the other family of systems.

A workaround is of course to discard extended length path syntax in my binary before printing the path or file name, but as stated above that can lead to other problems. I'd like trycmd to be able to handle this; i.e. collapse the leading extended path syntax specifier into [CWD] if it is found before the normal working directory path.

kleinesfilmroellchen added a commit to kleinesfilmroellchen/spcasm that referenced this issue Oct 15, 2022
@epage
Copy link
Contributor

epage commented Oct 17, 2022

How come you are showing to users the unc syntax? Normally I would expect users to see the more common output.

As for resolving this, the challenge is dealing with both kinds of path syntaxes I think I'd like to keep the last-write-wins for insert_var so things can be overriden.

We could possibly add [UNC_ROOT] and [UNC_CWD]. These will run into the same annoyances as [EXE] where running on Linux gets one behavior ([ROOT]) but running on windows will need something else ([UNC_ROOT]) and annoys developers/contributors.

@kleinesfilmroellchen
Copy link
Author

kleinesfilmroellchen commented Oct 19, 2022

How come you are showing to users the unc syntax? Normally I would expect users to see the more common output.

I'm not sure what is wrong about extended path syntax or UNC paths; some paths require it (e.g. network shares, WSL directories) and my users are developers. Also, as said, non-extended paths have some limitations and nobody knows which programs will ignore those limitations nowadays and which don't.

We could possibly add [UNC_ROOT] and [UNC_CWD]. These will run into the same annoyances as [EXE] where running on Linux gets one behavior ([ROOT]) but running on windows will need something else ([UNC_ROOT]) and annoys developers/contributors.

Even though this fix is plausible, I'd like to know what the problem is with merging that into [CWD]. After all, absolute paths to the current working directory start with a / on Unix systems and never with a \. And on Windows, non-extended paths start with A-Z, so I'm confident that all three cases can be distinguished just by considering the first couple of characters.

@epage
Copy link
Contributor

epage commented Oct 19, 2022

I'm not sure what is wrong about extended path syntax or UNC paths; some paths require it (e.g. network shares, WSL directories) and my users are developers. Also, as said, non-extended paths have some limitations and nobody knows which programs will ignore those limitations nowadays and which don't.

Yes, there are cases where UNC paths are the only representation (ignoring length). When UNC is being used for length or "just because", that is not a representation that the user expects or is used to and its better to keep the path in a user friendly way.

Even though this fix is plausible, I'd like to know what the problem is with merging that into [CWD]. After all, absolute paths to the current working directory start with a / on Unix systems and never with a . And on Windows, non-extended paths start with A-Z, so I'm confident that all three cases can be distinguished just by considering the first couple of characters.

This assumes we are doing a hard coded parsing of the content. Instead, we are doing a straight variable substitution.

@kleinesfilmroellchen
Copy link
Author

This assumes we are doing a hard coded parsing of the content. Instead, we are doing a straight variable substitution.

Thanks for clearing that up, I'm not familiar with the inner workings of the library. Then I'm okay with this fix.

@epage
Copy link
Contributor

epage commented Oct 21, 2022

If we do separate variables, an interesting aspect is what we should match. I feel like I'd prefer we gave precedence to CWD and ROOT over the UNC versions but, as you saw, the UNC version includes the non-UNC version which makes that more difficult.

@epage epage added A-snapbox Area: snapbox package A-trycmd Area: trycmd package labels Feb 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-snapbox Area: snapbox package A-trycmd Area: trycmd package
Projects
None yet
Development

No branches or pull requests

2 participants