Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix parsing of git output with unusual characters #509

Merged
merged 1 commit into from
Jul 30, 2020
Merged

Fix parsing of git output with unusual characters #509

merged 1 commit into from
Jul 30, 2020

Conversation

pawamoy
Copy link
Contributor

@pawamoy pawamoy commented Jul 29, 2020

I used the online editor to send the patch. Let me know if you'd like I add a test for it!

We could also consider moving the zsplit in the utils module.


Commit message:

On Windows, all files are "executable". Therefore, to know if a file is supposed to be executed, we check how its attributes were recorded by git: we run a git ls-files command in a subprocess.

By default, this command outputs information on multiple lines (file and their data separated by newlines). When a file contains an unusual character, the character is escaped with an integer sequence (such as \303\261), and git wraps the whole filename
in double-quotes because of the backslashes. It breaks the current code because we try to open the filename containing the double-quotes: it doesn't exist, of course.

Instead of trying to fix this special case by removing the double-quotes, and breaking other cases (a double-quote is a valid filename character on Linux), we tell git to separate each item with the null character \0 instead of a new line \n, with the option -z. With this option, git doesn't escape unusual characters with integer sequence, so the output is fixed, and we parse it by splitting on \0 instead of \n.

Fixes #508.

@pawamoy
Copy link
Contributor Author

pawamoy commented Jul 29, 2020

I used a ternary operator in the zsplit method to avoid coverage falling under 100. Unless we move zsplit in the utils module, I don't think testing it to make sure the else return [] branch is met with an input like \0\0 is useful, but if you still want it, I can add it 🙂

@asottile
Copy link
Member

I used a ternary operator in the zsplit method to avoid coverage falling under 100. Unless we move zsplit in the utils module, I don't think testing it to make sure the else return [] branch is met with an input like \0\0 is useful, but if you still want it, I can add it 🙂

testing this is important, it's the entire reason it's a separate function and not just .split('\0')

@pawamoy
Copy link
Contributor Author

pawamoy commented Jul 29, 2020

Fair enough, I'll write some tests 🙂

@pawamoy
Copy link
Contributor Author

pawamoy commented Jul 29, 2020

Test correctly failing without the fix:

_____________ test_check_git_filemode_passing_unusual_characters ______________

tmpdir = local('C:\\Users\\user\\AppData\\Local\\Temp\\1\\pytest-of-unknown\\pytest-6\\test_check_git_filemode_passin1')

    def test_check_git_filemode_passing_unusual_characters(tmpdir):
        with tmpdir.as_cwd():
            cmd_output('git', 'init', '.')

            f = tmpdir.join('ma▒ana.txt')
            f.write('#!/usr/bin/env bash')
            f_path = str(f)
            cmd_output('chmod', '+x', f_path)
            cmd_output('git', 'add', f_path)
            cmd_output('git', 'update-index', '--chmod=+x', f_path)

            files = (f_path,)
>           assert check_executables_have_shebangs._check_git_filemode(files) == 0

tests\check_executables_have_shebangs_test.py:88:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pre_commit_hooks\check_executables_have_shebangs.py:44: in _check_git_filemode
    has_shebang = _check_has_shebang(path)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

path = '"ma\\303\\261ana.txt"'

    def _check_has_shebang(path: str) -> int:
>       with open(path, 'rb') as f:
E       OSError: [Errno 22] Invalid argument: '"ma\\303\\261ana.txt"'

pre_commit_hooks\check_executables_have_shebangs.py:53: OSError

Copy link
Member

@asottile asottile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Windows, all files are "executable".
Therefore, to know if a file is supposed to be executed,
we check how its attributes were recorded by git:
we run a `git ls-files` command in a subprocess.

By default, this command outputs information
on multiple lines (file and their data separated by newlines).
When a file contains an unusual character,
the character is escaped with an integer sequence
(such as `\303\261`), and git wraps the whole filename
in double-quotes because of the backslashes.
It breaks the current code because we try to open
the filename containing the double-quotes:
it doesn't exist, of course.

Instead of trying to fix this special case by removing
the double-quotes, and breaking other cases
(a double-quote is a valid filename character on Linux),
we tell git to separate each item with the null character `\0`
instead of a new line `\n`, with the option `-z`.
With this option, git doesn't escape unusual characters
with integer sequence, so the output is fixed, and we
parse it by splitting on `\0` instead of `\n`.

Fixes #508.
@asottile asottile merged commit 9eab799 into pre-commit:master Jul 30, 2020
yajo pushed a commit to copier-org/copier that referenced this pull request Aug 3, 2020
- Ignore errors when executing `shutil.rmtree()`, because it seems like it's common to fail when deleting git repos on Windows, and since these are temp files, we don't really care that much there's garbage left. Any good OS should clean the temp folders automatically.
- Always find Jinja templates in `PosixPath` mode.
- Ignore `OSError` when trying to enter a possibly git root directory. This is yielded by Windows when the path is a URL and we don't really care about it.
- Fix some tests with non-windows haradcoded stuff.
- Fix a test that was using a Bash script. Modified to be Python, which should work fine cross-system.
- Do not modify EOL in CI.
- Use python executable in tests instead of python3.
- Update pre-commit versions to include pre-commit/pre-commit-hooks#509.
github-actions bot pushed a commit to copier-org/copier that referenced this pull request Aug 3, 2020
- Ignore errors when executing `shutil.rmtree()`, because it seems like it's common to fail when deleting git repos on Windows, and since these are temp files, we don't really care that much there's garbage left. Any good OS should clean the temp folders automatically.
- Always find Jinja templates in `PosixPath` mode.
- Ignore `OSError` when trying to enter a possibly git root directory. This is yielded by Windows when the path is a URL and we don't really care about it.
- Fix some tests with non-windows haradcoded stuff.
- Fix a test that was using a Bash script. Modified to be Python, which should work fine cross-system.
- Do not modify EOL in CI.
- Use python executable in tests instead of python3.
- Update pre-commit versions to include pre-commit/pre-commit-hooks#509.
yajo pushed a commit to copier-org/copier that referenced this pull request Aug 3, 2020
- Ignore errors when executing `shutil.rmtree()`, because it seems like it's common to fail when deleting git repos on Windows, and since these are temp files, we don't really care that much there's garbage left. Any good OS should clean the temp folders automatically.
- Always find Jinja templates in `PosixPath` mode.
- Ignore `OSError` when trying to enter a possibly git root directory. This is yielded by Windows when the path is a URL and we don't really care about it.
- Fix some tests with non-windows haradcoded stuff.
- Fix a test that was using a Bash script. Modified to be Python, which should work fine cross-system.
- Do not modify EOL in CI.
- Use python executable in tests instead of python3.
- Update pre-commit versions to include pre-commit/pre-commit-hooks#509.
yajo pushed a commit to copier-org/copier that referenced this pull request Aug 3, 2020
- Ignore errors when executing `shutil.rmtree()`, because it seems like it's common to fail when deleting git repos on Windows, and since these are temp files, we don't really care that much there's garbage left. Any good OS should clean the temp folders automatically.
- Always find Jinja templates in `PosixPath` mode.
- Ignore `OSError` when trying to enter a possibly git root directory. This is yielded by Windows when the path is a URL and we don't really care about it.
- Fix some tests with non-windows haradcoded stuff.
- Fix a test that was using a Bash script. Modified to be Python, which should work fine cross-system.
- Do not modify EOL in CI.
- Use python executable in tests instead of python3.
- Update pre-commit versions to include pre-commit/pre-commit-hooks#509.
- Disable autorebasing.
yajo pushed a commit to copier-org/copier that referenced this pull request Aug 4, 2020

Unverified

No user is associated with the committer email.
- Ignore errors when executing `shutil.rmtree()`, because it seems like it's common to fail when deleting git repos on Windows, and since these are temp files, we don't really care that much there's garbage left. Any good OS should clean the temp folders automatically.
- Always find Jinja templates in `PosixPath` mode.
- Ignore `OSError` when trying to enter a possibly git root directory. This is yielded by Windows when the path is a URL and we don't really care about it.
- Fix some tests with non-windows haradcoded stuff.
- Fix a test that was using a Bash script. Modified to be Python, which should work fine cross-system.
- Do not modify EOL in CI.
- Use python executable in tests instead of python3.
- Update pre-commit versions to include pre-commit/pre-commit-hooks#509.
- Disable autorebasing.
yajo pushed a commit to copier-org/copier that referenced this pull request Aug 4, 2020

Unverified

No user is associated with the committer email.
- Ignore errors when executing `shutil.rmtree()`, because it seems like it's common to fail when deleting git repos on Windows, and since these are temp files, we don't really care that much there's garbage left. Any good OS should clean the temp folders automatically.
- Always find Jinja templates in `PosixPath` mode.
- Ignore `OSError` when trying to enter a possibly git root directory. This is yielded by Windows when the path is a URL and we don't really care about it.
- Fix some tests with non-windows haradcoded stuff.
- Fix a test that was using a Bash script. Modified to be Python, which should work fine cross-system.
- Do not modify EOL in CI.
- Use python executable in tests instead of python3.
- Update pre-commit versions to include pre-commit/pre-commit-hooks#509.
- Disable autorebasing.
yajo pushed a commit to copier-org/copier that referenced this pull request Aug 4, 2020

Unverified

No user is associated with the committer email.
- Ignore errors when executing `shutil.rmtree()`, because it seems like it's common to fail when deleting git repos on Windows, and since these are temp files, we don't really care that much there's garbage left. Any good OS should clean the temp folders automatically.
- Always find Jinja templates in `PosixPath` mode.
- Ignore `OSError` when trying to enter a possibly git root directory. This is yielded by Windows when the path is a URL and we don't really care about it.
- Fix some tests with non-windows haradcoded stuff.
- Fix a test that was using a Bash script. Modified to be Python, which should work fine cross-system.
- Do not modify EOL in CI.
- Use python executable in tests instead of python3.
- Update pre-commit versions to include pre-commit/pre-commit-hooks#509.
- Disable autorebasing.
yajo pushed a commit to copier-org/copier that referenced this pull request Aug 4, 2020

Unverified

No user is associated with the committer email.
- Ignore errors when executing `shutil.rmtree()`, because it seems like it's common to fail when deleting git repos on Windows, and since these are temp files, we don't really care that much there's garbage left. Any good OS should clean the temp folders automatically.
- Always find Jinja templates in `PosixPath` mode.
- Ignore `OSError` when trying to enter a possibly git root directory. This is yielded by Windows when the path is a URL and we don't really care about it.
- Fix some tests with non-windows haradcoded stuff.
- Fix a test that was using a Bash script. Modified to be Python, which should work fine cross-system.
- Do not modify EOL in CI.
- Use python executable in tests instead of python3.
- Update pre-commit versions to include pre-commit/pre-commit-hooks#509.
- Disable autorebasing.
yajo pushed a commit to copier-org/copier that referenced this pull request Aug 4, 2020

Unverified

No user is associated with the committer email.
- Ignore errors when executing `shutil.rmtree()`, because it seems like it's common to fail when deleting git repos on Windows, and since these are temp files, we don't really care that much there's garbage left. Any good OS should clean the temp folders automatically.
- Always find Jinja templates in `PosixPath` mode.
- Ignore `OSError` when trying to enter a possibly git root directory. This is yielded by Windows when the path is a URL and we don't really care about it.
- Fix some tests with non-windows hardcoded stuff.
- Fix a test that was using a Bash script. Modified to be Python, which should work fine cross-system.
- Remove external python dependencies (yaml, plumbum) from test task/migration files. These are available on Linux because it gets the python env from the venv, but on Windows, it uses the main python interpreter and breaks. After all, that's not very important here.
- Do not modify EOL in CI.
- Use python executable in tests instead of python3.
- Update pre-commit versions to include pre-commit/pre-commit-hooks#509.
- Disable autorebasing.
- Disable pre-commit on CI on Windows.
- Require python 3.8 on Windows, where `tempfile` supports autoremoving directories with read-only files.
yajo pushed a commit to copier-org/copier that referenced this pull request Aug 4, 2020

Unverified

No user is associated with the committer email.
- Ignore errors when executing `shutil.rmtree()`, because it seems like it's common to fail when deleting git repos on Windows, and since these are temp files, we don't really care that much there's garbage left. Any good OS should clean the temp folders automatically.
- Always find Jinja templates in `PosixPath` mode.
- Ignore `OSError` when trying to enter a possibly git root directory. This is yielded by Windows when the path is a URL and we don't really care about it.
- Fix some tests with non-windows hardcoded stuff.
- Fix a test that was using a Bash script. Modified to be Python, which should work fine cross-system.
- Remove external python dependencies (yaml, plumbum) from test task/migration files. These are available on Linux because it gets the python env from the venv, but on Windows, it uses the main python interpreter and breaks. After all, that's not very important here.
- Do not modify EOL in CI.
- Use python executable in tests instead of python3.
- Update pre-commit versions to include pre-commit/pre-commit-hooks#509.
- Disable autorebasing.
- Disable pre-commit on CI on Windows.
- Require python 3.8 on Windows, where `tempfile` supports autoremoving directories with read-only files.
yajo pushed a commit to copier-org/copier that referenced this pull request Aug 7, 2020

Unverified

No user is associated with the committer email.
- Ignore errors when executing `shutil.rmtree()`, because it seems like it's common to fail when deleting git repos on Windows, and since these are temp files, we don't really care that much there's garbage left. Any good OS should clean the temp folders automatically.
- Always find Jinja templates in `PosixPath` mode.
- Ignore `OSError` when trying to enter a possibly git root directory. This is yielded by Windows when the path is a URL and we don't really care about it.
- Fix some tests with non-windows hardcoded stuff.
- Fix a test that was using a Bash script. Modified to be Python, which should work fine cross-system.
- Remove external python dependencies (yaml, plumbum) from test task/migration files. These are available on Linux because it gets the python env from the venv, but on Windows, it uses the main python interpreter and breaks. After all, that's not very important here.
- Do not modify EOL in CI.
- Use python executable in tests instead of python3.
- Update pre-commit versions to include pre-commit/pre-commit-hooks#509.
- Disable autorebasing.
- Disable pre-commit on CI on Windows.
- Require python 3.8 on Windows, where `tempfile` supports autoremoving directories with read-only files.
github-actions bot pushed a commit to copier-org/copier that referenced this pull request Aug 8, 2020
- Ignore errors when executing `shutil.rmtree()`, because it seems like it's common to fail when deleting git repos on Windows, and since these are temp files, we don't really care that much there's garbage left. Any good OS should clean the temp folders automatically.
- Always find Jinja templates in `PosixPath` mode.
- Ignore `OSError` when trying to enter a possibly git root directory. This is yielded by Windows when the path is a URL and we don't really care about it.
- Fix some tests with non-windows hardcoded stuff.
- Fix a test that was using a Bash script. Modified to be Python, which should work fine cross-system.
- Remove external python dependencies (yaml, plumbum) from test task/migration files. These are available on Linux because it gets the python env from the venv, but on Windows, it uses the main python interpreter and breaks. After all, that's not very important here.
- Do not modify EOL in CI.
- Use python executable in tests instead of python3.
- Update pre-commit versions to include pre-commit/pre-commit-hooks#509.
- Disable autorebasing.
- Disable pre-commit on CI on Windows.
- Require python 3.8 on Windows, where `tempfile` supports autoremoving directories with read-only files.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Filenames with unusual characters break check_executables_have_shebangs
2 participants