No way to use filetype rules when piping with stdin #914

Open
orhun opened this issue Jan 20, 2024 · 12 comments

orhun commented Jan 20, 2024

Repro case:

# typos.toml
[type.md]
extend-ignore-re = ["transfered"]

# file.md
transfered

$ typos file.md # works

$ typos --config typos.toml file.md # works

$ echo "transfered" | typos - # does not work

`transfered` should be `transferred`

$ echo "transfered" | typos --config typos.toml - # does not work

`transfered` should be `transferred`

I looked around in the codebase but couldn't figure out what causes stdin to be handled differently.

dnaka91 commented Jan 22, 2024

Just stumbled upon the same issue when adjusting my git-cliff config to integrate typos as a post-processing step (thanks @orhun for git-cliff).

What I found with the --file-types flag is that the type of a file is probably derived from its extension. When reading from stdin, the file is always reported as -.

That means typos doesn't detect the file type from the content, and it probably needs an extra flag to set the file type when the content is passed in through stdin.
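
To illustrate the point above, here is a minimal sketch of extension-based detection. It is not the detection code in typos; `detect_type` and its small extension table are hypothetical and only show why a path of `-` yields no file type.

```rust
use std::path::Path;

/// Hypothetical stand-in for extension-based file type detection.
/// The table below is an illustration, not typos's real type list.
fn detect_type(path: &Path) -> Option<&'static str> {
    // `extension()` returns None when the file name has no embedded dot,
    // which is the case for the stdin placeholder `-`.
    match path.extension()?.to_str()? {
        "md" | "markdown" => Some("md"),
        "rs" => Some("rust"),
        "toml" => Some("toml"),
        _ => None,
    }
}
```

With this sketch, `detect_type(Path::new("file.md"))` finds a type while `detect_type(Path::new("-"))` does not, so any `[type.md]` rules would never be selected for stdin.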

Author

orhun commented Jan 22, 2024

Thanks for looking into this @dnaka91!

Indeed, file type isn't set when the content is given via stdin. I applied the following hacky patch to set it to md to see if it is going to make any difference:

diff --git a/crates/typos-cli/src/policy.rs b/crates/typos-cli/src/policy.rs
index e8fb2e9..e7b7d32 100644
--- a/crates/typos-cli/src/policy.rs
+++ b/crates/typos-cli/src/policy.rs
@@ -330,7 +330,7 @@ impl DirConfig {
                 );
                 self.default
             });
-        (name, config)
+        (Some("md"), config)
     }
 }

Unfortunately it didn't change anything. Maybe it should be set in a different way or we're not exactly on the right track.

Collaborator

epage commented Jan 22, 2024

@orhun that only affected the returned name; it is still using the default config.

btw if you enable logging, the above log line shows up saying that it chose the default policy because an unknown file type was selected.
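
A simplified sketch of why the patch had no effect, assuming a lookup that returns a `(name, config)` pair as in the diff. The types and method names here (`FileConfig`, `get_file_config`, `patched`) are hypothetical and much smaller than the real code in policy.rs.

```rust
use std::collections::HashMap;

/// Hypothetical per-type settings; the real config has many more fields.
#[derive(Debug, Clone, PartialEq)]
struct FileConfig {
    ignore_re: Vec<String>,
}

/// Hypothetical, reduced version of a directory-level config.
struct DirConfig {
    default: FileConfig,
    types: HashMap<&'static str, FileConfig>,
}

impl DirConfig {
    /// The config is chosen from the detected type; an unknown type falls back to the default.
    fn get_file_config(&self, detected: Option<&'static str>) -> (Option<&'static str>, &FileConfig) {
        let config = detected
            .and_then(|name| self.types.get(name))
            .unwrap_or(&self.default);
        (detected, config)
    }

    /// Mirrors the patch: only the returned name is replaced.
    /// `config` was already selected, so it is still the default for stdin.
    fn patched(&self, detected: Option<&'static str>) -> (Option<&'static str>, &FileConfig) {
        let (_name, config) = self.get_file_config(detected);
        (Some("md"), config)
    }
}
```

In this sketch the name has to be known before the lookup runs for the `md` rules to apply; relabeling the result afterwards changes nothing.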

Could you describe the use case for what you are doing? For example, an extra flag for setting the file type means you need to do the file type detection. Would you instead want to tell us what the effective file name is so we still do our own detection? If that is the case, why can't we be given a file (I'm assuming it's in memory?)?

Author

orhun commented Jan 22, 2024

The use case is fixing the typos in the changelog generated by git-cliff, as shown below:

[changelog]
postprocessors = [
  # Check typos with https://github.com/crate-ci/typos
  # If the spelling is incorrect, it will be automatically fixed.
  { pattern = '.*', replace_command = 'typos --write-changes -' },
]

This is handled by piping the output to the given command's stdin, the implementation is here: https://github.com/orhun/git-cliff/blob/7ae77ff0e0a22b5f5e42737204cbf0ab8680f9d7/git-cliff-core/src/command.rs#L48

As you might guess, the problem is that --config has no effect when used like this:

{ pattern = '.*', replace_command = 'typos --config typos.toml --write-changes -' }
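
For readers unfamiliar with this kind of post-processing, here is a rough sketch of piping text into a command's stdin and collecting its stdout. It is not git-cliff's actual implementation (see the link above for that); `run_with_stdin` is a hypothetical helper built only on the standard library.

```rust
use std::io::Write;
use std::process::{Command, Stdio};

/// Hypothetical helper: feed `input` to a child process and return what it prints.
fn run_with_stdin(program: &str, args: &[&str], input: &str) -> std::io::Result<String> {
    let mut child = Command::new(program)
        .args(args)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()?;

    // Write the input; the temporary handle is dropped at the end of the
    // statement, which closes the pipe so the child sees EOF.
    // This is fine for small inputs; large inputs should be written from a
    // separate thread so the child cannot block on a full stdout pipe.
    child
        .stdin
        .take()
        .expect("stdin was configured as piped")
        .write_all(input.as_bytes())?;

    let output = child.wait_with_output()?;
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}
```

The child only ever receives bytes, never a path, which is why a tool on the receiving end has no file name to inspect.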

Author

orhun commented Feb 2, 2024

Is there any other info that I can provide to debug this further? 👀

Collaborator

epage commented Feb 2, 2024

The question isn't too clear to me. What is being debugged by whom?

Author

orhun commented Feb 2, 2024

Sorry for not being clear. I'm trying to figure out what causes stdin to be handled differently in this case (for potentially fixing the issue) so I need some guidance. Let me know if I can provide any other information about my use-case for figuring out the next steps for making it possible.

Collaborator

epage commented Feb 2, 2024

The difference is that there is no filename to do file type detection from.

Our options for solutions are

  • File type detection which would be iffy at best
  • A flag to receive a specific file type from --type-list
  • A flag to receive the file name. This could also be used in any messages.

Based on your workflow, it seems like either style of flag would work (--type md vs --stdin-file-name CHANGELOG.md).
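
The difference between the two flag styles can be sketched as follows. Everything here is hypothetical: `StdinHint`, `type_from_path`, and `effective_type` are illustration names, neither flag exists at this point in the thread, and the extension table is not typos's real detection.

```rust
use std::path::{Path, PathBuf};

/// The two proposed flag styles, modeled as one hypothetical enum.
enum StdinHint {
    /// `--type md`: the caller names the file type directly.
    Type(String),
    /// `--stdin-file-name CHANGELOG.md`: the caller names the file and
    /// the tool runs its usual detection on that name.
    FileName(PathBuf),
}

/// Illustrative extension lookup, assumed for this sketch only.
fn type_from_path(path: &Path) -> Option<String> {
    match path.extension()?.to_str()? {
        "md" => Some("md".to_string()),
        "rs" => Some("rust".to_string()),
        _ => None,
    }
}

/// Resolve the file type for stdin; None means "fall back to the default config".
fn effective_type(hint: Option<&StdinHint>) -> Option<String> {
    match hint? {
        StdinHint::Type(name) => Some(name.clone()),
        StdinHint::FileName(path) => type_from_path(path),
    }
}
```

With a type flag the caller does the detection; with a file name flag the tool keeps doing it. For the changelog workflow both resolve to `md`.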

I would be curious whether there is any prior art around this: if other tools need to know the file type for stdin, how do they handle it?

Author

orhun commented Feb 6, 2024

I'm not aware of any other tools. I think a --type md flag would work perfectly in this case.

epage changed the title from "Unable to ignore words when stdin is used" to "No way to use filetype rules when piping with stdin" on Feb 11, 2024

Collaborator

epage commented Feb 11, 2024

Someone just pointed out that bat has --file-name <name> ("Specify the name to display for a file."). That would also avoid callers having to figure out the file type (in your case you know it, but it could be more difficult in others).

I am open to merging a PR for a --file-name parameter that (1) is used for file type detection and (2) is shown on the screen.

Author

orhun commented Feb 17, 2024

I'm interested in making a PR, but I'm not sure where to start. I added the command-line argument. As the next step, should I override the file name if the argument is present? Could you provide some guidance on how to do that?

orhun added a commit to orhun/git-cliff that referenced this issue Feb 17, 2024

Collaborator

epage commented Feb 19, 2024

For changing the selected policy, walk_entry needs to know this for determining the file name to use for engine.policy(lookup_path). Not seeing an obvious general solution to this so we'll probably just need to add another parameter to the call chain to include an override_name: Option<&Path>. We should only set this if the flag is used and only allow the flag to be used when reading from stdin.
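
A minimal sketch of the override described above, assuming the stdin placeholder is the path `-`. The helper names (`is_stdin`, `check_flag_usage`, `lookup_path`) are hypothetical and the real `walk_entry` call chain is not reproduced here; only the `override_name: Option<&Path>` parameter comes from the comment.

```rust
use std::path::Path;

/// Assumption for this sketch: stdin is represented by the path `-`.
fn is_stdin(path: &Path) -> bool {
    path == Path::new("-")
}

/// Reject the flag unless the only input is stdin.
fn check_flag_usage(paths: &[&Path], override_name: Option<&Path>) -> Result<(), String> {
    if override_name.is_some() && !(paths.len() == 1 && is_stdin(paths[0])) {
        return Err("the file name override is only valid when reading from stdin".to_string());
    }
    Ok(())
}

/// Choose the path used for the policy lookup.
/// The override applies only to stdin; real files keep their own path.
fn lookup_path<'a>(path: &'a Path, override_name: Option<&'a Path>) -> &'a Path {
    match override_name {
        Some(name) if is_stdin(path) => name,
        _ => path,
    }
}
```

The result of `lookup_path` would then be what gets passed to `engine.policy(lookup_path)`, so file type detection runs on the supplied name instead of `-`.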

For showing the name in the messages, we could

  • Rename report::MessageStatus to be more general
  • Tell it to use the file name for - (except maybe for json output?)
  • Overwrite the reported file name with this one.
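
One way the second option could look, as a sketch only. `reported_name` is a hypothetical helper, and keeping `-` for JSON output is just the "maybe" raised in the list above, not a decision.

```rust
use std::path::Path;

/// Hypothetical helper: pick the name shown in a report line.
/// `stdin_name` is the value of the proposed file name flag, if given.
fn reported_name(path: &Path, stdin_name: Option<&Path>, json: bool) -> String {
    let is_stdin = path == Path::new("-");
    match stdin_name {
        // Human-readable output shows the supplied name in place of `-`.
        Some(name) if is_stdin && !json => name.display().to_string(),
        // JSON output and regular files keep the original path.
        _ => path.display().to_string(),
    }
}
```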
