Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug in replace? Text is removed #44

Closed
Tragen opened this issue Jun 27, 2019 · 10 comments · Fixed by #258
Closed

Bug in replace? Text is removed #44

Tragen opened this issue Jun 27, 2019 · 10 comments · Fixed by #258
Assignees
Labels
C-enhancement Category: New feature or request

Comments

@Tragen
Copy link

Tragen commented Jun 27, 2019

I'm using the Windows version 0.65 and this is the command. It's working with many other tools but not with sd.

set pattern="(.*)Call ((List|Text)\d+_Key(Down|Up))\((.*), (vbKeyReturn|13), (.*)\)"
set replace="$1Call $2($5, GetFM20ReturnKey(), $6)"
sd.exe %pattern% %replace% "Test.txt"

Test.txt contains this Text

Call Text3_KeyDown(Index, vbKeyReturn, 0)
    Call Text3_KeyDown(Index, vbKeyReturn, 0)
        Call Text3_KeyDown(Index, vbKeyReturn, 0)
Call Text3_KeyDown(Index, vbKeyReturn, 0)

It gets replaced with

 Text3_KeyDown(Index, GetFM20ReturnKey(), vbKeyReturn)
 Text3_KeyDown(Index, GetFM20ReturnKey(), vbKeyReturn)
 Text3_KeyDown(Index, GetFM20ReturnKey(), vbKeyReturn)
 Text3_KeyDown(Index, GetFM20ReturnKey(), vbKeyReturn)

I'm Missing the first capture group and the Call word.
Can you tell me if I'm doing something wrong or if it's a bug.
As other tools are working with the same pattern and replace strings, I think it's a bug in sd.

@chmln
Copy link
Owner

chmln commented Jun 28, 2019

@Tragen since sd supports more than index-based variable names (see third example), in your case $1Call was treated as a variable.

Just replace it with ${1}Call to resolve the ambiguity.

@Tragen
Copy link
Author

Tragen commented Jun 29, 2019

Thanks for the workaround.
Still looks like a bug to me.
I don't use named capture groups and even then, variables shouldn't start
with a number, then you don't have ambiguity.

@jeff-hykin
Copy link

Ran into same problem, have the same opinion, but I am glad to hear this was at least taken into consideration and not a hidden bug.

@CosmicHorrorDev
Copy link
Collaborator

CosmicHorrorDev commented May 11, 2023

The current behavior follows the regex crate which should generally be our source of truth for these kinds of situations. It's outlined here

https://docs.rs/regex/latest/regex/struct.Regex.html#method.replace

...

The longest possible name is used. e.g., $1a looks up the capture group named 1a and not the capture group at index 1. To exert more precise control over the name, use braces, e.g., ${1}a.

...

I do think we should document this caveat more clearly though, so I'll do that tomorrow

@jeff-hykin
Copy link

jeff-hykin commented May 11, 2023

Maybe there's a thread in the regex crate to discuss it. Probably hard to change the upstream now.

As someone with a lot of regex experience across Perl, Python, Ruby, Javascript, Oniguruma, Grep, Sed, etc I've never had an issue figuring out small differences like \1 instead of $1 or \k instead of \g, but it did take me quite a while to figure out what I was supposed to do. Mostly because "sometimes" the $1 worked and other times it didn't and the pattern was not obvious.

Documentation would help a bit but honestly, given how common bash/shell is I really doubt any first time user is going to see $1Call and think that it's one token. I don't know of any language that allows variables to start with a number.

I feel like a warning would be the most useful thing. Meaning, if a replace involves a capture group that doesn't exist, and the group starts with a number, just explain the whole situation of how to refer to numbered capture groups.

@CosmicHorrorDev
Copy link
Collaborator

Maybe there's a thread in the regex crate to discuss it. Probably hard to change the upstream now.

It dates all the way back to an early regex issue

rust-lang/regex#69

It looks like constructing a named capture starting with an integer is already disallowed by the regex crate, so I'm currently leaning towards having a hard error if the replacement text has a named capture group that starts with a number. The way to resolve it will still be to disambiguate the group, but this way the user will be immediately aware of what the issue is

This will be a breaking change, but I was already planning on making other breaking changes in the next release, so that's fine. Thanks for staying adamant @jeff-hykin

@CosmicHorrorDev CosmicHorrorDev self-assigned this May 12, 2023
@jeff-hykin
Copy link

Glad to hear about it!

with old school tools like sed being basically frozen, I feel like its the job of this next generation of CLI tools to be as intuitive as possible before they also eventually become frozen.

@CosmicHorrorDev CosmicHorrorDev added the M-needs triage Meta: Maintainer label me! label May 17, 2023
@CosmicHorrorDev CosmicHorrorDev added this to the v0.8.0 Release milestone May 17, 2023
@PeterlitsZo
Copy link

I comment like this: rust-lang/regex#69 (comment):

I think using braces to invoke the non-just-number variable will be a better idea.

  • $1asd = ${1}asd;
  • $asd = Just an error;
  • ${as} = OK.

Do you think it is a better idea?

@CosmicHorrorDev CosmicHorrorDev added C-enhancement Category: New feature or request and removed M-needs triage Meta: Maintainer label me! labels Oct 22, 2023
@CosmicHorrorDev
Copy link
Collaborator

Finally got around to handling this. I opted for a hard error when the user passes a capture group name that would be invalid. Here is what the error for the original usage looks like now

image

This means that we're now just a bit more strict than the other regex-related Rust tools that people are familiar with which I think is perfectly fine

@jeff-hykin
Copy link

Perfect 👌 thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Category: New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants