Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CRLF vs LF issue for input str #59

Closed
AV3RG opened this issue Dec 16, 2023 · 15 comments
Closed

CRLF vs LF issue for input str #59

AV3RG opened this issue Dec 16, 2023 · 15 comments

Comments

@AV3RG
Copy link

AV3RG commented Dec 16, 2023

If your IDE defaults to using CRLF instead of LF, which generally happens in Windows, the input str has an extra character at the end on all lines except the last one.

image

Like in the image, line 1 shows a length of 22 while it's 21, and line 2 shows the correct length as it is the last. It would be nice to see this template handle that issue as it causes regex expressions using \n to not work properly.

@fspoettel
Copy link
Owner

can you please post the regex you are using?

@AV3RG
Copy link
Author

AV3RG commented Dec 16, 2023

Yea sure

If we take this regex expression r"Time:\s+(?P<time>.+)\nDistance:\s+(?P<distance>.+)" which I used for day 6 to match required info from an input like this

Distance: 9 40 200
Time: 7 15 30

It won't work on CRLF mode because I am using \n in my regex. I have to either manually set the input example file to LF or change the regex to something like this r"Time:\s+(?P<time>.+)\s\nDistance:\s+(?P<distance>.+)" for it to work. An extra \s is needed in the regex for it to work

@AV3RG
Copy link
Author

AV3RG commented Dec 16, 2023

I also noticed something else which I feel like should be taken care of, all the inputs have an extra empty line at the end which is really not needed when passing it in the code, should we also trim the input str before passing it down to the solution functions

@fspoettel
Copy link
Owner

fspoettel commented Dec 16, 2023

thank you for the details. are these issues also present when you use aoc-cli directly to download the input?

sorry for the questions, I don't have access to a windows machine atm.

@AV3RG
Copy link
Author

AV3RG commented Dec 16, 2023

The issue with crlf is present in the examples folder only not the input folder files. Actually now that I think of it, its possible that its the git setting that makes that happen. But still the str should not contain the extra character in that case as well.

The extra line issue occurs in the input folder generated by aoc cli

@AV3RG
Copy link
Author

AV3RG commented Dec 17, 2023

Do you think it would be an ideal solution to just replace all \r\n with \n in the input files before running any tests or solutions? Could also trim it at the same time. This will fix both issues

Happy to make a pr for the same

@fspoettel
Copy link
Owner

fspoettel commented Dec 17, 2023

The extra line issue

The input files downloaded from the AOC website have a final trailing newline as well. I think we should not alter the input files and leave the line.

do u think it would be an ideal solution to just replace all \r\n with \n

tbh I'm not sure what / if we can to do something about the CLRF line endings. \r\n is the default windows line ending after all. I will leave this issue open until I have access to my windows machine again and can check further.

@AV3RG
Copy link
Author

AV3RG commented Dec 17, 2023

Before the input is passed onto the part_one and part_two functions we can just run a trim() and replace("\r\n", "\n"). This way none of the files need to be changed and the input parameter will neither contain the ending extra line nor have any regex issue

@fspoettel
Copy link
Owner

  • The final newline is present in puzzle input downloaded from the website (right+click -> download). I'm hesitant to alter it for this reason. We should save the input in the same state that it is provided on the website.
  • trim() is not safe in this case. If there is relevant whitespace at the end of the puzzle input, it gets remove as well.

I would appreciate if we can focus this issue on the CLRF / LF topic.

@AV3RG
Copy link
Author

AV3RG commented Dec 17, 2023

makes sense, so still for the CRLF / LF, I don't see any other option working for us immediately other than to replace\r\n with \n

@tguichaoua
Copy link
Contributor

Checkout str::lines()

@AV3RG
Copy link
Author

AV3RG commented Dec 17, 2023

The problem is the regex being used is for multiple lines/the entire input. Hence it also has \n in the regex being used itself. Using str::lines() will do no good in this case

@tguichaoua
Copy link
Contributor

You can also setup you IDE and git to use LF instead of CRLF

@fspoettel
Copy link
Owner

some further context on the regex crate & CRLF:

rust-lang/regex#244
https://docs.rs/regex/latest/regex/struct.RegexBuilder.html#method.crlf

@AV3RG
Copy link
Author

AV3RG commented Dec 18, 2023

Oh, I didn't know that existed, I just started learning rust with aoc this year. Thanks a lot. Sorry for the trouble

@AV3RG AV3RG closed this as completed Dec 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants