error when comment character contained within CSV data #325

Closed
missinglink opened this issue Mar 1, 2022 · 1 comment


missinglink commented Mar 1, 2022

Summary

Hi 👋 thanks for the great lib!

We are using the option { "comment": "#" } to remove a header section from the CSV file which contains multiple lines beginning with '#' (as per bash syntax).

Motivation

The issue we face is that the hash (#) character may also exist as a valid character within the body of some rows. When it does, parsing fails with a fatal columns mismatch error.

For example:

# comment
# comment
col1,col2,col3
a,b,c
a,###,c

Alternative

My understanding of the documentation ("Treat all the characters after this one as a comment") is that currently both infix and prefix matching are supported, which makes sense for lines like this: a,b,c # this is a comment.

In my case at least, I was caught out by this because I assumed that the match was prefix only. I expected it to apply only to lines which begin with the comment string (as per bash).
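To make the difference concrete, here is a minimal sketch in plain JavaScript that models the two matching modes on the sample above. The helpers `stripComment` and `columnCounts` are hypothetical names invented for this illustration; they are not part of csv-parse and ignore quoting entirely.

```javascript
// Hypothetical helper, not part of csv-parse: strips a comment from one line.
// prefixOnly=false models infix matching (the marker anywhere ends the line).
// prefixOnly=true models prefix matching (only lines starting with the marker).
function stripComment(line, marker, prefixOnly) {
  if (line.startsWith(marker)) return null; // the whole line is a comment
  if (prefixOnly) return line;
  const at = line.indexOf(marker);
  return at === -1 ? line : line.slice(0, at);
}

// Returns the number of comma-separated columns in each surviving line.
function columnCounts(input, marker, prefixOnly) {
  return input
    .split("\n")
    .map((line) => stripComment(line, marker, prefixOnly))
    .filter((line) => line !== null && line !== "")
    .map((line) => line.split(",").length);
}

const sample = [
  "# comment",
  "# comment",
  "col1,col2,col3",
  "a,b,c",
  "a,###,c",
].join("\n");

console.log(columnCounts(sample, "#", false)); // infix: [ 3, 3, 2 ]
console.log(columnCounts(sample, "#", true)); // prefix only: [ 3, 3, 3 ]
```

With infix matching, the row a,###,c is cut down to "a," and ends up with two columns instead of three, which is consistent with the columns mismatch described above.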

Draft

What I'd love to have is the ability to control whether this was applied as an infix match or only as a prefix.
For example, if I were able to supply a regular expression I could use ^# to 'anchor' the string at the beginning of the row.

Additional context

We're using the stream API. I wasn't able to find the exact places in the code where this is implemented, but presumably it is handled in a streaming fashion and so may or may not have access to the newline, depending on where in the parser it is implemented.

If you'd like to point me to the places in the code which are relevant I might be able to draft a PR, although we'd need to discuss how best to change the JS API to allow users to configure whether infix matching was enabled or not.
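As a possible stopgap outside the parser, a sketch along these lines could drop the leading comment lines before the data reaches csv-parse. `LeadingCommentFilter` is a hypothetical class made up for this illustration. It assumes "\n"-terminated lines and that comment lines only need removing from the header section, and it buffers partial lines because a chunk boundary can fall in the middle of a line.

```javascript
// Hypothetical pre-filter, not part of csv-parse: drops lines that begin with
// the marker, but only while still inside the leading header section.
class LeadingCommentFilter {
  constructor(marker = "#") {
    this.marker = marker;
    this.pending = ""; // incomplete line carried over between chunks
    this.inHeader = true; // false once the first non-comment line is seen
  }

  // Accepts a chunk of text and returns the text that should reach the parser.
  push(chunk) {
    if (!this.inHeader) return chunk; // header is over, pass through untouched
    this.pending += chunk;
    let out = "";
    let newline;
    while (this.inHeader && (newline = this.pending.indexOf("\n")) !== -1) {
      const line = this.pending.slice(0, newline + 1);
      this.pending = this.pending.slice(newline + 1);
      if (!line.startsWith(this.marker)) {
        this.inHeader = false; // first real line: stop filtering from here on
        out += line;
      }
    }
    if (!this.inHeader) {
      out += this.pending;
      this.pending = "";
    }
    return out;
  }

  // Call at end of input to release a final line that had no newline.
  flush() {
    const rest = this.pending;
    this.pending = "";
    return this.inHeader && rest.startsWith(this.marker) ? "" : rest;
  }
}

// Usage: chunks deliberately split in the middle of lines.
const filter = new LeadingCommentFilter("#");
const chunks = ["# com", "ment\n# comment\ncol1,co", "l2,col3\na,###,c\n"];
const filtered = chunks.map((c) => filter.push(c)).join("") + filter.flush();
console.log(JSON.stringify(filtered)); // "col1,col2,col3\na,###,c\n"
```

The `push` and `flush` methods could be called from the `transform` and `flush` callbacks of a Node.js `stream.Transform` placed in front of the parser, with the `comment` option left unset so that the hash characters in the body are kept.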

Member

wdavidw commented Mar 2, 2022

Hi @missinglink, supporting regular expressions is impossible. A regular expression would apply to the whole record, but to know what a record is, we need to parse it, because a record separator could be escaped or present inside a quoted field. With a comment, however, attempting to parse the record will legitimately end in an error.
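One way to read this point is through a small plain-JavaScript sketch. The `splitRecords` function below is a simplified illustration written for this thread, not csv-parse's implementation. It assumes double-quote quoting with "" as the escape and "\n" as the record separator.

```javascript
// Illustration only: a simplified record splitter that tracks quote state.
function splitRecords(input) {
  const records = [];
  let current = "";
  let inQuotes = false;
  for (const ch of input) {
    // An escaped quote ("") toggles the flag twice, leaving it unchanged.
    if (ch === '"') inQuotes = !inQuotes;
    if (ch === "\n" && !inQuotes) {
      records.push(current); // a separator outside quotes ends the record
      current = "";
    } else {
      current += ch;
    }
  }
  if (current !== "") records.push(current);
  return records;
}

// The second record holds a quoted field that spans two physical lines.
const input = 'id,note\n1,"first line\n# not a comment"\n2,plain\n';

console.log(input.split("\n").filter(Boolean).length); // naive lines: 4
console.log(splitRecords(input).length); // quote-aware records: 3
```

A pattern such as ^# applied line by line would remove the physical line `# not a comment"` even though it is field data, so the start of a record is only known once the preceding text has been parsed.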

Not a big fan of introducing a new option, but I don't have another option to propose.

salceson added a commit to evidenceprime/node-csv that referenced this issue Sep 22, 2023
* chore: latest dependencies

* fix: uncaught errors with large stream chunks (fix adaltas#386)

* chore(release): publish

 - csv-demo-browser@0.1.6
 - csv-demo-cjs@0.2.4
 - csv-demo-eslint@0.1.10
 - csv-demo-esm@0.0.18
 - csv-issues-cjs@0.1.5
 - csv-issues-esm@0.0.9
 - csv-demo-ts-moduleresolution-node16-cjs@0.2.4
 - csv-demo-ts-module-node16@0.2.4
 - csv-demo-webpack-ts@0.1.6
 - csv-demo-webpack@0.1.8
 - csv-generate@4.2.3
 - csv-parse@5.3.7
 - csv-stringify@6.3.1
 - csv@6.2.9
 - stream-transform@3.2.3

* test(csv-stringify): fix legacy

* chore(release): publish

 - csv-demo-browser@0.1.7
 - csv-demo-cjs@0.2.5
 - csv-demo-eslint@0.1.11
 - csv-demo-esm@0.0.19
 - csv-issues-cjs@0.1.6
 - csv-issues-esm@0.0.10
 - csv-demo-ts-moduleresolution-node16-cjs@0.2.5
 - csv-demo-ts-module-node16@0.2.5
 - csv-demo-webpack-ts@0.1.7
 - csv-demo-webpack@0.1.9
 - csv-generate@4.2.4
 - csv-parse@5.3.8
 - csv-stringify@6.3.2
 - csv@6.2.10
 - stream-transform@3.2.4

* build: remove trailing slash in home url

* chore: latest dependencies

* fix(csv): fixed CJS types under modern `modernResolution` options (adaltas#388)

* fix(csv): remove ts files in cjs dist

* chore(release): publish

 - csv-demo-browser@0.1.8
 - csv-demo-cjs@0.2.6
 - csv-demo-eslint@0.1.12
 - csv-demo-esm@0.0.20
 - csv-issues-cjs@0.1.7
 - csv-issues-esm@0.0.11
 - csv-demo-ts-moduleresolution-node16-cjs@0.2.6
 - csv-demo-ts-module-node16@0.2.6
 - csv-demo-webpack-ts@0.1.8
 - csv-demo-webpack@0.1.10
 - csv-generate@4.2.5
 - csv-parse@5.3.9
 - csv-stringify@6.3.3
 - csv@6.2.11
 - stream-transform@3.2.5

* docs: minor upercase modification

* chore: latest dependencies

* chore(release): publish

 - csv-demo-browser@0.1.9
 - csv-demo-cjs@0.2.7
 - csv-demo-eslint@0.1.13
 - csv-demo-esm@0.0.21
 - csv-issues-cjs@0.1.8
 - csv-issues-esm@0.0.12
 - csv-demo-ts-moduleresolution-node16-cjs@0.2.7
 - csv-demo-ts-module-node16@0.2.7
 - csv-demo-webpack-ts@0.1.9
 - csv-demo-webpack@0.1.11
 - csv-generate@4.2.6
 - csv-parse@5.3.10
 - csv-stringify@6.3.4
 - csv@6.2.12
 - stream-transform@3.2.6

* feat: add unicode chars to formula escape (adaltas#387)

* fix(csv-stringify): use switch in formula escaping

* fix(csv-stringify): add unicode character equivalents in formula sanitization

* chore: update tests

* docs(csv-stringify): escape formulas references

* chore(release): publish

 - csv-demo-browser@0.1.10
 - csv-demo-cjs@0.2.8
 - csv-demo-eslint@0.1.14
 - csv-demo-esm@0.0.22
 - csv-issues-cjs@0.1.9
 - csv-issues-esm@0.0.13
 - csv-demo-ts-moduleresolution-node16-cjs@0.2.8
 - csv-demo-ts-module-node16@0.2.8
 - csv-demo-webpack-ts@0.1.10
 - csv-demo-webpack@0.1.12
 - csv-stringify@6.4.0
 - csv@6.3.0

* feat(csv-parse): add `columns` property in `Info` object type (adaltas#390)

* fix(ts): Add `columns` property in `Info` object type

* Add disabled options to columns type

* build(csv-parse): build and write test after info ts definition

* chore(release): publish

 - csv-demo-browser@0.1.11
 - csv-demo-cjs@0.2.9
 - csv-demo-esm@0.0.23
 - csv-issues-cjs@0.1.10
 - csv-issues-esm@0.0.14
 - csv-demo-ts-moduleresolution-node16-cjs@0.2.9
 - csv-demo-ts-module-node16@0.2.9
 - csv-demo-webpack-ts@0.1.11
 - csv-demo-webpack@0.1.13
 - csv-parse@5.4.0
 - csv@6.3.1

* docs: update build badge urls

* docs(csv-generate): comment indentation in samples

* refactor(csv-issues-cjs): code format

* refactor(csv-issues-cjs): remove unused arguments

* test(csv-issues-cjs): fix stdout maxBuffer length exceeded

* test(csv-issues-esm): use spawn instead of exec

* fix: commonjs types, run tsc and lint to validate changes (adaltas#397)

* fix: types weren't working for commonjs. Run tsc and lint to validate changes

* chore: needs to work on linux and BSD

* chore: latest dependencies

* chore(release): publish

 - csv-demo-browser@0.1.12
 - csv-demo-cjs@0.2.10
 - csv-demo-eslint@0.1.15
 - csv-demo-esm@0.0.24
 - csv-issues-cjs@0.1.11
 - csv-issues-esm@0.0.15
 - csv-demo-ts-moduleresolution-node16-cjs@0.2.10
 - csv-demo-ts-module-node16@0.2.10
 - csv-demo-webpack-ts@0.1.12
 - csv-demo-webpack@0.1.14
 - csv-generate@4.2.7
 - csv-parse@5.4.1
 - csv-stringify@6.4.1
 - csv@6.3.2
 - stream-transform@3.2.7

* feat(csv-issues-cjs): 399 issue

* fix(csv-demo-ts-cjs-node16): upgrade module definition after latest typescript

* feat(csv-parse): new comment_no_infix option (fix adaltas#325)

* test(csv-issues-esm): reproduce issue adaltas#391

* refactor(csv-stringify): rename variable in sample

* test(csv-issues-cjs): reproduce issue 327

* chore(release): publish

 - csv-demo-browser@0.1.13
 - csv-demo-cjs@0.2.11
 - csv-demo-eslint@0.1.16
 - csv-demo-esm@0.0.25
 - csv-issues-cjs@0.2.0
 - csv-issues-esm@0.0.16
 - csv-demo-ts-cjs-node16@0.2.11
 - csv-demo-ts-module-node16@0.2.11
 - csv-demo-webpack-ts@0.1.13
 - csv-demo-webpack@0.1.15
 - csv-generate@4.2.8
 - csv-parse@5.5.0
 - csv-stringify@6.4.2
 - csv@6.3.3
 - stream-transform@3.2.8

* docs(csv-parse): comment_no_infix sample

---------

Co-authored-by: David Worms <david@adaltas.com>
Co-authored-by: Petter <petter@petterhaggholm.net>
Co-authored-by: Mateusz Burzyński <mateuszburzynski@gmail.com>
Co-authored-by: Tom Emelko <tom.emelko@gmail.com>
Co-authored-by: Elia Maino <eliamaino@gmail.com>
Co-authored-by: David Tanner <darthtanner@gmail.com>