Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

write_byte_record and write_field does not mix well and this is not properly documented. #335

Open
vi opened this issue Sep 17, 2023 · 1 comment

Comments

@vi
Copy link

vi commented Sep 17, 2023

What version of the csv crate are you using?

1.2.2

Briefly describe the question, bug or feature request.

When I use write_byte_record with a non-empty iterator after prior series of write_fields, the library produces strange output and may fail with UnequalLengths.

Include a complete program demonstrating a problem.

fn main() -> Result<(), Box<dyn std::error::Error>>{
    let mut b = csv::WriterBuilder::new();
    b.has_headers(false);
    let mut csv = b.from_path("out.csv")?;
    let mut record = csv::ByteRecord::new();
    record.push_field(b"12");
    for _ in 0..10000 {
        csv.write_field("F")?;
        csv.write_byte_record(&record)?;
    }
    Ok(())
}

I used similar code (with a csv.write_field("")? workaround to insert the missing comma), thinking that write_record is only for string records, not byte ones.

What is the observed behavior of the code above?

Output file contains records glued together (except of the last line before UnequalLengths error).

The documentation does not suggest, but also does not explicitly prohibit this combination of csv::Writer methods.

What is the expected or desired behavior of the code above?

write_byte_record's documentation explicitly mentions that it write_field should not be used to prepend fields. Or the program behaves just like as if it were write_record instead.

Maybe usage of write_byte_record after write_field should panic, at least in debug profile.


Here are steps of how similar code can end up in a project:

  1. Example code from write_field's documentation with wtr.write_record(None::<&[u8]>)?; (why there is no dedicated method to avoid those turbofishes?);
  2. None::<&[u8]> gets replaced with a record from other CSV file (e.g. to round trip with prepended fields).
  3. Let's preserve non-UTF-8 content, so ByteRecord instead of StringRecord. As we have switched to byte records, assume (erroneously) that we need to switch to write_byte_record from write_record. It also mentions "more quickly" in the docs, which makes the switch even more attractive.
@BurntSushi
Copy link
Owner

Thanks for the well written issue!

Or the program behaves just like as if it were write_record instead.

This is probably what should happen. I don't have the context paged into cache to know how tricky (if at all) this is. Hopefully not tricky at all.

Example code from write_field's documentation with wtr.write_record(None::<&[u8]>)?; (why there is no dedicated method to avoid those turbofishes?);

Because I perceive this to be a niche/weird case and I'm not sure it warrants another method in the public API.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants