Handling missing lz4 terminator on graceless shutdown #110

Open
l4l opened this issue May 18, 2023 · 8 comments
Comments

l4l commented May 18, 2023

I have a long-lived process that writes to a file with compression. The process may be killed, leaving the file not properly flushed even with auto_finish (#95). Sometimes all the data has been written and only the lz4 ending (EndMark + Checksum) is missing. What's the preferred way of handling this case? Currently, reading such a block fails with an error like "failed to fill whole buffer", which comes from Read::read_exact.

Owner

PSeitz commented May 19, 2023

Which signal kills the process? Does it allow a graceful shutdown, like SIGTERM, or is it SIGKILL?

Author

l4l commented May 19, 2023

No, for the graceful one auto_finish should work fine. I'm considering the case where that's impossible: SIGKILL, SIGABRT, SIGSEGV, etc.

Owner

PSeitz commented May 19, 2023

Oh I just saw it says that in the title already.

You probably want some atomic writes, where it writes to a temp file and then renames it to the final file after the last flush.
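A minimal sketch of the temp-file-then-rename pattern described above, using only the standard library. The function name `write_atomically`, the `.tmp` suffix, and the `fill` callback are assumptions made for illustration; in the lz4 case the callback would wrap the file in an encoder and finish it before returning.

```rust
use std::ffi::OsString;
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Writes through `fill` into `<dest>.tmp`, syncs it, then renames it to `dest`.
/// Readers of `dest` see either the previous file or the finished new one.
fn write_atomically<F>(dest: &Path, fill: F) -> io::Result<()>
where
    F: FnOnce(&mut File) -> io::Result<()>,
{
    // Keep the temp file next to `dest` so the rename stays on one filesystem.
    let mut tmp_name: OsString = dest.as_os_str().to_owned();
    tmp_name.push(".tmp");
    let tmp = PathBuf::from(tmp_name);

    let mut file = File::create(&tmp)?;
    // If `fill` fails, the temp file is left behind and `dest` is untouched.
    fill(&mut file)?;
    file.flush()?;
    // Push the bytes to disk before the rename makes them visible.
    file.sync_all()?;
    drop(file);

    fs::rename(&tmp, dest)
}
```

A process killed mid-write leaves only a stale `.tmp` file behind, which a later run can delete. The trade-off is that nothing is visible under the final name until the last flush.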

Author

l4l commented May 19, 2023

Not really, I already have a mechanism for dropping "corrupted" files. But I still want to try to recover a partly written file even after a failure. In particular, I'm interested in the case where all the data has been flushed but the lz4 terminator is missing. Ideally, I'd like to read all the data and then get an error like UnexpectedEof or similar.
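For reference, "failed to fill whole buffer" is the message the standard library attaches to the `UnexpectedEof` error from `Read::read_exact`. The sketch below shows one way to keep whatever a reader produced before such an error. It assumes the decoder hands out each fully decoded block before it hits the truncated tail, which has not been verified for lz4-flex here. `read_salvaging` and the `Truncated` stand-in reader are hypothetical names.

```rust
use std::io::{self, Read};

/// Reads until EOF or the first error, returning the bytes produced so far
/// together with the error (if any) that stopped the read.
fn read_salvaging<R: Read>(mut reader: R) -> (Vec<u8>, Option<io::Error>) {
    let mut out = Vec::new();
    let mut chunk = [0u8; 8192];
    loop {
        match reader.read(&mut chunk) {
            Ok(0) => return (out, None),
            Ok(n) => out.extend_from_slice(&chunk[..n]),
            Err(ref e) if e.kind() == io::ErrorKind::Interrupted => continue,
            // Keep the data; let the caller decide what the error means.
            Err(e) => return (out, Some(e)),
        }
    }
}

/// Stand-in for a decoder over a truncated stream: yields `data`, then
/// fails with `UnexpectedEof` instead of reporting a clean end.
struct Truncated<'a> {
    data: &'a [u8],
}

impl<'a> Read for Truncated<'a> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if self.data.is_empty() {
            return Err(io::ErrorKind::UnexpectedEof.into());
        }
        let n = self.data.len().min(buf.len());
        buf[..n].copy_from_slice(&self.data[..n]);
        self.data = &self.data[n..];
        Ok(n)
    }
}
```

The caller can then treat `UnexpectedEof` as "recovered a prefix" and any other error kind as real corruption.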

Owner

PSeitz commented May 19, 2023

How would you know if it's only missing an lz4 terminator and not more?

How does it behave currently before it fails with "failed to fill whole buffer", e.g. does the underlying reader in FrameDecoder get parts of the corrupted block?

Author

l4l commented May 19, 2023

> How would you know if it's only missing an lz4 terminator and not more?

Is there no way of determining that? I thought the length of the block is stored somewhere in the metadata as well.

> How does it behave currently before it fails with "failed to fill whole buffer", e.g. does the underlying reader in FrameDecoder get parts of the corrupted block?

I didn't really check the behavior of lz4-flex. I read and decode the whole file at once. It probably fails at the last block, but I'm not really sure.

Owner

PSeitz commented May 19, 2023

> There's no way of determining that one? I thought length of the block is stored somewhere in the meta as well.

Yes, each block stores its length. The terminator marks the end of the blocks in the frame. Without it the format is invalid, and failing decompression is fine as long as it doesn't panic.
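Because each block carries its length, the block headers can be walked without decompressing anything. The sketch below follows the layout in the LZ4 frame format description (magic number, FLG/BD descriptor, length-prefixed blocks, a zero EndMark, optional content checksum). It does not verify any checksum and ignores skippable frames; `FrameScan` and `scan_frame` are hypothetical names, not part of lz4-flex.

```rust
/// Outcome of walking the block headers of a single LZ4 frame.
#[derive(Debug, PartialEq)]
enum FrameScan {
    /// EndMark (and content checksum, if flagged) are present.
    Complete { blocks: usize },
    /// The buffer ends exactly on a block boundary, or inside the trailing
    /// content checksum.
    MissingTerminator { blocks: usize },
    /// The buffer ends inside a block, or inside a 4-byte word that could be
    /// either a block header or the EndMark.
    TruncatedBlock { complete_blocks: usize },
    /// Not an LZ4 frame, or the frame header itself is cut off.
    BadHeader,
}

const MAGIC: u32 = 0x184D_2204;

fn le32(b: &[u8]) -> u32 {
    u32::from_le_bytes([b[0], b[1], b[2], b[3]])
}

fn scan_frame(buf: &[u8]) -> FrameScan {
    // Magic (4) + FLG (1) + BD (1) + header checksum (1) is the shortest header.
    if buf.len() < 7 || le32(&buf[0..4]) != MAGIC {
        return FrameScan::BadHeader;
    }
    let flg = buf[4];
    let mut pos = 6;
    if flg & 0x08 != 0 { pos += 8; } // content size field
    if flg & 0x01 != 0 { pos += 4; } // dictionary id
    pos += 1; // header checksum (not verified in this sketch)
    if buf.len() < pos {
        return FrameScan::BadHeader;
    }

    let mut blocks = 0;
    loop {
        let word = match buf.get(pos..pos + 4) {
            Some(w) => le32(w),
            None if pos == buf.len() => return FrameScan::MissingTerminator { blocks },
            None => return FrameScan::TruncatedBlock { complete_blocks: blocks },
        };
        pos += 4;
        if word == 0 {
            // EndMark; a content checksum follows if FLG bit 2 is set.
            let need = if flg & 0x04 != 0 { 4 } else { 0 };
            return if buf.len() >= pos + need {
                FrameScan::Complete { blocks }
            } else {
                FrameScan::MissingTerminator { blocks }
            };
        }
        // The top bit marks an uncompressed block; the rest is the length.
        let mut len = (word & 0x7FFF_FFFF) as usize;
        if flg & 0x10 != 0 { len += 4; } // per-block checksum
        if buf.len() < pos + len {
            return FrameScan::TruncatedBlock { complete_blocks: blocks };
        }
        pos += len;
        blocks += 1;
    }
}
```

`MissingTerminator` only says the tail is structurally clean. It cannot prove that no whole blocks were lost before the process died, which is the concern raised earlier in the thread.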

Owner

PSeitz commented Jun 2, 2023

You could write multiple frames instead of a frame with multiple blocks. Then load the complete frames and discard the last corrupted one.
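On the reading side, "load the complete frames and discard the last corrupted one" could look roughly like the sketch below. It assumes the writer finishes a frame at each checkpoint and appends the next frame to the same file, so a kill costs at most the frame in progress. The helper names are hypothetical, checksums are not verified, and skippable frames are treated as the end of usable data.

```rust
const MAGIC: u32 = 0x184D_2204;

fn le32(b: &[u8]) -> u32 {
    u32::from_le_bytes([b[0], b[1], b[2], b[3]])
}

/// Returns the offset just past the frame starting at `start`, or `None`
/// if that frame is missing, malformed, or cut off.
fn frame_end(buf: &[u8], start: usize) -> Option<usize> {
    let header = buf.get(start..start + 6)?;
    if le32(header) != MAGIC {
        return None;
    }
    let flg = header[4];
    let mut pos = start + 6;
    if flg & 0x08 != 0 { pos += 8; } // content size field
    if flg & 0x01 != 0 { pos += 4; } // dictionary id
    pos += 1; // header checksum
    loop {
        // `get` returns None as soon as the buffer runs out.
        let word = le32(buf.get(pos..pos + 4)?);
        pos += 4;
        if word == 0 {
            // EndMark, optionally followed by a content checksum.
            if flg & 0x04 != 0 { pos += 4; }
            return if pos <= buf.len() { Some(pos) } else { None };
        }
        pos += (word & 0x7FFF_FFFF) as usize;
        if flg & 0x10 != 0 { pos += 4; } // per-block checksum
    }
}

/// Length of the longest prefix of `buf` that consists of whole frames.
fn complete_frames_len(buf: &[u8]) -> usize {
    let mut end = 0;
    while let Some(next) = frame_end(buf, end) {
        end = next;
    }
    end
}
```

The slice `&buf[..complete_frames_len(buf)]` then contains only finished frames and can be decoded one frame at a time. Whether a given decoder also accepts several concatenated frames in a single pass should be checked against its documentation.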

2 participants