Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DicomFile.Open with ReadAll fails for files larger than Int.Max bytes long #1453

Closed
rprouse opened this issue Sep 23, 2022 · 11 comments · Fixed by #1626
Closed

DicomFile.Open with ReadAll fails for files larger than Int.Max bytes long #1453

rprouse opened this issue Sep 23, 2022 · 11 comments · Fixed by #1626
Labels

Comments

@rprouse
Copy link

rprouse commented Sep 23, 2022

Describe the bug
We have modalities sending large Dicom files that are over Int.Max bytes long, an example that I unfortunately can't share is 2,359,297,856 bytes long. When we attempt to open them with FileReadOption.ReadAll, it crashes with an inner exception, "ArgumentOutOfRangeException: Non-negative number required. Parameter name: count" This is happening more as modalities are producing higher resolution imaging.

This exception was originally thrown at this call stack:
    System.IO.BinaryReader.ReadBytes(int)
    FellowOakDicom.IO.FileByteSource.GetBuffer(uint)
    FellowOakDicom.IO.Reader.DicomReader.DicomReaderWorker.ParseValue(FellowOakDicom.IO.IByteSource)
    FellowOakDicom.IO.Reader.DicomReader.DicomReaderWorker.ParseDataset(FellowOakDicom.IO.IByteSource)
    FellowOakDicom.IO.Reader.DicomReader.DicomReaderWorker.DoWork(FellowOakDicom.IO.IByteSource)
    FellowOakDicom.IO.Reader.DicomFileReader.DoParse(FellowOakDicom.IO.IByteSource, FellowOakDicom.IO.Reader.IDicomReaderObserver, FellowOakDicom.IO.Reader.IDicomReaderObserver, System.Func<FellowOakDicom.ParseState, bool>, ref FellowOakDicom.DicomTransferSyntax, ref FellowOakDicom.DicomFileFormat)
    FellowOakDicom.IO.Reader.DicomFileReader.Parse(FellowOakDicom.IO.IByteSource, FellowOakDicom.IO.Reader.IDicomReaderObserver, FellowOakDicom.IO.Reader.IDicomReaderObserver, System.Func<FellowOakDicom.ParseState, bool>)
    FellowOakDicom.IO.Reader.DicomFileReader.Read(FellowOakDicom.IO.IByteSource, FellowOakDicom.IO.Reader.IDicomReaderObserver, FellowOakDicom.IO.Reader.IDicomReaderObserver, System.Func<FellowOakDicom.ParseState, bool>)
    FellowOakDicom.DicomFile.Open(string, System.Text.Encoding, System.Func<FellowOakDicom.ParseState, bool>, FellowOakDicom.FileReadOption, int)

The issue is that in FileByteSource.GetBuffer(uint count) the number of bytes to read is passed in as a uint but BinaryReader.ReadBytes(int count) takes an int. When the uint is cast to an int it goes negative. To fix, if (count > Int.Max), we should read into the MemoryByteBuffer in two passes. Longer term, as files get larger, we should probably pass a long into GetBuffer and loop appropriately. The other methods like Skip are taking a uint and passing it to Seek that takes a long, so those are fine.

To Reproduce
DicomFile.Open(Filename, FellowOakDicom.FileReadOption.ReadAll); with a file over 2.2 GB. Will also happen if any dataset is more than 2.2 GB.

Expected behavior
No exception is thrown

Screenshots or test DICOM files
I will see if I can get a Dicom file I can share without private data but will need to work with the customer.

Environment
Fellow Oak DICOM version: 5.0.3
OS: Windows 10 x64
Platform: .NET Framework 4.8

@rprouse rprouse added the new label Sep 23, 2022
@rprouse
Copy link
Author

rprouse commented Sep 23, 2022

Looking at the code, the other implementations of IByteSource will also need updating.

@gofal gofal added bug and removed new labels Oct 2, 2022
@celeron533
Copy link
Contributor

Similar to #1163 , 2 GB read/write issue

@mrbean-bremen
Copy link
Collaborator

@rprouse - looks like you know how to fix this - do you want to create a PR?

@rprouse
Copy link
Author

rprouse commented Oct 14, 2022

I know how to fix it and started a PR, but I was unable to dynamically create a Dicom file big enough to add a proper unit test and I'm not going to add a 2GB file to the repo.

I can test locally against files I have. Would that be good enough for you? Or do you know how I can use the library to create a big enough file to use in a unit test?

@mrbean-bremen
Copy link
Collaborator

I don't have a good idea right now, but maybe somebody else does. I would suggest you submit the PR without a test (given that you tested it locally we can be reasonable sure that the fix does what is should), and we can discuss this in the PR. I personally would be fine with it even if we don't find a way to write a suitable test.

@rprouse
Copy link
Author

rprouse commented Oct 17, 2022

I gave this a pass tonight but ran into the issue that the max size of a byte array in C# is int.MaxInt elements long which I'd forgotten about, so reading the data in in chunks does not help because I cannot create a MemoryByteBuffer large enough to hold the data. I'm going to need to step back and think about this one.

@amoerie
Copy link
Collaborator

amoerie commented Dec 7, 2022

@rprouse

Would it be correct to say that the issue is that a single DICOM tag value (presumably PixelData) exceeds 2GB?
In that case, rather than using a MemoryByteBuffer, the parser could return a TemporaryFileBuffer (which Windows will try to keep in memory if possible)

I could also be of assistance to have a large DICOM file. We could commit a black image with huge resolution in JPEG Lossless. The compressed file would be small because all pixels are black, but when converted to Explicit VR the raw image would be huge.

In any case, if you want some help with this, ping me.

@rprouse
Copy link
Author

rprouse commented Dec 7, 2022

It is the pixel data. Problem is that I am not sure how much time I will have to work on it. We are using DCMTK. I found this issue evaluating this library but we are not using it. And I'm falling behind maintaining my own projects like NUnit.

@amoerie
Copy link
Collaborator

amoerie commented Dec 7, 2022

It is the pixel data. Problem is that I am not sure how much time I will have to work on it. We are using DCMTK. I found this issue evaluating this library but we are not using it. And I'm falling behind maintaining my own projects like NUnit.

Don't worry about it, thanks for letting us know about this issue then.
Best of luck with NUnit!

@amoerie
Copy link
Collaborator

amoerie commented Jul 13, 2023

A little update on this. I can confirm that Fellow Oak DICOM successfully saves & opens a DICOM file that is >4GB in size.
While the file itself can be very large, a single DICOM item is not allowed to exceed uint.MaxValue in length (which is 4GB)
So if you have pixel data that is larger than 4 GB, you must split up the data in fragments, where each fragment is smaller than 4GB.

The bug in Fellow Oak DICOM was that, due to using int in FileByteSource, we only supported fragments (or just a single Pixel Data item) up to 2 GB instead of 4 GB.

I've added a unit test that generates a >4GB DICOM file on the fly, with two fragments larger than 2GB, saves it to a temporary file and loads it again. This works now.

@bcarthic
Copy link

@amoerie, @gofal Is there a way to open a DICOM file using unseekable stream? I see UnseekableStreamByteSource in the source code but wasn't sure if this was supported. Today, we have to download the entire file in memory to do any processing, especially for transcoding. If we have to transcode a frame then we have to create a dataset and then do transcode. I was looking for better option where consume too much memory. Any insights would be really helful. Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants