Fix mkdir race condition in LooseObjectDB.store #91
+1
−5
Fixes #85
This replaces the conditional call to `os.mkdir`, which raises an unintended `FileExistsError` if the directory is created between the check and the `os.mkdir` call, with a single `os.makedirs` call that passes `exist_ok=True`.

This way, we attempt creation in a way that produces no error if the directory is already present, while still raising `FileExistsError` if a non-directory filesystem entry (such as a regular file) is present where we want the directory to be. This is the advantage of this approach over the approach of swallowing `FileExistsError` as suggested in #85.

Note, however, that `os.makedirs` behaves like `mkdir -p`: it attempts to create parent directories (and their parents, etc.) if they do not already exist. So it should only be used if that is acceptable in this case. I am not aware of a reason it wouldn't be, but I am not very familiar with gitdb.

So that aspect of the situation deserves special consideration in reviewing this PR. I'd be pleased to change the approach if `os.makedirs` is judged not suitable here. I think the approach suggested in #85 is reasonable, and it can be made more robust by checking that the directory exists after the creation attempt (or in other ways).

The code was under test: that line is exercised in `TestExamples.test_base`, `TestGitDB.test_writing`, `TestLooseDB.test_basics`, and `TestObjDBPerformance.test_large_data_streaming`. However, no test catches the race condition this fixes, and I have not added one.

Testing that the race condition does not occur in the same specific way as before, by accessing and calling the same functions in the same order, would be easy, but it would be more of an illusion of a regression test than a useful test. Testing by trying to brute-force a race condition, without modifying the operation of the code for the test, would work, but the tests would take a very long time to run. Testing it in a way that is fairly robust against new ways of reintroducing the race condition and that is not too slow should be possible, but I don't know of a good way to do it; everything I've thought of would be complicated, and would possibly make running the test in a debugger like `pdb` infeasible. So I have not added a regression test for this bug. However, if it is considered important to have one, then I can consider the matter further.
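
For reviewers weighing the two approaches discussed above, here is a minimal standalone sketch of how they behave. It is not gitdb's code: the helper names `ensure_dir_makedirs`, `ensure_dir_mkdir_checked`, and `demo` are hypothetical, and the second helper is only my reading of how the #85 suggestion could be made more robust by checking for the directory after the creation attempt.

```python
import os
import tempfile


def ensure_dir_makedirs(path):
    """Approach in this PR: one call, no check-then-create window."""
    # No error if `path` is already a directory; FileExistsError if a
    # non-directory entry is in the way. Also creates missing parents.
    os.makedirs(path, exist_ok=True)


def ensure_dir_mkdir_checked(path):
    """Hypothetical variant of the #85 idea: swallow the error, then verify."""
    try:
        os.mkdir(path)
    except FileExistsError:
        # Only tolerate the error if what exists really is a directory.
        if not os.path.isdir(path):
            raise


def demo():
    """Exercise both helpers in a scratch directory and report what happened."""
    results = {}
    helpers = (
        ("makedirs", ensure_dir_makedirs),
        ("mkdir_checked", ensure_dir_mkdir_checked),
    )
    with tempfile.TemporaryDirectory() as root:
        # Case 1: the directory already exists (as if another writer won the race).
        existing = os.path.join(root, "ab")
        os.mkdir(existing)
        for _, helper in helpers:
            helper(existing)  # must not raise
        results["existing_dir_ok"] = os.path.isdir(existing)

        # Case 2: a regular file occupies the path; both helpers should raise.
        blocker = os.path.join(root, "cd")
        with open(blocker, "w"):
            pass
        for name, helper in helpers:
            try:
                helper(blocker)
            except FileExistsError:
                results[name + "_rejects_file"] = True
            else:
                results[name + "_rejects_file"] = False

        # Case 3: the parent is missing; only os.makedirs creates it.
        nested = os.path.join(root, "x", "y")
        ensure_dir_makedirs(nested)
        results["makedirs_creates_parents"] = os.path.isdir(nested)
        try:
            ensure_dir_mkdir_checked(os.path.join(root, "p", "q"))
        except FileNotFoundError:
            results["mkdir_requires_parent"] = True
        else:
            results["mkdir_requires_parent"] = False
    return results


if __name__ == "__main__":
    for key, value in demo().items():
        print(key, value)
```

The practical difference is case 3: the `os.mkdir` variant fails when the parent directory is missing, while `os.makedirs` silently creates it. Whether that is acceptable for gitdb is the open question raised above.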