Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SymbolicReference.set_reference may not roll back properly on error #1669

Closed
EliahKagan opened this issue Sep 21, 2023 · 7 comments · Fixed by #1675
Closed

SymbolicReference.set_reference may not roll back properly on error #1669

EliahKagan opened this issue Sep 21, 2023 · 7 comments · Fixed by #1675

Comments

@EliahKagan
Copy link
Contributor

EliahKagan commented Sep 21, 2023

SymbolicReference.set_reference contains this code, which is meant to attempt a transactional write:

ok = True
try:
fd.write(write_value.encode("utf-8") + b"\n")
lfd.commit()
ok = True
finally:
if not ok:
lfd.rollback()

However, because the ok flag is initialized to True instead of False--which strongly appears unintentional--it is always True even if the ok = True statement in the try block is not reached. Consequently, lfd.rollback() in the finally block is never called. This seems on first glance like a serious bug, but as detailed below I believe it might actually be quite minor in the context of what kind of cleanup GitPython generally does or does not guarantee.

This can be contrasted to another place where that pattern is used correctly:

GitPython/git/index/base.py

Lines 227 to 233 in a5a6464

ok = False
try:
self._serialize(stream, ignore_extension_data)
ok = True
finally:
if not ok:
lfd.rollback()

This is very simple to fix, since it is sufficient to change True to False in the initial assignment. So most of the work to fix it would be figuring out where a regression test should go and how it should manage to produce a case where cleanup doesn't happen immediately.

(Besides that simple fix, other fixes that also reduce the risk of such bugs coming back and allow the code to be simplified are possible. In particular, the LockedFD object represents a non-memory resource that always can and should be managed deterministically according to where control flow is, so it ought to implement the context manager protocol.)

I am unsure if this bug causes any problems in practice. If I understand LockedFD correctly, the write that is not rolled back is to a temporary file separate from the real target, accessed through a file object specific to that LockedFD object. The lock file name is determined from the target file name, so if a separate attempt to write to the target is made, that could cause problems. LockedFD has a __del__ method that should be called eventually and performs the rollback. On CPython, which has reference counting as its primary garbage collection strategy, cleanup would often happen immediately as a result... though not always, even in the absence of reference cycles, because, in exception handling, an exception object holds a reference to a traceback object which holds references to all stack frames in its call stack, through which local variables thus remain accessible until control leaves the handling except block.

Still, it may be that this shall not be considered a bug--or at least not a bug of its own, or at least not a bug that merits a regression test accompanying its bugfix--on the grounds that it might be considered a special case of the Leakage of System Resources limitation documented prominently in the readme. In support of this view, the third place in the code of GitPython where ldf.rollback is called uses a different pattern and its comments suggest that cleanup via LockedFD.__del__ is considered acceptable to rely on:

GitPython/git/refs/log.py

Lines 261 to 268 in a5a6464

fp = lfd.open(write=True, stream=True)
try:
self._serialize(fp)
lfd.commit()
except Exception:
# on failure it rolls back automatically, but we make it clear
lfd.rollback()
raise

(Interestingly, that is not exactly equivalent to the behavior of the try-finally pattern used in git/index/base.py. If it is intended to be, then it should catch BaseException rather than Exception.)

@Byron
Copy link
Member

Byron commented Sep 21, 2023

This is a great catch, thanks for documenting your findings!

Even though having more tests for resource-leakage (or the lack thereof due to correct implementation) would be valuable, I dare to ask if it's possible to fix the usage of LockedFD instead, and have good test-coverage for its API. LockedFD begs to be a context manager.

Thinking further, all implementations of __del__ would need to be banned in favour of modern and actually working mechanisms, like ContextManagers. If done, the resource leakage issues of GitPython would probably be resolved for most of the current cases (but definitely not all of them).

EliahKagan added a commit to EliahKagan/GitPython that referenced this issue Sep 21, 2023
This fixes the initialization of the "ok" flag by setting it to
False before the operation that might fail is attempted.

(This is a fix for the *specific* problem reported in gitpython-developers#1669 about
how the pattern SymbolicReference.set_reference attempts to use
with LockedFD is not correctly implemented.)
@EliahKagan
Copy link
Contributor Author

I'm pleased to hear that turning classes with __del__ into context managers and using them as such could mostly solve that long-standing limitation. I didn't realize this--I had expected that this would be infeasible due to the way applications and other libraries themselves use GitPython.

I've opened a PR, #1675, that does the much more narrow change of (1) fixing the specific bug described here that arises from the incorrectly used flag variable, and also (2) doing some refactoring to make it and those like it less likely and more generally to make finally logic clearer. This PR is sort of underwhelming, but it's the part of the work I had already done while working on #1673 that I decided was not really part of the topic of that PR. Based on the more ambitious suggestions you've given here, I had considered describing the PR here and asking if I should still open it, but it's easier to review a PR that exists than one that doesn't, so I figured you could approve, reject, or request changes on it rather than guessing.

(I'm not sure how best to proceed in this issue, but my inclination would be to treat this issue as specifically about SymbolicReference.set_reference and open a new issue about the broader problem of not having and using context managers in most places where they should be used.)

@Byron
Copy link
Member

Byron commented Sep 21, 2023

I didn't realize this--I had expected that this would be infeasible due to the way applications and other libraries themselves use GitPython.

There definitely is more subtle cases of leakage, but this one is particularly low-hanging I think.

(I'm not sure how best to proceed in this issue, but my inclination would be to treat this issue as specifically about SymbolicReference.set_reference and open a new issue about the broader problem of not having and using context managers in most places where they should be used.)

Yes, I agree - this issue can stay and the way I see it will be closed very soon anyway. If you choose to go ahead and fix __del__ I think it's fair to make it a PR right away, foregoing the issue. To me PRs are like issues, with code attached.

@Byron Byron added this to the v3.1.37 - Bugfixes milestone Sep 21, 2023
@EliahKagan
Copy link
Contributor Author

If you choose to go ahead and fix __del__ I think it's fair to make it a PR right away

If done currently, it would be multiple PRs across multiple repositories, right? gitdb has classes with __del__, which I think would need to gain __enter__ and __exit__ methods that would be called in the __enter__ and __exit__ methods in some classes in GitPython; and smmap has classes with __del__, which I think would need to gain __enter__ and __exit__ methods to be called from those in gitdb.

I am myself unlikely to do this right away: I would at minimum want to add native Windows testing to CI before beginning on it, and there are some static analysis tools that may help find places where something can be used as a context manager but isn't, once that thing is actually a context manager.

@Byron
Copy link
Member

Byron commented Sep 22, 2023

Oh, right! I totally forgot about gitdb 😅. Yes, no matter how many PRs there will be in the end, it would at least be two, one for each involved repository.

I am myself unlikely to do this right away: I would at minimum want to add native Windows testing to CI before beginning on it, and there are some static analysis tools that may help find places where something can be used as a context manager but isn't, once that thing is actually a context manager.

That definitely makes perfect sense. I wonder if it would also help to bring gitdb and smmap into the GitPython repository. Then work that belongs together but spans multiple python packages could be done atomically. Having that would probably also help with releases later. (And like mentioned before, while the test suite isn't isolated, it's fine to keep submodules to archived repositories).

These are just loose thoughts though, you should do as you see fit.

@EliahKagan
Copy link
Contributor Author

I was reminded of this by the discussion in #1767, and I am curious: should attention to the long-idle gitpython-developers/smmap#33 be renewed at some point, with the hope of merging it (perhaps with changes to account for the multiple versions through which smmap has gone since it was last updated)? If context managers are going to be added to the types that currently do all or most of their cleanup via __del__, it seems to me that it might make sense to begin at the leaf in the call/dependency graph and work back from there, so that each class that is made a context manager can, when appropriate, use the context manager facilities of the code it is already using.

This also seems like something that should be considered before an attempt is made to move the gitdb and smmap repositories into the GitPython repository, since I don't believe pull requests can be effectively transferred. (I just mean I think GitHub PRs cannot be made to reflect that. Obviously commits from any branch of any repository can be merged into, or rebased onto, any branch of any repository.)

Of course I understand that, even if and when such a thing were to be done, all the __del__ stuff would have to remain in place as a fallback to support the large amount of existing code using GitPython.

@Byron
Copy link
Member

Byron commented Dec 13, 2023

I was reminded of this by the discussion in #1767, and I am curious: should attention to the long-idle gitpython-developers/smmap#33 be renewed at some point, with the hope of merging it (perhaps with changes to account for the multiple versions through which smmap has gone since it was last updated)? If context managers are going to be added to the types that currently do all or most of their cleanup via __del__, it seems to me that it might make sense to begin at the leaf in the call/dependency graph and work back from there, so that each class that is made a context manager can, when appropriate, use the context manager facilities of the code it is already using.

That is a lovely idea, and a method that would be required to make the use of context managers work at all. Thanks for pointing me at the indeed long-stale smmap PR, I totally forgot about it, even though it looks like it could help. In practice though, gitdb isn't use for object access at all. This kind of reminds me that ideally, we even consider getting rid of it along with a breaking change, after all, that solves the 'have a monorepo' problem along with many others towards a GitPython that is actually 'sound'.

Of course I understand that, even if and when such a thing were to be done, all the __del__ stuff would have to remain in place as a fallback to support the large amount of existing code using GitPython.

Yes, that's a valid point and it's very true as well. Maybe, if a breaking change is considered to get rid of gitdb -> smmap, and if the use of context managers is as easy as with repo and if python can automatically warn or fail if a context manager is used without a 'context' (i.e. without with), then one could actually consider to start a 'correctness' initiative where GitPython properly cleans up after itself in a timely manner.

It seems like two birds one stone to me, especially if migrating to existing code is effectively simple. Of course, this can be simplified by allowing people to keep using the __del__ based code for a while with deprecation warnings, and to allow transitioning.

I am quite intrigued about that idea!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging a pull request may close this issue.

2 participants