Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue and test deprecation warnings #1886

Merged
merged 89 commits into from Mar 31, 2024

Conversation

EliahKagan
Copy link
Contributor

@EliahKagan EliahKagan commented Mar 30, 2024

Scope

Some deprecations in GitPython had no associated DeprecationWarning, and none had unit tests to verify that the warnings were issued under the intended circumstances. This adds...

  • warnings,
  • tests of the warnings, and
  • where applicable, tests for subtle problems that can arise from adding the warnings as new runtime behavior on attribute access (see below for details)

...for everything in GitPython that is documented as deprecated in comments and/or docstrings, except where a DeprecationWarning is intentionally not issued either because some other kind of warning is emitted or because there is a specific reason not to do so:

  • Deprecated attributes of the top-level git module that are listed in __all__ for compatibility intentionally do not warn because it would lead to developers silencing warnings if they don't want to see them when exploring in a REPL and using a wildcard import. (Deprecated attributes of git that are not listed in __all__ do warn.)
  • Setting the HIDE_WINDOWS_KNOWN_ERRORS and HIDE_WINDOWS_FREEZE_ERRORS environment variables is deprecated, as noted in git/util.py, but this is expected to be entirely a user behavior rather than a behavior of code calling GitPython, and as such it already emits a warning with the logging framework. It may be worthwhile to test for this warning, but hopefully that code can simply be removed soon (and its existing tests thinned out further) by making the test suite entirely responsible for any special test-related rmtree behavior (#790).
  • Passing $var, ${var}, and or Windows %var% notation in paths to expand environment variables is deprecated since #662. The current behavior is to try to issue UserWarning (which is seen in more situations than DeprecationWarning) except when expand_vars is available and False has been passed for it (suppressing the expansion). Besides that these are not DeprecationWarnings, I consider this to be its own issue, as there are design decisions to be made about multiple aspects of it, which should probably be made together.

This also intentionally causes three previously accepted usages to be treated as static type errors:

  • The way the deprecation warning is added for git.types.Lit_commit_ish is selected to also make it always a type error to appear in any annotation. Due to #1858 and #1859, it is implausible that any annotation with Lit_commit_ish has been, or is currently, correct. (It is nonetheless not removed, since doing so could cause new exceptions at runtime, including in working code.) Note that this does not affect anything else in git.types; other types there, including Commit_ish, can be used correctly, are not deprecated, and continue to work, including in annotations.
  • Although mypy does not consider nonpublic attribute access a type error or attempt to detect it, pyright does, and the util attribute of the top-level git module is now identified as nonpublic. The git.util module (from git.util import X) is of course public, but due to the effect of a past wildcard import and the continued need to avoid breaking code outside GitPython that my have ended up depending on it, the util attribute of git actually aliases the git.index.util module. Because util has never been listed in git.__all__ (including when it had been dynamically generated), it is effectively nonpublic, but pyright treated it as public based on being temporarily set as the git.util module before being set to git.index.util. It is retained in dir() and its deprecation warning notes how it is a special case, but it is now seen as nonpublic.
  • Some other aliases to modules that are not direct Python submodules of the top-level git module had in the past been created by wildcard imports. These are still accessible, for now, but since there has never been a reason for anyone to consider those aliases part of GitPython's public interface, they are now regarded as absent (and an error to access) by static type checkers. They are also now omitted from dir(), though this is to avoid confusion with actual direct Python submodules of git much more than to discourage their use.

Other typing-related efforts here are of the opposite kind: making sure things don't break or change significantly.

Two other notable aspects of the scope:

  • The changes made to provide the runtime behavior of issuing DeprecationWarning when deprecated module-level attributes are accessed is, by its nature, immediately ready to also cover the case of a dynamically computed git.__version__ attribute, as discussed in #1716 (comment). See also this exploration of approaches to that. But to avoid bloating the scope, that change is not included in this PR; instead, a comment notes where that logic would go.

  • Building further on #1782 and #1787, I expanded the Git.USE_SHELL docstring to better explain the deprecation and to be more helpful in presenting the issues and what to do instead of setting it to True. However, I suspect it may benefit from some further expansion soon to clarify the risks (though maybe I can figure out a way to remove or abridge some parts so it doesn't keep growing). I regard the USE_SHELL deprecation to be core to this whole PR, because I think of everything deprecated in GitPython that it is where the greatest risk of usage that is unaware of the disadvantages may be.

    It is also a class attribute, which is naturally capable of being read and written on the Git class as well as read (but not written) through Git instances. Making class attribute accesses issue warnings without breaking anything else is nontrivial, and while the approach I have taken avoids most possible pitfalls, some potentially remain. But I believe the value of surfacing this deprecation and its rationale is sufficient to justify it, both in and of itself, and when considering the effect that issuing other deprecation warnings but not this one would have in creating the false appearance that this deprecation is less important (when really it is more). Further details on this are below, and also in the tests.

Avoiding subtle problems

In many places, issuing a warning if it was not already issued, and testing it, were straightforward, because the warning should occur exactly when code in the body of some function runs, so the code to issue the warning can simply be added there, usually at the top of the function. The case of git.util.Iterable is less basic, involving a metaclass for the deprecated class, but that was already implemented, so all it needed were unit tests where derived classes were locally defined while capturing and verifying the expected DeprecationWarning. Furthermore, some deprecated attribute accesses were basic cases, because they were already properties, so where warnings were absent it was sufficient to warn in the already existing property accessor.

In contrast, on attribute accesses that are not already properties or otherwise dynamically customized, adding new runtime behavior, even just to issue a warning, should be done carefully--if at all, but see below on rationale--because there are three kinds of things it can break, that can easily go undetected unless tested for explicitly:

  • Access to the attribute in less common ways that ought to work for the sake of correctness or practicality. Access that should not work, such as writing a read-only attribute, should fail with the same exception as before and the same or a similar message, and with no state change.

  • Runtime metadata. For example, some ways of customizing attribute access cause the attribute not to appear in dir(), which is often unintentional, and some ways of attempting to fix that cause other conceptually unrelated attributes not to appear in dir().

    Another example is generated documentation, which is sometimes okay to change but should remain useful: Sphinx must find and use docstrings, and generated help from the help builtin (or pydoc) should at minimum not stop indicating that an attribute is deprecated as a consequence of changes made to issue a warning. When possible, attributes with an important default value that is shown by help() should continue to show it.

  • Static typing. Static type checkers such as mypy and pyright treat __getattr__ and __getattribute__ methods to mean that arbitrary attributes are permitted. This is often good, since for example it makes it so code accessing dynamic callable attributes of Git instances is not analyzed as having type errors. But when such a method is added where fully dynamic attribute lookup is not conceptually correct, if done in view of the type checker (i.e., not hidden behind if not TYPE_CHECKING), it causes attempts to use misspelled or otherwise bogus attributes to be analyzed as correct and returning the Any type, on which subsequent operations also are treated as Any and not warned about, and so on.

    The other subtlety about static typing is the distinction between analyzing GitPython and analyzing another codebase that uses GitPython. For example, #1656 was not observed by GitPython's own static analysis because the project only uses mypy internally but both mypy and pyright are popular static type checkers; and #1349, which happens with mypy, nonetheless was not internally observed because of differences between running mypy on GitPython and running it on another project that uses GitPython (which partly have to do with configuration).

The first and second of those points are covered in detail by substantial regression tests that engage with a number of their ramifications. The tests have docstrings and I believe they present the considerations, as well as why those considerations led to the specific implementation approaches used, clearly. But there are three exceptions to this:

  • The effect on Sphinx-generated documentation and documentation displayed by help() I checked manually instead.
  • Likewise, I manually checked exception messages, but this was based on a judgment that these were low risk in the design approach I took, where either the messages where written explicitly (in module-level __getattr__) or were delegated to Python through super() in overridden special methods. Because the implementation approach might change in the future, which is something the new tests usually try to accommodate, I do not consider the omission of exception-message assertions to be ideal. But the messages are not guaranteed to be the same in every version of Python and do change in some Python releases.
  • The changes required to issue warnings on all accesses to Git.USE_SHELL are significant, and probably only justified because of the special value of issuing those particular warnings. (Their rationale is examined below.)

In view of the importance of maintaining robust and accurate static type checking at the boundary of GitPython and code that uses it, I took the somewhat unusual approach of developing many of the tests in a separate project first while experimenting with and implementing some changes on my feature branch of GitPython. Then I used git-filter-repo and a couple of rebases to include those commits in my GitPython feature branch, broken up into a few chunks, and interleaved with the changes they test, in what seemed like the most honest order.

Although CI results seem to be interesting on each commit and could be viewed in my fork in case interest arises, for those parts I'm pushing only the tips of each chunk, because there are many commits. Doing it that way also seems like it should make it easier to see where the chunks are.

The new tests should be type-checked within the GitPython project in the same way the code of the git module is type-checked (whatever that is, i.e., I am not advocating that continue-on-error be removed from the mypy step of pythonpackage.yml at this time). To facilitate this, I have put them in their own directory within test/, fully type-annotated them, configured mypy to be runnable with no arguments and scan both git/ and test/deprecation/, and and modified the pythonpackage.yml CI workflow, tox.ini, and README.md with that simpler yet now slightly broader command.

This has the potential eventual disadvantage that splitting up tests of the same units of code into multiple test modules could be unwieldy, if many more new test subpackages alongside test.performance and test.deprecation end up popping up for separate unrelated reasons in the future. However, it has the advantage that all deprecations except those for which no DeprecationWarning is issued for some special reason (see above) can now be efficiently discerned by examining those tests. For this reason I think this is justified at the moment and potentially even permanently.

Running mypy in GitPython shows the same five errors as it did as of the reduction in #1859, none of which are in areas sufficiently related to anything touched here that they would stand to obscure problems introduced here.

Rationale and design considerations of USE_SHELL warnings

I believe that what I have done for Git.USE_SHELL requires a specific rationale. The reason is that, to issue the deprecation warnings, I have caused the Git class to have a custom metaclass.

On the one hand, this should probably not break anything. As noted in the test_git_cmd.py module docstring, this could potentially break existing code by causing a metaclass conflict, if code that uses GitPython is not only inheriting from the Git class, and not only using it in multiple inheritance, but using it in multiple inheritance with at least one sibling class that has its own unrelated custom metaclass. The narrow considerations are:

  • This is arguably plausible, since abc.ABC uses the custom metaclass abc.ABCMeta and therefore all classes derived from abstract base classes have it, including concrete leaves in such a hierarchy. Also, explicitly naming a protocol as a base class also entails having a custom metaclass. (Satisfying a protocol in the usual way by having all the appropriate members of course does not entail gaining a custom metaclass.)
  • However, this seems like a fairly low risk, especially considering the propensity for interference between the effectively unlimited number of dynamic "methods" of Git instances, and the functionality of other classes in multiple inheritance. I would question what successful applications there are of multiple inheritance with a custom metaclass and Git as a sibling.
  • To be clear, the issue is not strictly that it breaks multiple inheritance with other unrelated custom metaclasses, but rather that it would be a breaking change to other such code that already exists, because that code would have to be modified to define and use a new metaclass that inherits from both metaclasses.
  • It seems to me that it may be justified based on the idea you have expressed there which, if I understand it correctly, entails at least that it is not of paramount importance to support unusual and inherently brittle forms of inheritance from Git that are not known or believed to be in current use.

But the broader consideration is... how can this possibly be? How can it be necessary to use a metaclass just to add a side effect to class attribute access?

Intuitively it feels like a descriptor, in the Git class, could do it. But a descriptor can only customize:

  • All forms of access (getting, setting, and deleting) on an instance of the class it is placed in.
  • Only reading on the class itself that it is placed in.

This is to say that:

  • __get__ (not to be confused with __getattr__ or __getattribute__) is called for both instance and class getattr operations (its instance parameter may be None).
  • __set__ (not to be confused with __setattr__) is called only for instance setattr operations; getting the attribute from the class does not call __set__ but simply replaces the descriptor object.
  • __delete__ (not to be confused with __delattr__ or __del__) is analogous to __set__, being called only for instance delattr operations; deleting the attribute from the class does not call __delete__ but simply removes the attribute (dropping the reference to the descriptor object).

Therefore, neither custom descriptors, nor properties (due to property objects being descriptors) will do this for us in the Git class itself.

Likewise, __getattr__, __getattribute__, and __setattr__ won't do it, because they only ever customize attribute access on instances of the class they are defined in. (This differs from __getattr__'s use in a module, where it customizes access in that module.)

More precisely, none of these things will do it by themselves. They can warn on access to the USE_SHELL class attribute through instances (which I have done, with __getattribute__ in Git). They could even warn on accesses by reading it from the class; but not about the most dangerous operation most meriting a warning, which is setting it on the class.

While the unit tests provide protection against technical pitfalls, as well as regression protection if the implementation strategy is changed in the future without accounting for all subtleties, they fundamentally cannot answer the underlying design question of whether the importance of warning about USE_SHELL is sufficient to justify the implementation. As I note in the test_cmd_git.py docstring:

The tests in this module do not establish whether the danger of setting Git.USE_SHELL to
True is high enough, and applications of deriving from Git and using an unrelated custom
metaclass marginal enough, to justify introducing a metaclass to issue the warnings.

As mentioned above and elsewhere, I believe it is justified. However, I am willing to make changes, including if necessary removing _GitMeta altogether.

Assuming it is to be kept, a remaining question is whether it is better or worse to provide a public alias to the metaclass. It can of course be gotten with type(Git), but static type checkers consider such an expression anywhere in any type annotation to be an error. An alias does not commit Git to continue to have a custom metaclass, because it can be documented as aliasing its metaclass whether that metaclass is type or a custom metaclass. I've done that. I've made the tests for that alias, and the code of the alias and its docstring itself (which cautions against writing code that would need it as well as against using it in a type annotation when Type[Git] is meant), be the last two commits here. That is so they can easily be dropped or reverted if you think having that public alias is not an improvement.

This starts on a test.deprecation subpackage for deprecation tests.
Having these tests in a separate directory inside the test suite
may or may not be how they will ultimately be orgnaized, but it has
two advantages:

- Once all the tests are written, it should be easy to see what in
  GitPython is deprecated.

- Some deprecation warnings -- those on module or class attribute
  access -- will require the introduction of new dynamic behavior,
  and thus run the risk of breaking static type checking. So that
  should be checked for, where applicable. But currently the test
  suite has no type annotations and is not checked by mypy. Having
  deprecation-related tests under the same path will make it easier
  to enable mypy for just this part of the test suite (for now).

It is also for this latter reason that the one test so far is
written without using the GitPython test suite's existing fixtures
whose uses are harder to annotate. This may be changed if
warranted, though some of the more complex deprecation-related
tests may benefit from being written as pure pytest tests.

Although a number of deprecated features in GitPython do already
issue warnings, Diff.renamed is one of the features that does not
yet do so. So the newly introduced test will fail until that is
fixed in the next commit.
The newly introduced test would pass, but show:

    Exception ignored in: <function Popen.__del__ at 0x000002483DE14900>
    Traceback (most recent call last):
      File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.12_3.12.752.0_x64__qbz5n2kfra8p0\Lib\subprocess.py", line 1130, in __del__
        self._internal_poll(_deadstate=_maxsize)
      File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.12_3.12.752.0_x64__qbz5n2kfra8p0\Lib\subprocess.py", line 1575, in _internal_poll
        if _WaitForSingleObject(self._handle, 0) == _WAIT_OBJECT_0:
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    OSError: [WinError 6] The handle is invalid

This was due to how, at least for now, the deprecation warning
tests are not using GitPython's normal fixtures, which help clean
things up on Windows.

This adds a small amount of ad-hoc cleanup. It also moves the
acquisition of the Diff object into a pytest fixture, which can be
reused for the immediately forthcoming test that the preferred
property does not warn.
This will be even more helpful when testing a deprecated member of
the Commit class (not yet done).
+ Remove the TODO suggesting the diff not be computed from a real
  commit, since we're going to have the logic for that in use in
  forthcoming commit-specific deprecation warning tests anyway.
And that the non-deprecated recommended alternative trailers_list
and trailers_dict properties do not warn.

The test that trailers warns does not yet pass yet, because it has
not yet been made to warn.
Python warnings are exceptions, to facilitate converting them to
errors by raising them. Thus the more specific :exc: role applies
and is a better fit than the more general :class: role.
And that subclassing the strongly preferred git.util.Iterable class
does not warn.
The code this replaces in the `commit` pytest fixture seems not to
be needed anymore, and could arguably be removed now with no
further related changes. But whether it is needed depends on
subtle and sometimes nondeterministic factors, and may also vary
across Python versions.

To be safe, this replaces it with a call to the Repo instance's
close method, which is in effect more robust than what was being
done before, as it calls clear_cache on the the Git object that the
Repo object uses, does a gitdb/smmap collection, and on Windows
calls gc.collect both before and after that collection.

This may happen immediately anyway if the Repo object is not
reachable from any cycle, since the reference count should go to
zero after each of the deprecation warning tests (the fixture's
lifetime is that of the test case), and Repo.close is called in
Repo.__del__. But this makes it happen immediately even if there is
a cycle.
The other deprecation test modules this refers to don't exist yet
but will be introduced soon.
- Add type hints to test/deprecation/basic.py. As its module
  docstring (already) notes, that test module does not contain
  code where it is specifically important that it be type checked
  to verify important properties of the code under test. However,
  other test.deprecation.* modules will, and it is much more
  convenient to be able to scan the whole directory than the
  directory except for one file. (Less importantly, for the
  deprecation tests to be easily readable as a coherent whole, it
  makes sense that all, not just most, would have annotations.)

- Configure mypy in pyproject.toml so it can be run without
  arguments (mypy errors when run that way unless configured),
  where the effect is to scan the git/ directory, as was commonly
  done before, as well as the test/deprecation/ directory.

- Change the CI test workflow, as well as tox.ini, to run mypy with
  no arguments instead of passing `-p git`, so that it will scan
  both the git package as it had before and the test.deprecation
  package (due to the new pyproject.toml configuration).

- Change the readme to recommend running it that way, too.
Rather than only testing attribute access.
Because it's going to be necessary to express things in terms of it
in parametrization markings, in order for mypy to show the expected
errors for names that are available dynamically but deliberately
static type errors.
This tests that mypy considers them not to be present.

That mypy is configured with `warn_unused_ignores = true` is key,
since that is what verifies that the type errors really do occur,
based on the suppressions written for them.
Which was causing a type error.
These conditional branches are kept so alternative implementations
can be examined, including if they need to be investigated to
satisfy some future requirement. But to be unittest.mock.patch
patchable, the approaches that would have a _USE_SHELL backing
attribute would be difficult to implement in a straightforward way,
which seems not to be needed or justified at this time.

Since that is not anticipated (except as an intermediate step in
development), these suppressions make sense, and they will also be
reported by mypy if the implementation changes to benefit from them
(so long as it is configured with warn_unused_ignores set to true).
With a special message when it is assigned a True value, which is
the dangerous use that underlies its deprecation.

The warnings are all DeprecationWarning.
This is with the intention of making it so that Git.USE_SHELL is
unittest.mock.patch patchable. However, while this moves in the
right direction, it can't be done this way, as the attribute is
found to be absent on the class, so when unittest.mock.patch
unpatches, it tries to delete the attribute it has set, which
fails due to the metaclass's USE_SHELL instance property having no
deleter.
This reimplements Git.USE_SHELL with no properties or descriptors.
The metaclass is still needed, but instad of defining properties it
defines __getattribute__ and __setattr__, which check for USE_SHELL
and warn, then invoke the default attribute access via super().

Likewise, in the Git class itself, a __getatttribute__ override is
introduced (not to be confused with __getattr__, which was already
present and handles attribute access when an attribute is otherwise
absent, unlike __getattribute__ which is always used). This checks
for reading USE_SHELL on an instance and warns, then invokes the
default attribute access via super().

Advantages:

- Git.USE_SHELL is again unittest.mock.patch patchable.

- AttributeError messages are automatically as before.

- It effectively is a simple attribute, yet with warning, so other
  unanticipated ways of accessing it may be less likely to break.

- The code is simpler, cleaner, and clearer. There is some
  overhead, but it is small, especially compared to a subprocess
  invocation as is done for performing most git operations.

However, this does introduce disadvantages that must be addressed:

- Although attribute access on Git instances was already highly
  dynamic, as "methods" are synthesized for git subcommands, this
  was and is not the case for the Git class itself, whose
  attributes remain exactly those that can be inferred without
  considering the existence of __getattribute__ and __setattr__ on
  the metaclass. So static type checkers need to be prevented from
  accounting for those metaclass methods in a way that causes them
  to infer that arbitrary class attribute access is allowed.

- The occurrence of Git.USE_SHELL in the Git.execute method (where
  the USE_SHELL attribute is actually examined) should be changed
  so it does not itself issue DeprecationWarning (it is not enough
  that by default a DeprecationWarning from there isn't displayed).
The existence of __getattr__ or __getattribute__ on a class causes
static type checkers like mypy to stop inferring that reads of
unrecognized instance attributes are static type errors. When the
class is a metaclass, this causes static type checkers to stop
inferring that reads of unrecognized class attributes, on classes
that use (i.e., that have as their type) the metaclass, are static
type errors.

The Git class itself defines __getattr__ and __getattribute__, but
Git objects' instance attributes really are dynamically synthesized
(in __getattr__). However, class attributes of Git are not dynamic,
even though _GitMeta defines __getattribute__. Therefore, something
special is needed so mypy infers nothing about Git class attributes
from the existence of _GitMeta.__getattribute__.

This takes the same approach as taken to the analogous problem with
__getattr__ at module level, defining __getattribute__ with a
different name first and then assigning that to __getattribute__
under an `if not TYPE_CHECKING:`. (Allowing static type checkers to
see the original definition allows them to find some kinds of type
errors in the body of the method, which is why the method isn't
just defined in the `if not TYPE_CHECKING`.)

Although it may not currently be necessary to hide __setattr__ too,
the same is done with it in _GitMeta as well. The reason is that
the intent may otherwise be subtly amgiguous to human readers and
maybe future tools: when __setattr__ exists, the effect of setting
a class attribute that did not previously exist is in principle
unclear, and might not make the attribute readble. (I am unsure if
this is the right choice; either approach seems reasonable.)
This changes how Git.execute itself accesses the USE_SHELL
attribute, so that its own read of it does not issue a warning.
Even though current type checkers don't seem to need it.

As noted in dffa930, it appears neither mypy nor pyright currently
needs `del util` to be guarded by `if not TYPE_CHECKING:` -- they
currently treat util as bound even without it, even with `del util`
present.

It is not obvious that this is the best type checker behavior or
that type checkers will continue to behave this way. (In addition,
it may help humans for all statements whose effects are intended
not to be considered for purposes of static typing to be guarded by
`if not TYPE_CHECKING:`.) So this guards the deletion of util the
same as the binding of __getattr__.

This also moves, clarifies, and expands the commented explanations
of what is going on.
In the USE_SHELL docstring:

- Restore the older wording "when executing git commands" rather
  than "to execute git commands". I've realized that longer phrase,
  which dates back to the introduction of USE_SHELL in 1c2dd54, is
  clearer, because readers less familiar with GitPython's overall
  design and operation will still not be misled into thinking
  USE_SHELL ever affects whether GitPython uses an external git
  command, versus some other mechanism, to do something.

- Give some more information about why USE_SHELL = True is unsafe
  (though further revision or clarification may be called for).

- Note some non-obvious aspects of USE_SHELL, that some of the way
  it interacts with other aspects of GitPython is not part of what
  is or has been documented about it, and in practice changes over
  time. The first example relates to gitpython-developers#1792; the second may help
  users understand why code that enables USE_SHELL on Windows, in
  addition to being unsafe there, often breaks immediately on other
  platforms; the third is included so the warnings in the expanded
  docstring are not interpreted as a new commitment that any shell
  syntax that may have a *desired* effect in some application will
  continue to have the same effect in the future.

- Cover a second application that might lead, or have led, to
  setting USE_SHELL to True, and explain what to do instead.

In test_successful_default_refresh_invalidates_cached_version_info:

- Rewrite the commented explanation of a special variant of that
  second application, where the usual easy alternatives are not
  used because part of the goal of the test is to check a "default"
  scenario that does not include either of them. This better
  explains why that choice is made in the test, and also hopefully
  will prevent anyone from thinking that test is a model for
  another situation (which, as noted, is unlikely to be the case
  even in unit tests).
And why this increases the importance of the warn_unused_ignored
mypy configuration option.
To distinguish it from the more general test.lib as well as from
the forthcoming test.deprecation.lib.
This creates a module test.deprecation.lib in the test suite for
a couple general helpers used in deprecation tests, one of which is
now used in two of the test modules and the other of which happens
only to be used in one but is concepually related to the first.

The assert_no_deprecation_warning context manager function fixes
two different flawed approaches to that, which were in use earlier:

- In test_basic, only DeprecationWarning and not the less
  significant PendingDeprecationWarning had been covere, which it
  should, at least for symmetry, since pytest.deprecated_call()
  treats it like DeprecationWarning.

  There was also a comment expressing a plan to make it filter only
  for warnings originating from GitPython, which was a flawed idea,
  as detailed below.

- In test_cmd_git, the flawed idea of attempting to filter only for
  warnings originating from GitPython was implemented. The problem
  with this is that it is heavily affected by stacklevel and its
  interaction with the pattern of calls through which the warning
  arises: warnings could actually emanate from code in GitPython's
  git module, but be registered as having come from test code, a
  callback in gitdb, smmap, or the standard library, or even the
  pytest test runner. Some of these are more likely than others,
  but all are possible especially considering the possibility of a
  bug in the value passed to warning.warn as stacklevel. (It may be
  valuable for such bugs to cause tests to fail, but they should not
  cause otherwise failing tests to wrongly pass.)

  It is probably feasible to implement such filtering successfully,
  but I don't think it's worthwhile for the small reduction in
  likelihood of failing later on an unrealted DeprecationWarning
  from somewhere else, especially considering that GitPython's main
  dependencies are so minimal.
@EliahKagan EliahKagan marked this pull request as ready for review March 30, 2024 08:15
@EliahKagan
Copy link
Contributor Author

EliahKagan commented Mar 31, 2024

I've added commit f92f4c3, which adds a paragraph to the docstring and reworked the deprecation warnings to make clearer that USE_SHELL is a security risk, and specifically that it greatly risks introducing command injection vulnerabilities in applications that use it. There may be more to say here or elsewhere, and when something is published about this, then maybe a link should be added. Alternatively, or in addition, a link to a CWE could be included; I think CWE-78 may be the most relevant one. Some further revisions could happen later.

(I've rebased the last two commits, relating to the metaclass alias, to appear after f92f4c3, which as described in the PR description is because I think the chance is higher than usual that you may wish for them to be dropped from the PR, or reverted. Note that this doesn't affect whether a metaclass is used, only whether a public type-checkable alias equivalent at runtime to type(Git) is to be added.)

I believe that either this PR or the security-related cautions related to USE_SHELL deprecation should be highlighted in the GitHub release notes for the next release of GitPython. A possible reason to do a release soon is to allow people to benefit from the new deprecation warnings, including and especially on USE_SHELL.

However, if not necessary, then it might make sense to wait a bit longer, because there are some other changes that might make sense to have come in too, such as removing the setup.py version stamping logic and making __version__ use importlib.metadata, and converting the project definition to be declarative as discussed in comments in #1716, which in turn could allow a couple more useful extras to be defined (in a way that is declaratively accessible without proliferating more separate requirements files), such as lint, test-minimal, and dev.

I've also been hoping to revise and update the documentation outside of the API reference, at least so that it provides correct installation instructions and up-to-date information about things like what object database GitPython currently defaults to using. Having current Sphinx documentation even outside the API reference would also have the benefit that sections could be added to it with information on important topics; this could include the risks of setting USE_SHELL to true, which would have the benefit that a search in doc/ would no longer only turn up the old entry in changes.rst advising to use it on Windows. (Though that could also, or in addition, be ameliorated by adding a brief note about that to the new changes.rst entry for the first release that carries the new deprecation warnings.)

My guess is that, at this point, all of that could proceed quickly, with the groundwork laid elsewhere and here for it, but I admit it is possible I am being overly optimistic.

Copy link
Member

@Byron Byron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(the first time I wrote this response, for the first time ever, the server dropped the connection on submission and with it the response, so this one might be briefer)

Thanks for this amazing work! I am particularly impressed by the fact that everything (or most of it) is unit-tested. This certainly helps with development, as there is no need to open up a REPL each time, while also assuring that the kind of warning (deprectation or user) is right.

Regarding the metaclass, I'd hope that most codebases will not inherit from Git itself, as the common usage is to use the instance on the Repo via repo.git. So I also think it should be fine.

I believe that either this PR or the security-related cautions related to USE_SHELL deprecation should be highlighted in the GitHub release notes for the next release of GitPython. A possible reason to do a release soon is to allow people to benefit from the new deprecation warnings, including and especially on USE_SHELL.

However, if not necessary, then it might make sense to wait a bit longer, because there are some other changes that might make sense to have come in too, such as removing the setup.py version stamping logic and making __version__ use importlib.metadata, and converting the project definition to be declarative as discussed in comments in #1716, which in turn could allow a couple more useful extras to be defined (in a way that is declaratively accessible without proliferating more separate requirements files), such as lint, test-minimal, and dev.

I think it's well worth having a release now, even if it means that another one will happen when the other important changes land.

@Byron Byron merged commit 4e626bd into gitpython-developers:main Mar 31, 2024
26 checks passed
@Byron Byron added the 📣highlight-in-changelog📣 Specifically highlight this PR in the changelog after it was created label Mar 31, 2024
renovate bot added a commit to allenporter/flux-local that referenced this pull request Mar 31, 2024
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Change | Age | Adoption | Passing | Confidence |
|---|---|---|---|---|---|
| [GitPython](https://togithub.com/gitpython-developers/GitPython) |
`==3.1.42` -> `==3.1.43` |
[![age](https://developer.mend.io/api/mc/badges/age/pypi/GitPython/3.1.43?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![adoption](https://developer.mend.io/api/mc/badges/adoption/pypi/GitPython/3.1.43?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![passing](https://developer.mend.io/api/mc/badges/compatibility/pypi/GitPython/3.1.42/3.1.43?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![confidence](https://developer.mend.io/api/mc/badges/confidence/pypi/GitPython/3.1.42/3.1.43?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|

---

### Release Notes

<details>
<summary>gitpython-developers/GitPython (GitPython)</summary>

###
[`v3.1.43`](https://togithub.com/gitpython-developers/GitPython/releases/tag/3.1.43)

[Compare
Source](https://togithub.com/gitpython-developers/GitPython/compare/3.1.42...3.1.43)

#### Particularly Important Changes

These are likely to affect you, please do take a careful look.

- Issue and test deprecation warnings by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1886
- Fix version_info cache invalidation, typing, parsing, and
serialization by [@&#8203;EliahKagan](https://togithub.com/EliahKagan)
in
[gitpython-developers/GitPython#1838
- Document manual refresh path treatment by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1839
- Improve static typing and docstrings related to git object types by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1859

#### Other Changes

- Test in Docker with Alpine Linux on CI by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1826
- Build online docs (RTD) with -W and dependencies by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1843
- Suggest full-path refresh() in failure message by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1844
- `repo.blame` and `repo.blame_incremental` now accept `None` as the
`rev` parameter. by [@&#8203;Gaubbe](https://togithub.com/Gaubbe) in
[gitpython-developers/GitPython#1846
- Make sure diff always uses the default diff driver when
`create_patch=True` by
[@&#8203;can-taslicukur](https://togithub.com/can-taslicukur) in
[gitpython-developers/GitPython#1832
- Revise docstrings, comments, and a few messages by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1850
- Expand what is included in the API Reference by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1855
- Restore building of documentation downloads by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1856
- Revise type annotations slightly by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1860
- Updating regex pattern to handle unicode whitespaces. by
[@&#8203;jcole-crowdstrike](https://togithub.com/jcole-crowdstrike) in
[gitpython-developers/GitPython#1853
- Use upgraded pip in test fixture virtual environment by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1864
- lint: replace `flake8` with `ruff` check by
[@&#8203;Borda](https://togithub.com/Borda) in
[gitpython-developers/GitPython#1862
- lint: switch Black with `ruff-format` by
[@&#8203;Borda](https://togithub.com/Borda) in
[gitpython-developers/GitPython#1865
- Update readme and tox.ini for recent tooling changes by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1868
- Split tox lint env into three envs, all safe by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1870
- Slightly broaden Ruff, and update and clarify tool configuration by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1871
- Add a "doc" extra for documentation build dependencies by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1872
- Describe `Submodule.__init__` parent_commit parameter by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1877
- Include TagObject in git.types.Tree_ish by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1878
- Improve Sphinx role usage, including linking Git manpages by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1879
- Replace all wildcard imports with explicit imports by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1880
- Clarify how tag objects are usually tree-ish and commit-ish by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1881

#### New Contributors

- [@&#8203;Gaubbe](https://togithub.com/Gaubbe) made their first
contribution in
[gitpython-developers/GitPython#1846
- [@&#8203;can-taslicukur](https://togithub.com/can-taslicukur) made
their first contribution in
[gitpython-developers/GitPython#1832
- [@&#8203;jcole-crowdstrike](https://togithub.com/jcole-crowdstrike)
made their first contribution in
[gitpython-developers/GitPython#1853
- [@&#8203;Borda](https://togithub.com/Borda) made their first
contribution in
[gitpython-developers/GitPython#1862

**Full Changelog**:
gitpython-developers/GitPython@3.1.42...3.1.43

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Enabled.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/allenporter/flux-local).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy4yNjkuMiIsInVwZGF0ZWRJblZlciI6IjM3LjI2OS4yIiwidGFyZ2V0QnJhbmNoIjoibWFpbiJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
@EliahKagan EliahKagan deleted the deprecation-warnings branch March 31, 2024 14:38
@EliahKagan
Copy link
Contributor Author

EliahKagan commented Mar 31, 2024

(the first time I wrote this response, for the first time ever, the server dropped the connection on submission and with it the response, so this one might be briefer)

That sounds very frustrating! Although I usually try to copy the text of nontrivial GitHub comments to the clipboard and send them, this sort of thing sometimes happens to me even in the rare case I forget to do so, and I've lost useful text a couple of times.

Regarding the metaclass, I'd hope that most codebases will not inherit from Git itself, as the common usage is to use the instance on the Repo via repo.git. So I also think it should be fine.

I think there might be some uses for inheriting from Git, or that such uses may arise in the future, but I think this is unlikely to be common and unlikely to lead to metaclass conflicts. Hopefully people will open issues or discussion threads in the unlikely event that they have code they consider to have been working properly before but was broken by this.

I think it's well worth having a release now, even if it means that another one will happen when the other important changes land.

That makes sense and may actually reduce stress in that I will not feel rushed to do those other changes if something comes up to delay them (though I still anticipate they can happen soon).

By the way, the one failing CI test job for the merge commit here is due to #1676. It should go away if the job is rerun. (I'm not saying it needs to be rerun, only clarifying the cause.)

@EliahKagan
Copy link
Contributor Author

EliahKagan commented Mar 31, 2024

By the way, I'm pleased to see that, as expected, the documentation at https://gitpython.readthedocs.io/en/stable/ is fully working again, including theming and the API reference.

I've noticed that the changelog from changes.rst mentions:

A major visible change will be the added deprecation- or user-warnings, and greatly improved typing.

This pull request only adds deprecation warnings (specifically DeprecationWarning; Python also has PendingDeprecationWarning but I didn't use that for anything).

There were some deprecation-related UserWarnings from before, related to the expansion of environment variables. There is also a deprecation-related warning issued through the logging framework rather than warnings, for a deprecation that is less related to code and more to how an application is run. But no changes are made here that relate to either of those kinds of warnings. As far as I know nothing about them has changed in 3.1.43.

I can open a pull request to propose a change of the wording:

-A major visible change will be the added deprecation- or user-warnings, and greatly improved typing.
+A major visible change will be the added deprecation warnings, and greatly improved typing.

However, I don't want to recommend a change to changes.rst for a previous release in a situation where the inaccuracy is not major, if that's a practice you'd prefer to avoid. Please let me know if I should go ahead and open that PR. This differs from the situation in #1795 where the accuracy improvement was decisive and clearly justified.

The other reason I'm querying first is that maybe the opposite approach should be followed: when more detailed information is published about the risks of setting USE_SHELL to true, perhaps you would want the changes.rst entry to include a link to that, in which case it may makes to wait and have both modifications come in at the same time.

lettuce-bot bot added a commit to lettuce-financial/github-bot-signed-commit that referenced this pull request Apr 1, 2024
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Change | Age | Adoption | Passing | Confidence |
|---|---|---|---|---|---|
| [GitPython](https://togithub.com/gitpython-developers/GitPython) |
`==3.1.42` -> `==3.1.43` |
[![age](https://developer.mend.io/api/mc/badges/age/pypi/GitPython/3.1.43?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![adoption](https://developer.mend.io/api/mc/badges/adoption/pypi/GitPython/3.1.43?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![passing](https://developer.mend.io/api/mc/badges/compatibility/pypi/GitPython/3.1.42/3.1.43?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![confidence](https://developer.mend.io/api/mc/badges/confidence/pypi/GitPython/3.1.42/3.1.43?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|

---

### Release Notes

<details>
<summary>gitpython-developers/GitPython (GitPython)</summary>

###
[`v3.1.43`](https://togithub.com/gitpython-developers/GitPython/releases/tag/3.1.43)

[Compare
Source](https://togithub.com/gitpython-developers/GitPython/compare/3.1.42...3.1.43)

#### Particularly Important Changes

These are likely to affect you, please do take a careful look.

- Issue and test deprecation warnings by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1886
- Fix version_info cache invalidation, typing, parsing, and
serialization by [@&#8203;EliahKagan](https://togithub.com/EliahKagan)
in
[gitpython-developers/GitPython#1838
- Document manual refresh path treatment by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1839
- Improve static typing and docstrings related to git object types by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1859

#### Other Changes

- Test in Docker with Alpine Linux on CI by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1826
- Build online docs (RTD) with -W and dependencies by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1843
- Suggest full-path refresh() in failure message by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1844
- `repo.blame` and `repo.blame_incremental` now accept `None` as the
`rev` parameter. by [@&#8203;Gaubbe](https://togithub.com/Gaubbe) in
[gitpython-developers/GitPython#1846
- Make sure diff always uses the default diff driver when
`create_patch=True` by
[@&#8203;can-taslicukur](https://togithub.com/can-taslicukur) in
[gitpython-developers/GitPython#1832
- Revise docstrings, comments, and a few messages by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1850
- Expand what is included in the API Reference by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1855
- Restore building of documentation downloads by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1856
- Revise type annotations slightly by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1860
- Updating regex pattern to handle unicode whitespaces. by
[@&#8203;jcole-crowdstrike](https://togithub.com/jcole-crowdstrike) in
[gitpython-developers/GitPython#1853
- Use upgraded pip in test fixture virtual environment by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1864
- lint: replace `flake8` with `ruff` check by
[@&#8203;Borda](https://togithub.com/Borda) in
[gitpython-developers/GitPython#1862
- lint: switch Black with `ruff-format` by
[@&#8203;Borda](https://togithub.com/Borda) in
[gitpython-developers/GitPython#1865
- Update readme and tox.ini for recent tooling changes by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1868
- Split tox lint env into three envs, all safe by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1870
- Slightly broaden Ruff, and update and clarify tool configuration by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1871
- Add a "doc" extra for documentation build dependencies by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1872
- Describe `Submodule.__init__` parent_commit parameter by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1877
- Include TagObject in git.types.Tree_ish by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1878
- Improve Sphinx role usage, including linking Git manpages by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1879
- Replace all wildcard imports with explicit imports by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1880
- Clarify how tag objects are usually tree-ish and commit-ish by
[@&#8203;EliahKagan](https://togithub.com/EliahKagan) in
[gitpython-developers/GitPython#1881

#### New Contributors

- [@&#8203;Gaubbe](https://togithub.com/Gaubbe) made their first
contribution in
[gitpython-developers/GitPython#1846
- [@&#8203;can-taslicukur](https://togithub.com/can-taslicukur) made
their first contribution in
[gitpython-developers/GitPython#1832
- [@&#8203;jcole-crowdstrike](https://togithub.com/jcole-crowdstrike)
made their first contribution in
[gitpython-developers/GitPython#1853
- [@&#8203;Borda](https://togithub.com/Borda) made their first
contribution in
[gitpython-developers/GitPython#1862

**Full Changelog**:
gitpython-developers/GitPython@3.1.42...3.1.43

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about these
updates again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/lettuce-financial/github-bot-signed-commit).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy4yNjkuMiIsInVwZGF0ZWRJblZlciI6IjM3LjI2OS4yIiwidGFyZ2V0QnJhbmNoIjoibWFpbiJ9-->
@Byron
Copy link
Member

Byron commented Apr 2, 2024

Regarding changes.rst, I certainly don't mind a retroactive changes even if it's minor, for correctness.

Knowing how much I struggle to formulate non-standard messages in changes.rst, I'd definitely prefer these to be made as part of the PR that introduces notable changes. In a way, some PRs that receive the highlight-in-changelog label can consider adding their own paragraph in changes.rst as well for the upcoming release.
Doing so should solve this specific problem which I think could easily happen again otherwise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
📣highlight-in-changelog📣 Specifically highlight this PR in the changelog after it was created
Development

Successfully merging this pull request may close these issues.

None yet

2 participants