-
Notifications
You must be signed in to change notification settings - Fork 161
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Restructure project directory layout and exception hierarchy #271
Merged
peternowee
merged 3 commits into
pydot:master
from
peternowee:feature/project-structure
Jul 30, 2021
Merged
Restructure project directory layout and exception hierarchy #271
peternowee
merged 3 commits into
pydot:master
from
peternowee:feature/project-structure
Jul 30, 2021
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This commit brings the directory layout in line with current recommendations by the Python Packaging Authority and others. Before After |-- setup.py |-- setup.py | |-- src/ | | `-- pydot/ | | |-- __init__.py |-- pydot.py | |-- core.py |-- dot_parser.py | `-- dot_parser.py `-- test/ `-- test/ `-- pydot_unittest.py `-- pydot_unittest.py There are many opinions on what would be the best directory layout for a Python project, particularly on whether or not to use an additional `src/` layer. For example: - https://blog.ionelmc.ro/2014/05/25/python-packaging/#the-structure - pypa/packaging.python.org#320 - https://github.com/pypa/packaging.python.org/blob/7866cb69f1aa290a13c94da77cfb46a079cfcfa2/source/tutorials/packaging-projects.rst#a-simple-project - https://stackoverflow.com/questions/193161/what-is-the-best-project-structure-for-a-python-application Suffice to say, I can basically understand the arguments in favor of a `src/` layout and do not find the arguments against it strong enough not to use it. Moved the dunder global variables (`__version__` etc.) to `__init__.py` to ensure they remain available to the user doing `import pydot`. (They would not be included in `from pydot.core import *` because their names start with an underscore and I did not want to start keeping `__all__` just for this.) A test for `pydot.__version__` is added to the testsuite as well. A question that remains open for the moment is whether we should reduce the weight of pydot "core" by splitting off functionality, most notably the parsing functionality. Although that would reduce pydot loading time, it would also make it more difficult for users to discover and use that functionality. For details see: pydot#271 (comment) For now, things are kept as-is, namely that `pydot` always imports `dot_parser`. This is now accomplished via `__init__.py` and `core.py`. Discussed in pydot#171, pydot#230 and pydot#271. Special thanks to Kostis Anagnostopoulos (@ankostis) for his advice.
This commit removes a check from the test suite that was introduced by commit 50bd730 ([pydot#164][164]). The check raised an error in case the current working directory was the top of the source tree. The purpose was to prevent testing the pydot sources from the source tree instead of the installed pydot. The risk of that inadvertently happening is considerably lower now that the sources were moved to subdirectory `src/`. Also note that [Python _running a script_ does not by default search the working directory][1] anyway. The removal of the check makes the `--no-check` option redundant. However, we cannot remove it yet because that would cause an `unrecognized arguments` error in existing scripts. `argparse` does not support [deprecating CLI arguments][2] yet. So, instead we print a deprecation warning. The remaining `parse_args()` and `import argparse` can be removed in a later major release of pydot. [1]: https://chrisyeh96.github.io/2017/08/08/definitive-guide-python-imports.html [2]: https://bugs.python.org/issue39467 [164]: pydot#164
- Module structure following example from NetworkX and other projects. This adds an `import pydot` to `core.py`, which is cyclic/circular, but necessary to make the exceptions available. Cyclic imports are not uncommon and Python can handle this. Using the exceptions in pydot source code now requires prefixing `pydot.`, for example: raise pydot.Error("Message") - Exception hierarchy changes. See ChangeLog and class docstrings in this commit for details. Additional background notes: - Removal of the unused exception class `InvocationException`: It was introduced in 842173c of 2008 (v1.0.2), but got in disuse when commits 9b3c1a1 and bc639e7 of 2016 (v1.2.0) fell back to using `Exception` and `assert` (`AssertionError`) again. If we ever need a custom class like this again in the future, it would probably have a different signature for context data (e.g. a DOT string, input and output), different parent class (or classes, e.g. `PydotException, CalledProcessError`) and perhaps a different name (e.g. ending in `...Error`), so no need to keep the old class around for that. Further additions to the exception hierarchy are likely before the final release of pydot 2.0. Discussed in pydot#171, pydot#230 and pydot#271.
peternowee
force-pushed
the
feature/project-structure
branch
from
July 30, 2021 18:57
f830c5b
to
90936e7
Compare
peternowee
changed the title
[WIP] Restructure project directory layout and exception hierarchy
Restructure project directory layout and exception hierarchy
Jul 30, 2021
@ankostis I will merge this now already, because I want to keep things moving, but feel free to still comment on it later. Thank you for all your help and hope you are doing well. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR restructures the project directory layout and exception hierarchy to bring them in line with common practices and provide a better foundation for custom exceptions and logging. It follows up on discussions in #171 and #230.
See the individual commit messages and new ChangeLog entries for additional remarks. I won't copy them here.
Reviews/comments welcome. I won't merge it before 2021-07-26 anyway, later if necessary to resolve any comments.
@ankostis: In response to some of your comments of last year:
I now define them in
exceptions.py
and in__init__.py
dofrom pydot.exceptions import *
. That comes down to the same as what you meant, right?Hmm, I am not sure if I followed up this advice completely. Taken literally, at least I did not copy-paste everything into
__init__.py
. Only__author__
,__version__
and__license__
. However, I did renamepydot.py
tocore.py
and then addedfrom pydot.core import *
to__init__.py
, which basically comes down to importing "the wholepydot.py
" after all. Not sure if you meant to advise against that as well, but I do not really see how we can avoid that without breaking things. See example below.I have been thinking about how to apply this to pydot, but run into objections regarding backwards compatibility and user-friendliness.
The obvious candidate to be split off from core is the parser-related code, which includes a dependency on PyParsing. Most of it is already in a separate module (
dot_parser.py
). There are only two functions incore.py
that depend ondot_parser
:graph_from_dot_data()
andgraph_from_dot_file()
. If we would move those out ofcore.py
and drop the import ofdot_parser
, the loading time for pydot would be cut by 46% (from 89 ms to 48 ms, as measured bypython -X importtime -c "import pydot"
, Python 3.7.3, Celeron N3350, lowest of multiple runs). This reduction is mostly (98%) attributable to not importingpyparsing
anymore.However, I am not very eager to remove parsing from the standard functionality offered by pydot. I mean, currently
import pydot
means you get access to all its functionality immediately. If we take the parsing functionality out, users would need to start thinking about how to import which part of pydot. To use the parsing functions they would need to doimport pydot.dot_parser
,from pydot import dot_parser
orfrom pydot.dot_parser import graph_from_dot_data, graph_from_dot_file
. This is not only a breaking change for existing scripts (we could at best smooth the transition by adding a deprecation warning in pydot 2.0 and postponing the removal from core to pydot 3.0), but it is also inconvenient for new users and new scripts. An additional import would be needed, different prefixes need to be used (e.g.pydot.dot_parser.graph_from_dot_data()
) and in the interactive interpreter, using tab completion onpydot.
would not show these functions anymore for easy discovery of what pydot offers.So, I was thinking if instead we could maybe just turn it around: Keep the parsing as default functionality of pydot, but offer users that are looking to reduce load time a way to manually import only the (core) modules they need. However, I am getting the impression that this is simply impossible, because
__init__.py
always gets executed, even if the user only imports a specific module inside the package. We need__init__.py
to carry the parsing stuff if we want to offer that as default functionality of pydot, but then we do not want that to load when a user chooses to import only thecore
andexceptions
modules for example. Those two wishes are simply incompatible with each other, right? Hope you understand what I am trying to say and please correct me if I am wrong.Alternatively, if making pydot lighter is the goal, we could just "lazy import"
dot_parser
by moving the import to inside the functiongraph_from_dot_data()
. A lot can be said about this practice, see this SO question for example. I think we can get over the style objections, and as I understand the performance overhead of repeated import statements for an already imported module is minimal. However, a more important issue may be that any problems met during the import will only show up once the function is called. On the other hand, I assume parsing is one of the first steps in most user scripts, so any import problems should still show up early on. Hope to hear your thoughts. This can also be done separate of this PR.One might also consider splitting off the functions related to calling Graphviz to a separate module. However, when it comes to offering users the option not to load that module, similar questions as for splitting off the parsing will come up. Calling Graphviz is the reason most people use pydot and the core pydot class
Dot
has methodsDot.create()
andDot.write()
that depend heavily on those functions. Again, we could move the import to insideDot.create()
, but in this case that would really make import problems show up much too late, because users will likely call Graphviz only at the end of their scripts. Finally, the performance benefit is not as strong here: Not loading Graphviz output functions and not importingos
andsubprocess
reduces pydot load time by around 6%.Again, hope to hear your thoughts. In both cases, we can also decide about it later. No need to do everything in this PR already.