Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove deepcopy steps during parameterize_notebook. #673

Merged
merged 1 commit into from Aug 12, 2022

Conversation

surdouski
Copy link
Contributor

This PR removes the nb = copy.deepcopy(nb) steps inside parameterize.py and execute.py. This does not fix any tracked issue I know of. What this PR solves are some issues and annoyances surrounding the deepcopy (and pickle) standard libraries.

# Copy the nb object to avoid polluting the input
nb = copy.deepcopy(nb)

Currently, notebooks that contain cells or parameters with objects without a __dict__ require some heavy object/class patching of their magic methods. Also, anything that is unable to be serialized with pickle will run into issues here as well. Removing all of the copy.deepcopy(nb) calls fixes this issue.

The removal of these can be done safely because the deepcopy currently has no useful behavior; just looking at the code surrounding these calls makes it somewhat apparent. All tests are currently passing on my machine with the removal of these calls to deepcopy, so there would have to be some undocumented use-cases with no matching tests if there are any issues here.

@surdouski surdouski force-pushed the remove-deepcopy-branch branch 2 times, most recently from 8c3a32f to 6966008 Compare August 11, 2022 04:09
Copy link
Member

@rohitsanj rohitsanj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@codecov
Copy link

codecov bot commented Aug 12, 2022

Codecov Report

Merging #673 (6966008) into main (98013f0) will increase coverage by 0.03%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #673      +/-   ##
==========================================
+ Coverage   91.72%   91.75%   +0.03%     
==========================================
  Files          17       17              
  Lines        1583     1589       +6     
==========================================
+ Hits         1452     1458       +6     
  Misses        131      131              
Impacted Files Coverage Δ
papermill/engines.py 98.37% <100.00%> (+0.09%) ⬆️
papermill/execute.py 100.00% <100.00%> (ø)
papermill/parameterize.py 97.91% <100.00%> (-0.05%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9f02383...6966008. Read the comment docs.

@rohitsanj rohitsanj merged commit d699e2a into nteract:main Aug 12, 2022
@edublancas
Copy link
Contributor

Currently, notebooks that contain cells or parameters with objects without a dict require some heavy object/class patching of their magic methods. Also, anything that is unable to be serialized with pickle will run into issues here as well. Removing all of the copy.deepcopy(nb) calls fixes this issue.

Hi @surdouski , I'm interested in learning more about this use case; under which conditions a notebook contains non-serializable parameters?

@surdouski
Copy link
Contributor Author

@edublancas Since I won't be updating papermill until there is a new release on pip (or until I decide to build a wheel myself, but I am lazy), you can take a look at this directory of my repo where I implement any patching needed. Some of it is for copy.deepcopy related things and some of it is json.loads/dumps related things. Since I've gotten this taken care of in papermill, the next PR's (when I get around to it) will probably be on nbformat -- related to the usage of json (I might get around to this, I might not, idk).

The reason that I need to patch things is because pythonnet doesn't set attributes on the objects, which means the call to __dict__ raises an exception. In addition, I require custom serialization/deserialization patterns for certain .NET objects such as things containing a Uri.

If you have any specific questions, let me know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants