-
-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pre and post treatment of expression when cse=True in lambdify #26463
base: master
Are you sure you want to change the base?
Conversation
✅ Hi, I am the SymPy bot. I'm here to help you write a release notes entry. Please read the guide on how to write release notes. Your release notes are in good order. Here is what the release notes will look like:
This will be added to https://github.com/sympy/sympy/wiki/Release-Notes-for-1.13. Click here to see the pull request description that was parsed.
|
Hi, thanks for preparing this. I have a few questions:
|
Average on a 100 executions : When there are Derivatives, I'm having trouble testing it, I tried :
I get the error :
It seems that the derivatives aren't dummyfied in the matrix, this isn't a result of my changes since my changes are only applied when cse is True. Maybe I'm not using this right, otherwise, this could be another issue. |
Benchmark results from GitHub Actions Lower numbers are good, higher numbers are bad. A ratio less than 1 Significantly changed benchmark results (PR vs master) Significantly changed benchmark results (master vs previous release) | Change | Before [2487dbb5] | After [e89ee937] | Ratio | Benchmark (Parameter) |
|----------|----------------------|---------------------|---------|----------------------------------------------------------------------|
| - | 71.9±3ms | 44.1±0.4ms | 0.61 | integrate.TimeIntegrationRisch02.time_doit(10) |
| - | 67.3±1ms | 43.0±0.3ms | 0.64 | integrate.TimeIntegrationRisch02.time_doit_risch(10) |
| + | 18.8±0.2μs | 30.1±0.4μs | 1.6 | integrate.TimeIntegrationRisch03.time_doit(1) |
| - | 5.49±0.05ms | 2.87±0.04ms | 0.52 | logic.LogicSuite.time_load_file |
| - | 74.2±0.5ms | 29.1±0.2ms | 0.39 | polys.TimeGCD_GaussInt.time_op(1, 'dense') |
| - | 26.0±0.1ms | 17.1±0.06ms | 0.66 | polys.TimeGCD_GaussInt.time_op(1, 'expr') |
| - | 73.8±0.5ms | 29.1±0.2ms | 0.39 | polys.TimeGCD_GaussInt.time_op(1, 'sparse') |
| - | 260±1ms | 126±0.2ms | 0.48 | polys.TimeGCD_GaussInt.time_op(2, 'dense') |
| - | 258±1ms | 126±0.3ms | 0.49 | polys.TimeGCD_GaussInt.time_op(2, 'sparse') |
| - | 665±4ms | 376±2ms | 0.57 | polys.TimeGCD_GaussInt.time_op(3, 'dense') |
| - | 664±4ms | 377±2ms | 0.57 | polys.TimeGCD_GaussInt.time_op(3, 'sparse') |
| - | 505±4μs | 285±0.8μs | 0.56 | polys.TimeGCD_LinearDenseQuadraticGCD.time_op(1, 'dense') |
| - | 1.77±0.02ms | 1.07±0ms | 0.6 | polys.TimeGCD_LinearDenseQuadraticGCD.time_op(2, 'dense') |
| - | 5.83±0.03ms | 3.05±0.01ms | 0.52 | polys.TimeGCD_LinearDenseQuadraticGCD.time_op(3, 'dense') |
| - | 457±2μs | 230±2μs | 0.5 | polys.TimeGCD_QuadraticNonMonicGCD.time_op(1, 'dense') |
| - | 1.48±0ms | 670±8μs | 0.45 | polys.TimeGCD_QuadraticNonMonicGCD.time_op(2, 'dense') |
| - | 4.98±0.04ms | 1.67±0.01ms | 0.34 | polys.TimeGCD_QuadraticNonMonicGCD.time_op(3, 'dense') |
| - | 382±1μs | 208±1μs | 0.54 | polys.TimeGCD_SparseGCDHighDegree.time_op(1, 'dense') |
| - | 2.44±0.01ms | 1.25±0.01ms | 0.51 | polys.TimeGCD_SparseGCDHighDegree.time_op(3, 'dense') |
| - | 10.4±0.09ms | 4.44±0.03ms | 0.43 | polys.TimeGCD_SparseGCDHighDegree.time_op(5, 'dense') |
| - | 366±4μs | 170±0.3μs | 0.46 | polys.TimeGCD_SparseNonMonicQuadratic.time_op(1, 'dense') |
| - | 2.53±0.01ms | 923±3μs | 0.36 | polys.TimeGCD_SparseNonMonicQuadratic.time_op(3, 'dense') |
| - | 9.75±0.03ms | 2.67±0.02ms | 0.27 | polys.TimeGCD_SparseNonMonicQuadratic.time_op(5, 'dense') |
| - | 1.04±0.01ms | 428±5μs | 0.41 | polys.TimePREM_LinearDenseQuadraticGCD.time_op(3, 'dense') |
| - | 1.74±0.01ms | 512±6μs | 0.29 | polys.TimePREM_LinearDenseQuadraticGCD.time_op(3, 'sparse') |
| - | 5.87±0.06ms | 1.79±0.01ms | 0.31 | polys.TimePREM_LinearDenseQuadraticGCD.time_op(5, 'dense') |
| - | 8.50±0.1ms | 1.48±0.01ms | 0.17 | polys.TimePREM_LinearDenseQuadraticGCD.time_op(5, 'sparse') |
| - | 286±2μs | 64.7±0.6μs | 0.23 | polys.TimePREM_QuadraticNonMonicGCD.time_op(1, 'sparse') |
| - | 3.45±0.05ms | 396±3μs | 0.11 | polys.TimePREM_QuadraticNonMonicGCD.time_op(3, 'dense') |
| - | 4.01±0.01ms | 279±2μs | 0.07 | polys.TimePREM_QuadraticNonMonicGCD.time_op(3, 'sparse') |
| - | 7.28±0.08ms | 1.26±0.01ms | 0.17 | polys.TimePREM_QuadraticNonMonicGCD.time_op(5, 'dense') |
| - | 8.95±0.07ms | 830±3μs | 0.09 | polys.TimePREM_QuadraticNonMonicGCD.time_op(5, 'sparse') |
| - | 5.01±0.04ms | 2.95±0.01ms | 0.59 | polys.TimeSUBRESULTANTS_LinearDenseQuadraticGCD.time_op(2, 'sparse') |
| - | 12.2±0.07ms | 6.60±0.03ms | 0.54 | polys.TimeSUBRESULTANTS_LinearDenseQuadraticGCD.time_op(3, 'dense') |
| - | 22.6±0.2ms | 8.94±0.02ms | 0.4 | polys.TimeSUBRESULTANTS_LinearDenseQuadraticGCD.time_op(3, 'sparse') |
| - | 5.26±0.02ms | 862±3μs | 0.16 | polys.TimeSUBRESULTANTS_QuadraticNonMonicGCD.time_op(1, 'sparse') |
| - | 12.8±0.1ms | 6.97±0.01ms | 0.54 | polys.TimeSUBRESULTANTS_QuadraticNonMonicGCD.time_op(2, 'sparse') |
| - | 104±1ms | 26.1±0.1ms | 0.25 | polys.TimeSUBRESULTANTS_QuadraticNonMonicGCD.time_op(3, 'dense') |
| - | 168±0.9ms | 53.0±0.3ms | 0.32 | polys.TimeSUBRESULTANTS_QuadraticNonMonicGCD.time_op(3, 'sparse') |
| - | 173±2μs | 111±1μs | 0.64 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(1, 'dense') |
| - | 363±6μs | 216±1μs | 0.59 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(1, 'sparse') |
| - | 4.32±0.02ms | 833±4μs | 0.19 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(3, 'dense') |
| - | 5.31±0.04ms | 381±2μs | 0.07 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(3, 'sparse') |
| - | 19.9±0.05ms | 2.82±0.01ms | 0.14 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(5, 'dense') |
| - | 23.2±0.3ms | 617±3μs | 0.03 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(5, 'sparse') |
| - | 482±4μs | 135±2μs | 0.28 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(1, 'sparse') |
| - | 4.84±0.02ms | 619±3μs | 0.13 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(3, 'dense') |
| - | 5.48±0.05ms | 137±1μs | 0.03 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(3, 'sparse') |
| - | 13.7±0.1ms | 1.29±0.01ms | 0.09 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(5, 'dense') |
| - | 14.3±0.09ms | 145±10μs | 0.01 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(5, 'sparse') |
| - | 136±0.5μs | 75.1±0.2μs | 0.55 | solve.TimeMatrixOperations.time_rref(3, 0) |
| - | 254±1μs | 87.8±0.2μs | 0.35 | solve.TimeMatrixOperations.time_rref(4, 0) |
| - | 24.5±0.1ms | 10.2±0.03ms | 0.42 | solve.TimeSolveLinSys189x49.time_solve_lin_sys |
| - | 28.2±0.3ms | 15.3±0.06ms | 0.54 | solve.TimeSparseSystem.time_linsolve_Aaug(20) |
| - | 55.0±0.4ms | 25.1±0.2ms | 0.46 | solve.TimeSparseSystem.time_linsolve_Aaug(30) |
| - | 28.0±0.2ms | 15.2±0.08ms | 0.54 | solve.TimeSparseSystem.time_linsolve_Ab(20) |
| - | 54.0±0.3ms | 24.7±0.05ms | 0.46 | solve.TimeSparseSystem.time_linsolve_Ab(30) |
Full benchmark results can be found as artifacts in GitHub Actions |
You might consider using |
I looked into it, but I'm not sure if it would be compatible for dummyfying different derivatives, from what I've seen, it can only replace atoms like x, y, ... And a derivative has atoms as arguments, so I don't think it's an atom, but please tell me if I'm wrong, it would be more practical I've detected another problem, cses can also replace combinations of the replaced Derivatives in the pre-treatment, I've solved this issue, I'm in the process of building a test with some matrix, I will push the updates soon. |
About the time of execution, in this new case(with more arguments ans more instances of Derivatives to replace in the expression) :
(replaced time of execution, the formers were false, it was longer because of print) Another example (that wasn't added as a test in the test file) :
Here :
If you want me to do more complicated tests or if you have any other question, please let me know |
To test the effects of slowdown you need something like |
I've done several tests to see the duration of execution.
Here, activating cse reduces the time of execution by half with the implementation. For
Here, activating cse reduces the time of execution by 63% with the implementation. For
Here, activating cse reduces the time of execution by 72% with the implementation. If you want to try it for youselves, I've put the tests I've used here (the execution time should vary depending on the user's computer and sorry Ifor the length of the tes's code , I could have made it shorter) Do you want me to test this with a larger n for |
Sorry if I misstated. The slowdown I'm interested in is the timing of lambdify(). If we do pre and post substitutions, there is a slowdown associated with executing the lambdify function. It will only be seen on very large expressions that are full of |
Are you saying that your modification makes lambdify faster? |
x0*l1*m2*(sin(q1)*sin(q2) + cos(q1)*cos(q2))], | ||
[-l1*m2*cos(q2), | ||
x0*l1*m2*(sin(q1)*sin(q2) + cos(q1)*cos(q2)), | ||
l1**2*m2]]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These expressions would be more appropriate if the u
values were all swapped with derivatives: u1 = Derivative(q1(t) t)
, for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
u values are : u1, u2 =q1.diff(l1), q2.diff(m0)
, so already derivatives, did you mean putting more of them in the expression of the matrix? Or do you mean putting Derivatives directly in the matrix? If so, I haven't handled the case were the expression of a Derivative is directly in the expression and not also as an argument, I will add it
No problem, I wasn't sure sorry! I'll do that today(so max in 8 hours) and post the new tests results, but I won't be able to compare to before the implementation(since it didn't work, although I could compare to the execution time of expressions without Derivative terms)
It seems counterintuitive, I still wanted to add the results since the goal was to compare, but the fact that the execution is faster with the implementation might be due to other processes that were running on my computer (I didn't execute anything else, but I don't see any other explanation since the kane matrix didn't have any Derivative so didn't need modifications by the pre treatment) |
Ideally a fix would work with any types of SymPy objects where this may occur. |
No apologies are needed because you have not done anything wrong and there are no time limits so please take whatever time you need/desire. |
And
With this, I get the original error, so ideally, a solution for this case should be implemented as well. I checked and the pre and post operations did not change the expression. And
Worked correctly For other classes of objects, I think my solution would work, we would just need to add to the isinstance line 238 in lambdify.py all the other classes we want to mask, but it should be tested. Although, concerning the integral, I tried to pass it as an argument (without cse activated) and I got an error, but maybe I didn't write the command lines correctly.
Thank you |
I did the new tests, here are the results on average for a 100 executions :
Execution with cse is almost the same as the one without.
Execution with cse is 0.0003% shorter.
Execution with cse is 0.01% longer.
Execution with cse is 0.003% shorter. The percentage (updated with the new changes to accomodate lists) is really small so I think on average the execution time with and without cse are equivalent but if you want to execute yourselves the tests I've done and measure the average time spent per execution, they are here I changed and measured the time of execution with different Matrices(or list of lists here) similar to the kane built-in(the built-in function didn't take Derivatives as arguments so I made one similar in dimensions) Also, one of the test before the last commit failed because some tests sent tuples as expr to lambdify, so I updated the description on the function : it also takes tuple type of expressions as expr |
Before, when there were Derivatives in expr ans args given to lambdify with cse enabled, there was an error because the cse treatment changed the arguments of the Derivative object. With this implementation, the expression is pre-treated by a function to mask the instances of Derivative objects, then the cse process is applied and finally we do a post-treatment to put back the Derivative expressions in expr.
Rebased with more explicit commit message |
Fixes #26404
These additions to the lambdify file fixes the problems seen in the issue mentionned above.
What is changed
When lambdify is called and the optionnal parameter cse==True, then we check if there are Derivatives in the arguments of the function. If so, they are replaced by variable names to prevent them from being ignored later in the cse process (see issue page here to see the problem). After this and the cse process is applied, we do a post process : we change back the Derivatives replaced at the beginning to their original values in the expression and in the cse replacements (ie the Derivatives).
To do this, I have added two functions in lambdify.py : pre_treatment_cse(args, expr) (line 181 in lambdify.py) and post_treatment_cse(dictionnary, args, expr, cses) (line 247 in lambdify.py)
I have also added new tests in the test_lambdify.py file. (end of the file)
If you have any questions, would like more comments in the code.... , please let me know!
Release Notes