[JENKINS-37984] Matrix throws "Method code too large! error" on #355

Merged
bitwiseman merged 38 commits into jenkinsci:master from the matrix/method-size branch
Oct 30, 2019

Conversation

Contributor

@bitwiseman bitwiseman commented Oct 11, 2019

  • JENKINS issue(s):
  • Description:
    There are three main goals for this PR:
    • Fix the 250-item limit for stages, parallel, and matrix sections
    • Reduce occurrence of "Method code too large" errors (caused by single methods exceeding the 64k bytecode-per-method limit, often the script initialization method) by splitting large methods into multiple smaller ones
    • Reduce occurrence of "Class too large" errors (caused by exceeding the 64k constants-per-class limit) by aggressively moving as much code as possible out of the script class into generated classes

It generally achieves all these goals for both matrix and non-matrix scenarios, with the caveat that when def variables are present it falls back to using more closures making it less effective at avoiding "Method code too large" errors.
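
To illustrate the splitting idea from the description, here is a minimal sketch, not the plugin's actual transformer. It assumes the oversized method body can be treated as a flat list of independent statements; the class name `MethodSplitSketch`, the `partN` method names, and the chunk size are all hypothetical. It only emits source text for the helper methods, to show how one large body becomes several small ones.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; this is not code from RuntimeASTTransformer. */
final class MethodSplitSketch {

    /**
     * Splits a flat list of statements into chunks of at most maxPerMethod
     * and returns the source text of one small helper method per chunk.
     */
    static List<String> split(List<String> statements, int maxPerMethod) {
        if (maxPerMethod < 1) {
            throw new IllegalArgumentException("maxPerMethod must be >= 1");
        }
        List<String> methods = new ArrayList<>();
        for (int start = 0; start < statements.size(); start += maxPerMethod) {
            int end = Math.min(start + maxPerMethod, statements.size());
            StringBuilder body = new StringBuilder();
            // Helper methods are numbered in the order they are generated.
            body.append("static void part").append(methods.size()).append("() {\n");
            for (String statement : statements.subList(start, end)) {
                body.append("    ").append(statement).append('\n');
            }
            body.append("}\n");
            methods.add(body.toString());
        }
        return methods;
    }

    public static void main(String[] args) {
        List<String> statements = new ArrayList<>();
        for (int i = 0; i < 7; i++) {
            statements.add("stage" + i + "();");
        }
        // Seven statements with a limit of three per method yield three helpers.
        for (String method : split(statements, 3)) {
            System.out.print(method);
        }
    }
}
```

The per-method size limit applies to each generated helper separately, which is why smaller methods avoid the error even though the total amount of code is unchanged.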

Tests have been added to cover the new scenarios, but not all of them can run in CI without timing out.

  • Documentation changes:
    • None. Transparent to user, just makes life better.
  • Users/aliases to notify:
    • @jenkinsci/pipeline-model-definition-plugin-developers

Member

@dwnusbaum dwnusbaum left a comment

I think the failing tests are because the new classes/methods you are generating are not being CPS-transformed, causing the sleep steps to not work correctly in the tests, but I'm not totally sure.

@@ -837,7 +899,7 @@ class RuntimeASTTransformer {
// TODO: Do I need to create a new ModelASTStage each time? I don't think so.
String name = "Matrix - " + cellLabels.join(", ")

-        return ctorX(ClassHelper.make(Stage.class),
+        return ctorXFunction(ClassHelper.make(Stage.class),
Member

I wonder how this affects variable resolution for stuff inside of the matrix, for example if someone defines a Groovy method at the top of the Jenkinsfile (not recommended, but it happens), now that the matrix stages are generated inside of a static method in a separate class, will those methods still be resolvable?

Contributor Author

Good question. Tried it manually and it worked.

@abayer
Is there a test for this scenario? (The only downside to extensive testing is not knowing where all the tests are.)

Contributor Author

There are tests for this and they caught the problem but I'm not sure that they cover all the cases.

Member

FWIW, I think it would be a good idea to add a system property/public static non-final variable to disable the new behavior and force the old code path in case it causes issues for existing Declarative users in scenarios that aren't covered by tests. Something like this in RuntimeASTTransformer:

@SuppressFBWarnings(warningId="the real code for the non-final static vars are bad warning", justification="For access from script console")
public static boolean FORCE_SINGLE_EXPRESSION_TRANSFORMATION = Boolean.getBoolean(RuntimeASTTransformer.class.getName() + ".FORCE_SINGLE_EXPRESSION_TRANSFORMATION");

And then inside of ctorXFunction and friends you look at FORCE_SINGLE_EXPRESSION_TRANSFORMATION (or whatever you want to call it) and fall back to ctorX if it is true.
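
A minimal sketch of that fallback pattern, assuming nothing about the real method signatures: here `ctorX` and `ctorXFunction` are hypothetical stand-ins that return descriptive strings instead of AST nodes, and the class name `TransformToggleSketch` is invented for the example. Only `Boolean.getBoolean` and the non-final static field mirror the suggestion above.

```java
/** Hypothetical illustration of a system-property escape hatch; not the plugin's code. */
final class TransformToggleSketch {

    // Property name derived from the class name, as in the suggestion above.
    static final String PROPERTY =
            TransformToggleSketch.class.getName() + ".FORCE_SINGLE_EXPRESSION_TRANSFORMATION";

    // Deliberately non-final so it can also be flipped at runtime (for example from a script console).
    public static boolean FORCE_SINGLE_EXPRESSION_TRANSFORMATION = Boolean.getBoolean(PROPERTY);

    /** Stand-in for the old code path: build the constructor call inline. */
    static String ctorX(String type) {
        return "inline constructor call for " + type;
    }

    /** Stand-in for the new code path: falls back to ctorX when the flag is set. */
    static String ctorXFunction(String type) {
        if (FORCE_SINGLE_EXPRESSION_TRANSFORMATION) {
            return ctorX(type);
        }
        return "call to a generated method that builds " + type;
    }

    public static void main(String[] args) {
        System.out.println(ctorXFunction("Stage"));
        FORCE_SINGLE_EXPRESSION_TRANSFORMATION = true;
        System.out.println(ctorXFunction("Stage"));
    }
}
```

Reading the flag on every call, rather than caching the decision, is what lets a runtime change take effect without a restart.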

Contributor Author

@dwnusbaum
Solid idea. Done.

@bitwiseman bitwiseman changed the title from Matrix/method size to [JENKINS-37984] Matrix throws "Method code too large! error" on Oct 15, 2019
To resolve closures in the way Pipeline requires, we had to move to instantiated classes.
CpsScript is written in Java. For some reason, writing RuntimeContainerBase
in Groovy resulted in a build error that I couldn't track down to fix.

Moving to Java made it go away.
Turned off a few tests that overlapped and switched another
to ensure matrix still works with script splitting
@jglick
Member

jglick commented Oct 24, 2019

at some point @abayer switched…

This was #174, for reference.

How would evaluate help here?

Bear in mind that I only dimly understand what this PR is even doing, or how the original code works for that matter. My understanding is that (hand-waving here) you are taking a Declarative script and transforming it into an actual executable program, and that in this PR some of the code blocks are being moved to separate JVM classes without otherwise changing behavior. So the straightforward way to do something like that in Pipeline script (without making any reference to workflow-cps internals, or even exiting the sandbox) would be to replace

def somethingLong() {
  // dozens of lines here
}
somethingLong()

with, say,

def somethingLong = evaluate '''
{ ->
  // dozens of lines here
}
'''
somethingLong()

so that the closure gets compiled into a separate class, something like Script19 which will in fact be a subtype of CpsScript; see CpsGroovyShellFactory.makeConfig.

(Admittedly the above still makes for a long constant entry, but in the actual problem here, the “dozens of lines” would be computed at runtime somehow.)

what exactly does it open up that isn't already open?

It is hard to say, but by directly using this type from workflow-cps which was never intended for outside consumption, you are entering dangerous territory.

I note that none of the newly introduced tests seem to use RestartableJenkinsRule, so it is not even clear that the complicated script persistence mechanisms in workflow-cps will be engaged.

I understand that there are plenty of votes for this issue, but a lot of them presumably apply to users of Scripted, who will not be helped by this. Are there really so many people with massive Declarative scripts that trigger the limit to justify such a high-risk change?

@jglick
Member

jglick commented Oct 24, 2019

On the other hand, this plugin is already horribly intertwined with workflow-cps internals it seems. My real recommendation at a high level is that the current execution mode of Declarative be pretty much frozen and deprecated; and that a separate mode be introduced which runs without low-level access to workflow-cps, using an interpreter rather than code generation (like a very simplified variant of the plugin prior to #174), but which does not support script blocks or any code outside pipeline, libraries, or any GString interpolation beyond simple variable substitution.

@bitwiseman
Contributor Author

Are there really so many people with massive Declarative scripts that trigger the limit to justify such a high-risk change?

The way this was found was during matrix development. In that scenario, yes, it is easy to create declarative pipelines that encounter this.

@bitwiseman bitwiseman force-pushed the matrix/method-size branch 2 times, most recently from d663606 to 0c0bb40 Compare October 25, 2019 01:20
Most tests will still run with it turned on, but customers will have to enable it manually.
@bitwiseman
Contributor Author

@jglick
A question for you:
I was able to get this working without extending CpsScript. Jenkins infra is borked, but I've tested it locally.

However, on reviewing with @dwnusbaum I ran into another issue. When pausing, the generated classes are serialized. When they are deserialized, I don't believe they retain the reference to the current WorkflowScript. I'm still attempting to verify that: the classes are currently only called very early in the run, basically as the Declarative pipeline run starts. So, the current pause restart test works and I've also verified manually with some more involved scenarios that restart works, but it could be

My question: is there an event or other API that I can hook into to get a reference to the current instance of WorkflowScript? I could have every generated method on the generated classes pass a reference to it, but ... ew.

@jglick
Member

jglick commented Oct 29, 2019

In that scenario, yes, it is easy to create declarative pipelines that encounter this.

Because of the exponential combination of axes I suppose? This plugin is generating code (like a stage) for each combination?

is there an event or other API that I can hook into to get a reference to the current instance of WorkflowScript?

Not that I know of. Normally if a class needs a reference to the root script (and it might not), it is either held as a strong reference and thus serialized as part of the program state (recall that Java serialization, and JBoss too, handles cycles), or passed in as an argument to particular methods. IIRC both are documented idioms when using src/ in libraries.

Bear in mind that I can just barely follow how program resumption works in Scripted when using nothing out of the ordinary (for example, only stock sandbox calls and steps). If you are talking about programmatically generating classes, I cannot make any promises.

@dwnusbaum
Member

it is either held as a strong reference and thus serialized as part of the program state (recall that Java serialization, and JBoss too, handles cycles)

Oops, forgot that cycles are handled, sorry for the misleading info yesterday @bitwiseman. I guess in that case, going by the in-progress code you showed me yesterday, I would make the field that holds the script non-transient again and see if that works (not sure if you are saying you already tried that and it didn't work, or something else).
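
The difference between a strong and a transient back-reference can be sketched with plain `java.io` serialization. This is an illustration under assumptions: Pipeline persists program state through its own mechanism, and `RootScript`, `KeptReference`, and `DroppedReference` are hypothetical stand-ins for WorkflowScript and the generated classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical illustration using plain Java serialization; not workflow-cps code. */
final class SerializationCycleSketch {

    /** Stand-in for the root script object. */
    static class RootScript implements Serializable {
        private static final long serialVersionUID = 1L;
        KeptReference kept;
        DroppedReference dropped;
    }

    /** Holds the root through a normal (non-transient) field, forming a cycle. */
    static class KeptReference implements Serializable {
        private static final long serialVersionUID = 1L;
        RootScript script;
    }

    /** Holds the root through a transient field, which is skipped on write. */
    static class DroppedReference implements Serializable {
        private static final long serialVersionUID = 1L;
        transient RootScript script;
    }

    static RootScript build() {
        RootScript root = new RootScript();
        root.kept = new KeptReference();
        root.kept.script = root;
        root.dropped = new DroppedReference();
        root.dropped.script = root;
        return root;
    }

    /** Serializes to a byte array and reads the object graph back. */
    static RootScript roundTrip(RootScript root) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(root);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (RootScript) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        RootScript copy = roundTrip(build());
        System.out.println("non-transient field points back to root: " + (copy.kept.script == copy));
        System.out.println("transient field is null after restore: " + (copy.dropped.script == null));
    }
}
```

The cycle (root to helper and back to root) is written once and restored as the same object identity, so the non-transient variant needs no extra hook to find its script after a resume.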

@bitwiseman
Contributor Author

Because of the exponential combination of axes I suppose? This plugin is generating code (like a stage) for each combination?

Well, it does when this change is not enabled. With this it can reuse items from variables.

I would make the field that holds the script non-transient again and see if that works (not sure if you are saying you already tried that and it didn't work, or something else).

Ah, of course, why didn't I think of that. Yay!

This eliminates the only security concern for this change.
@bitwiseman
Contributor Author

Built and tested locally:

[INFO] Results:
[INFO]
[WARNING] Tests run: 1418, Failures: 0, Errors: 0, Skipped: 19

Comment on lines +830 to +834
// With script splitting stagesExpression is a method that creates a new stages expression at runtime
// If disabled, we still want to make this a script closure call so we can reuse it instead of generating code for every cell
if (!SCRIPT_SPLITTING_TRANSFORMATION) {
    stagesExpression = wrapper.asWrappedScriptContextVariable(stagesExpression, true)
}
Contributor Author

@jglick Per your comment, this forces the use of a single expression (method or a closure), which is then called for each cell instead of generating code for each cell.
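
As a rough sketch of why reuse matters: the number of cells is the product of the axis sizes, so generating code per cell grows quickly, while a single shared expression is defined once and only called per cell. Everything here is hypothetical (`MatrixCellSketch`, the `key=value` label format, the axis names); only the "Matrix - " name prefix comes from the diff above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Hypothetical illustration of one shared factory called per matrix cell. */
final class MatrixCellSketch {

    /** Expands axes into every combination of values (pass a LinkedHashMap to keep axis order). */
    static List<Map<String, String>> cells(Map<String, List<String>> axes) {
        List<Map<String, String>> result = new ArrayList<>();
        result.add(new LinkedHashMap<>());
        for (Map.Entry<String, List<String>> axis : axes.entrySet()) {
            List<Map<String, String>> next = new ArrayList<>();
            for (Map<String, String> partial : result) {
                for (String value : axis.getValue()) {
                    Map<String, String> cell = new LinkedHashMap<>(partial);
                    cell.put(axis.getKey(), value);
                    next.add(cell);
                }
            }
            result = next;
        }
        return result;
    }

    /** The single shared expression: builds a stage name from a cell's labels. */
    static String stageName(Map<String, String> cell) {
        return "Matrix - " + cell.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining(", "));
    }

    /** Calls the same factory for each cell instead of producing separate code per cell. */
    static List<String> buildStages(List<Map<String, String>> cells,
                                    Function<Map<String, String>, String> stageFactory) {
        return cells.stream().map(stageFactory).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, List<String>> axes = new LinkedHashMap<>();
        axes.put("OS", Arrays.asList("linux", "windows"));
        axes.put("JDK", Arrays.asList("8", "11", "17"));
        // 2 x 3 axis values give 6 cells, all built by the one stageName function.
        buildStages(cells(axes), MatrixCellSketch::stageName).forEach(System.out::println);
    }
}
```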

*
* @author Liam Newman
*/
public class RuntimeContainerBase extends Script implements Serializable {
Contributor Author

@jglick @dwnusbaum
Now extends Script instead of CpsScript. Yay!

@bitwiseman bitwiseman force-pushed the matrix/method-size branch 2 times, most recently from 86e4833 to 00eafd5 Compare October 30, 2019 01:12
Not using wrapping in this case is just too inefficient
In some change along the way, I made asScriptContextVariable return an external method instead.
Now that it is fixed, I can use less closure wrapping than before.
@bitwiseman bitwiseman merged commit c9d32ce into jenkinsci:master Oct 30, 2019
@bitwiseman bitwiseman deleted the matrix/method-size branch October 30, 2019 18:25