Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Push var down to first variable use when possible #5395

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

edemaine
Copy link
Contributor

@edemaine edemaine commented Jan 24, 2022

This is a portion of my typescript branch that I spent the last few days working on, implementing the plan laid out in #5377 (comment) : push the var declaration of a variable down to its first clean assignment if there is one. The result is generally cleaner code (many fewer variables declared in top var block), and much better automatic types when the code is presented to TypeScript. It's a step toward #5377.

Examples of clean assignments:

  • x = 7var x = 7
  • [x, y] = arrayvar [x, y] = array
  • {x, y} = pointvar {x, y} = point
  • [{x, y}, {a, b}] = [point, pair]var [{x, y}, {a, b}] = [point, point2];

Unexamples of unclean assignments (so var won't try to be inserted here; if this is the only assignment, var will remain at top):

  • [{x, y}, {x: x2, y: y2}] = [point, point2]
  • {x = 5} = obj
  • x += 5
  • if match = re.exec string
  • console.log x = 5

Note for reviewer: "clean" is referred to as isDeclarable in the code.

Only the first assignment to a variable is modified. The only exception is if there is a joint assignment that has some undeclared variables in addition to some already declared variables. Some examples:

CoffeeScript JavaScript output from this PR
x = 5
x = 10
var x = 5;
x = 10;
x = 5
[x, y] = array
var x = 5;
var [x, y] = array;

While the duplicate var isn't pretty, I think it's rare, and it's allowed. eslint would complain about the output, but eslint complains about all sorts of things in CoffeeScript's output; it's better to run eslint on the CS AST.

There's also special handling of for loops and ranges. (This took the longest to get right.) Examples: (comments added for explanation)

CoffeeScript JavaScript output from this PR
for x from it
  console.log x
for (var x of it) {
  console.log(x);
}
for x, y of object
  console.log x, y
for (var x in object) {
  var y = object[x];
  console.log(x, y);
}
for x in [1..100]
  console.log x
for (var i = 1, x = i; i <= 100; x = ++i) {
  console.log(x);
}
for x in array
  console.log x
for (var i = 0, len = array.length; i < len; i++) {
  var x = array[i];
  console.log(x);
}
for x, i in array
  console.log x
for (var j = 0, i = j, len = array.length; j < len; i = ++j) {
  var x = array[i];
  console.log(x);
}
squares =
  for x in array
    x ** 2
var x;  // needs to stay here so visible outside loop

var squares = (function() {
  var results = [];
  for (var i = 0, len = array.length; i < len; i++) {
    x = array[i];
    results.push(x ** 2);
  }
  return results;
})();
powers =
  for x, i in array
    x ** i
var i, x;  // needs to stay here so visible outside loop

var powers = (function() {
  var j, len;  // can't declare in for loop or would shadow i
  var results = [];
  for (j = 0, i = j, len = array.length; j < len; i = ++j) {
    x = array[i];
    results.push(x ** i);
  }
  return results;
})();

As you can see in the more complicated examples, there are some tricky issues where it's crucial to avoid shadowing by excess var. I think I got them all right; it was very hard to pass all the tests. But it's possible that more tests should be done/added.

This PR also refactors Scope to ensure all variables have positions mapping,
so we can do faster searching for variables, and a new get method for retrieving various data about a variable. This makes it easier to keep track of whether a variable has already had a var declaration during some assignment.

Also refactor Scope to ensure all variables have `positions` mapping,
so we can do faster searching for variables.
@edemaine
Copy link
Contributor Author

There are some browser tests failing, caused by this bug in babel-plugin-minify-dead-code. I guess var is tricky to implement after all. 😦

phil294 added a commit to phil294/coffeesense that referenced this pull request Jan 25, 2022
less error-prone, and now also supports more complex use cases such as loops and destructuring assignments. This was possible by switching the CoffeeScript compiler to a recent contribution by @edemaine at jashkenas/coffeescript#5395
Copy link
Collaborator

@GeoffreyBooth GeoffreyBooth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pulled down this branch and rebuilt the documentation and read those examples. The JavaScript output is much more readable! This is quite an improvement!

I reviewed this mostly for style; I almost don’t know how to review it more deeply than that. The scope and variable tracking parts of this codebase feel the most incomprehensible to me; I know they have comments explaining the logic at a surface level, but the level of documentation feels inadequate for the scale of the complexity here. Whatever extra detail that you can add now that you’ve dived deeply into it and understand it better than I do would be beneficial to future contributors and maintainers.

I merged in latest main and rebuilt, and pushed the new commits on your branch. The Node tests pass but I see 9 failures in the browser tests. I saw the note about the Babel plugin bug, but unless you want to push a PR to fix it for them we might need to find to find another solution. Maybe we can disable that plugin somehow while still minifying somehow, either still via Babel or perhaps via a different minification library?

Comment on lines +143 to +148
(v.name for v in @variables when v.type is 'var' and not v.laterVar and
switch assigned
when true then v.assigned?
when false then not v.assigned?
else true
).sort()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m not following what this inline switch is doing. This?

Suggested change
(v.name for v in @variables when v.type is 'var' and not v.laterVar and
switch assigned
when true then v.assigned?
when false then not v.assigned?
else true
).sort()
@variables.filter((v) => v.type is 'var' and not v.laterVar and
((assigned and v.assigned?) or (not assigned and not v.assigned?))
.map((v) => v.name).sort()

And why all these particular conditions?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn’t that switch just assigned == v.assigned??

Copy link
Contributor Author

@edemaine edemaine Jan 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The switch deals with three cases: assigned can be

  • true meaning "only give me assigned variables" i.e. when v.assigned?
  • false meaning "only give me unassigned variables" i.e. when not v.assigned?
  • undefined (or absent argument) meaning "give me all variables" i.e. when true.

I forget whether the undefined case is actually used; if it isn't, we could definitely simplify. But I don't think either simplification above is correct according to the 3-case spec.

A cleaner version for all 3 cases would probably be this:

(for v in @variables when v.type is 'var' and not v.laterVar
  continue if assigned? and assigned != v.assigned?
  v.name
).sort()

Thanks @GeoffreyBooth for the thorough review! I will work on responding and adding a bunch of comments.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be going a bit far to do this all in a one-liner. You could maybe have a one-liner that retrieves all variables, then something like “if give me only assigned variables, filter the collection to just those; else if only give me unassigned variables, filter the collection to just those.” Then sort whatever’s left in the collection at this point and return it. Making this a multiline function also gives you opportunities to add comments explaining each branch and why you might want each.

Comment on lines +2354 to +2356
# Can do var declaration only if all relevant variables are in this scope.
# If we declare @fromVar, @toVar, or @stepVar, they're guaranteed
# to be in scope (generated in @compileVariables), so don't check those.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comments become the annotated source, so please always use curly quotes and backticks, and proper English:

Suggested change
# Can do var declaration only if all relevant variables are in this scope.
# If we declare @fromVar, @toVar, or @stepVar, they're guaranteed
# to be in scope (generated in @compileVariables), so don't check those.
# We can output `var` declarations only if all relevant variables are in this scope.
# If we declare `@fromVar`, `@toVar`, or `@stepVar`, theyre guaranteed
# to be in scope (generated in `@compileVariables`), so dont check those.

And so on for all the comments. The logic you’re adding in this PR is quite difficult to follow (at least by me) and excessive documentation of what it’s doing would be really good to have.

o.scope.laterVar @fromVar if @fromC isnt @fromVar
o.scope.laterVar @toVar if @toC isnt @toVar
o.scope.laterVar @stepVar if @stepC isnt @stepVar
varPart = if declare then "var " else ''
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
varPart = if declare then "var " else ''
declarationPrefix = if declare then 'var ' else ''

In case this ever becomes const.

Comment on lines +2380 to +2381
#if known and not namedIndex or o.scope.laterVar idx
# varPart = "var #{varPart}"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be removed?

Comment on lines +3349 to +3350
exportDefault = @ instanceof ExportDefaultDeclaration
if not exportDefault and
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

makeDeclaration: (o) ->
# Consider adding 'var' prefix to this assignment. Only works at top level
# and if LHS is a valid declaration, and assignment is '=' not another op.
return unless o.level == LEVEL_TOP and not o.undeclarable and
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return unless o.level == LEVEL_TOP and not o.undeclarable and
return unless o.level is LEVEL_TOP and not o.undeclarable and

@@ -3596,11 +3652,13 @@ exports.Assign = class Assign extends Base
return compiledName.concat @makeCode(': '), val

answer = compiledName.concat @makeCode(" #{ @context or '=' } "), val
# Per https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Destructuring_assignment#Assignment_without_declaration,
if @declaration
[@makeCode("var "), ...answer]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
[@makeCode("var "), ...answer]
[@makeCode('var '), ...answer]

@@ -16,7 +16,6 @@ that should be output as part of variable declarations.

constructor: (@parent, @expressions, @method, @referencedVars) ->
@variables = [{name: 'arguments', type: 'arguments'}]
@comments = {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that this is wrong, but why is this removed? How are we tracking comments now?

Comment on lines +80 to +81
Gets a variable and its associated data from this scope (not ancestors),
or `undefined` if it doesn't exist.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What’s its associated data? Its properties?

Comment on lines +86 to +90
Gets the type of a variable declared in this scope,
or `undefined` if it doesn't exist.

type: (name) ->
return v.type for v in @variables when v.name is name
null
@get(name)?.type
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this isn’t code that you added, but what’s a variable’s “type”? Can we explain that in this comment?

@STRd6
Copy link
Contributor

STRd6 commented Apr 18, 2022

I'm also excited about this, anything I can do to help?

@phil294
Copy link

phil294 commented May 15, 2022

I noticed the following translation, not mentioned above or in the test cases:

[ x = '', y = '' ] = []
x = x.slice(0)

becomes

var y;

[x = '', y = ''] = [];

var x = x.slice(0);

is this a bug or intended? If I didn't read you wrong, this should have been var x, y instead.

@edemaine
Copy link
Contributor Author

edemaine commented May 15, 2022

@phil294 I can see why that would happen. The code must be interpreting var [x = '', y = ''] = []; as an invalid declaration, so it doesn't do var there, leaving x and y needing declaration elsewhere. x = x.splice(0) is a valid place to put the x declaration and so it puts it there. y doesn't have an assignment so it gets hoisted. This is all valid thanks to automatic var hoisting.

However, var [x = '', y = ''] = []; is a valid declaration, so I should tweak the test to just put the var there. Some assignments are not valid places to put var though (for example, if x = 5); in those cases, the next assignment to x will be where the var ends up. We could alternatively decide that, if the first assignment isn't a valid place for var, then the var always gets hoisted to the top -- but that's not how this branch works currently.

@phil294
Copy link

phil294 commented Sep 11, 2022

@edemaine Thanks for the clarification!


As you know, this PR branch has been in use in CoffeeSense for several months now, and indeed successfully so.

Today however, I have found an actual bug for the first time (I think). It does not happen with the normal exports, but only with the browser compiler version. Here's reproduction code, which I ran with Deno because Node/modules gave some cryptic errors which I couldn't be bothered with.

import { compile } from "./node_modules/coffeescript/lib/coffeescript-browser-compiler-modern/coffeescript.js"
let coffee = 'a = 1\n[1..a]'
let response = compile(coffee, { sourceMap: true, bare: true })
console.log(response.js)

output:

var a = 1;

(function() {
  var results = [];
  for (var i = 1; 1 <= a ? i <= a : i >= a; 1 <= a ? i++ : i--){ results.push(i); }
  return results;
}).apply(thisundefined);

note the thisundefined at the end.

Please, there is no need to investigate, this is a rare case with low priority for me or any user, I think. I just thought it might be helpful for developing this branch, should you ever get back to it - maybe it's also already covered above.

Thank you again for your efforts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants