docs: Custom Parsers cleanup/expansion #16887

Merged
merged 8 commits into from
Mar 6, 2023
Changes from 2 commits
100 changes: 64 additions & 36 deletions docs/src/extend/custom-parsers.md
@@ -8,63 +8,54 @@ eleventyNavigation:

---

If you want to use your own parser and provide additional capabilities for your rules, you can specify your own custom parser. If a `parseForESLint` method is exposed on the parser, this method will be used to parse the code. Otherwise, the `parse` method will be used. Both methods should take in the source code as the first argument, and an optional configuration object as the second argument (provided as `parserOptions` in a config file). The `parse` method should simply return the AST. The `parseForESLint` method should return an object that contains the required property `ast` and optional properties `services`, `scopeManager`, and `visitorKeys`.
ESLint custom parsers let you extend ESLint to support linting new non-standard JavaScript language features or custom syntax in your code. A parser is responsible for taking your code and transforming it into an abstract syntax tree (AST) that ESLint can then analyze and lint.

* `ast` should contain the AST.
* `services` can contain any parser-dependent services (such as type checkers for nodes). The value of the `services` property is available to rules as `context.parserServices`. Default is an empty object.
* `scopeManager` can be a [ScopeManager](./scope-manager-interface) object. Custom parsers can use customized scope analysis for experimental/enhancement syntaxes. Default is the `ScopeManager` object which is created by [eslint-scope](https://github.com/eslint/eslint-scope).
* Support for `scopeManager` was added in ESLint v4.14.0. ESLint versions which support `scopeManager` will provide an `eslintScopeManager: true` property in `parserOptions`, which can be used for feature detection.
* `visitorKeys` can be an object to customize AST traversal. The keys of the object are the type of AST nodes. Each value is an array of the property names which should be traversed. Default is [KEYS of `eslint-visitor-keys`](https://github.com/eslint/eslint-visitor-keys#evkkeys).
* Support for `visitorKeys` was added in ESLint v4.14.0. ESLint versions which support `visitorKeys` will provide an `eslintVisitorKeys: true` property in `parserOptions`, which can be used for feature detection.
## Creating a Custom Parser

You can find an ESLint parser project [here](https://github.com/typescript-eslint/typescript-eslint).
If a `parseForESLint` method is exposed on the parser, this method will be used to parse the code. Otherwise, the `parse` method will be used. Both methods should take in the source code as the first argument, and an optional configuration object as the second argument (provided as `parserOptions` in a config file).

```json
{
"parser": "./path/to/awesome-custom-parser.js"
}
```javascript
// TODO: have a simple example here w parse and parserOptions
Contributor Author


to reviewers, would you mind providing some high level direction on what should be included here? perhaps a simple example with a bit of description?

Member


I think that the other example on that page is already quite clear. But you would like to add a new example with parse rather than parseForESLint if I understand correctly?

Contributor Author


yes, that's correct

Member


You could show just a very simple parser, like one that calls espree and prints a message to the console. This one for example will print the time required to parse each file:

const espree = require("espree");

exports.parse = function (code, options) {
    const label = `Parsing file "${options.filePath}"`;
    console.time(label);
    const ast = espree.parse(code, options);
    console.timeEnd(label);
    return ast; // Only the AST is returned.
};

As I understand it, the only difference is that parse only returns the AST, whereas parseForESLint also returns additional values that let you customize the behavior of the parser even more.

Member


> As I understand it, the only difference is that parse only returns the AST, whereas parseForESLint also returns additional values that let you customize the behavior of the parser even more.

That's correct. The usage is exactly the same. I think @fasttime's example is solid -- anything where parse is calling another parser and somehow manipulating the AST result before returning it would also be appropriate.

Contributor Author


thank you for the feedback! also TIL about console.time() 💡

```

```javascript
var espree = require("espree");
// awesome-custom-parser.js
exports.parseForESLint = function(code, options) {
return {
ast: espree.parse(code, options),
services: {
foo: function() {
console.log("foo");
}
},
scopeManager: null,
visitorKeys: null
};
};
## `parse` Return Object

```
The `parse` method should simply return the [AST](#ast-specification) object.

### `parseForESLint` Return Object
bpmutter marked this conversation as resolved.

The `parseForESLint` method should return an object that contains the required property `ast` and optional properties `services`, `scopeManager`, and `visitorKeys`.

* `ast` should contain the [AST](#ast-specification) object.
* `services` can contain any parser-dependent services (such as type checkers for nodes). The value of the `services` property is available to rules as `context.parserServices`. Default is an empty object.
* `scopeManager` can be a [ScopeManager](./scope-manager-interface) object. Custom parsers can use customized scope analysis for experimental/enhancement syntaxes. Default is the `ScopeManager` object which is created by [eslint-scope](https://github.com/eslint/eslint-scope).
* Support for `scopeManager` was added in ESLint v4.14.0. ESLint versions which support `scopeManager` will provide an `eslintScopeManager: true` property in `parserOptions`, which can be used for feature detection.
* `visitorKeys` can be an object to customize AST traversal. The keys of the object are the type of AST nodes. Each value is an array of the property names which should be traversed. Default is [KEYS of `eslint-visitor-keys`](https://github.com/eslint/eslint-visitor-keys#evkkeys).
* Support for `visitorKeys` was added in ESLint v4.14.0. ESLint versions which support `visitorKeys` will provide an `eslintVisitorKeys: true` property in `parserOptions`, which can be used for feature detection.
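
As a rough sketch of the return object and the feature detection described above: the `baseParse` argument is a stand-in for a real parser (for example `espree.parse`), and `MyCustomNode` and the `getSourceLength` service are hypothetical names used only for illustration.

```javascript
// Sketch only: `baseParse` stands in for a real parser such as espree.parse.
// `MyCustomNode` and `getSourceLength` are hypothetical names.
function createParseForESLint(baseParse) {
    return function parseForESLint(code, options = {}) {
        const result = {
            // `ast` is the only required property.
            ast: baseParse(code, options),
            services: {
                // Rules would reach this as context.parserServices.getSourceLength().
                getSourceLength: () => code.length
            }
        };

        // Return `visitorKeys` only when ESLint signals that it supports them.
        if (options.eslintVisitorKeys) {
            // A real parser would likely merge such additions with the default keys.
            result.visitorKeys = { MyCustomNode: ["body"] };
        }

        return result;
    };
}
```

Passing the base parser in as an argument keeps the sketch independent of any particular parser package; a real parser file would export the resulting function as `parseForESLint`.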

## The AST specification
## AST Specification

The AST that custom parsers should create is based on [ESTree](https://github.com/estree/estree). The AST requires some additional properties that provide detailed information about the source code.

### All nodes:
### All Nodes

All nodes must have a `range` property.

* `range` (`number[]`) is an array of two numbers. Both numbers are a 0-based index which is the position in the array of source code characters. The first is the start position of the node, the second is the end position of the node. `code.slice(node.range[0], node.range[1])` must be the text of the node. This range does not include spaces/parentheses which are around the node.
* `loc` (`SourceLocation`) must not be `null`. [The `loc` property is defined as nullable by ESTree](https://github.com/estree/estree/blob/25834f7247d44d3156030f8e8a2d07644d771fdb/es5.md#node-objects), but ESLint requires this property. On the other hand, `SourceLocation#source` property can be `undefined`. ESLint does not use the `SourceLocation#source` property.
* `loc` (`SourceLocation`) must not be `null`. [The `loc` property is defined as nullable by ESTree](https://github.com/estree/estree/blob/25834f7247d44d3156030f8e8a2d07644d771fdb/es5.md#node-objects), but ESLint requires this property. `SourceLocation#source` property can be `undefined`. ESLint does not use the `SourceLocation#source` property.
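
To make the `range` and `loc` requirements concrete, here is a minimal sketch. The helper names are hypothetical and not part of ESLint's API, and only `"\n"` is treated as a line break. It assumes the ESTree convention of 1-based lines and 0-based columns.

```javascript
// Sketch only: helper names are hypothetical, not part of ESLint's API.

// Converts a 0-based character offset into an ESTree-style position
// (1-based line, 0-based column). Only "\n" counts as a line break here.
function offsetToPosition(code, offset) {
    const lines = code.slice(0, offset).split("\n");
    return { line: lines.length, column: lines[lines.length - 1].length };
}

// Builds a non-null `loc` object for a node from its `range`.
function locFromRange(code, range) {
    return {
        start: offsetToPosition(code, range[0]),
        end: offsetToPosition(code, range[1])
    };
}

// Returns the source text that a node's `range` refers to.
function getNodeText(code, node) {
    return code.slice(node.range[0], node.range[1]);
}

const code = "var a = 1;\nvar b = (2);";

// The literal `2` on the second line: its range excludes the parentheses.
const literal = { type: "Literal", value: 2, raw: "2", range: [20, 21] };
literal.loc = locFromRange(code, literal.range);

console.log(getNodeText(code, literal)); // prints "2"
console.log(literal.loc.start);          // line 2, column 9
```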

The `parent` property of all nodes must be rewritable. ESLint sets each node's `parent` property to its parent node while traversing, before any rules have access to the AST.
The `parent` property of all nodes must be rewritable. Before any rules have access to the AST, ESLint sets each node's `parent` property to its parent node while traversing.
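
The following is a simplified sketch of why `parent` has to be writable. It is not ESLint's actual traversal code; `assignParents` is a hypothetical helper, and the `visitorKeys` argument maps a node type to the property names that hold child nodes.

```javascript
// Sketch only: a simplified traversal, not ESLint's implementation.
function assignParents(node, visitorKeys, parent = null) {
    // This assignment is the reason `parent` must not be read-only.
    node.parent = parent;

    for (const key of visitorKeys[node.type] || []) {
        const value = node[key];
        const children = Array.isArray(value) ? value : [value];

        for (const child of children) {
            // Skip null entries and anything that is not an AST node.
            if (child && typeof child.type === "string") {
                assignParents(child, visitorKeys, node);
            }
        }
    }
}
```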

### The `Program` node:
### The `Program` Node

The `Program` node must have `tokens` and `comments` properties. Both properties are an array of the below Token interface.
The `Program` node must have `tokens` and `comments` properties. Both properties are arrays of objects that conform to the `Token` interface below.

```ts
interface Token {
type: string;
loc: SourceLocation;
range: [number, number]; // See "All nodes:" section for details of `range` property.
// See the "All Nodes" section for details of the `range` property.
range: [number, number];
value: string;
}
```
@@ -74,8 +65,45 @@ interface Token {

The range indexes of all tokens and comments must not overlap with the range of other tokens and comments.
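
A parser author could check the no-overlap requirement with something like the sketch below. `findOverlap` is a hypothetical helper, not an ESLint API, and it assumes every token and comment already has a valid `range`.

```javascript
// Sketch only: `findOverlap` is a hypothetical helper, not an ESLint API.
// Returns the first pair of overlapping tokens/comments, or null if none overlap.
function findOverlap(program) {
    const items = [...program.tokens, ...program.comments]
        .sort((a, b) => a.range[0] - b.range[0]);

    for (let i = 1; i < items.length; i++) {
        // The end index is exclusive, so ranges that merely touch do not overlap.
        if (items[i].range[0] < items[i - 1].range[1]) {
            return [items[i - 1], items[i]];
        }
    }
    return null;
}
```

Sorting by start index first means only neighboring entries need to be compared.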

### The `Literal` node:
### The `Literal` Node

The `Literal` node must have a `raw` property.

* `raw` (`string`) is the source code of this literal. This is the same as `code.slice(node.range[0], node.range[1])`.

## Packaging a Custom Parser

TODO: Add info on turning into a package
Contributor Author


to reviewers, would you mind providing some high level direction on what should be included here? perhaps a simple example with a bit of description?

Member


Hehe, didn't realize we left a TODO in there.

Basically, what this section should be is how to create an npm package out of your custom parser. I imagine it would simply be:

  1. Create your npm package
  2. Ensure that main in package.json points to your parser file
  3. Publish your npm package
  4. Show how to use the npm package in a config file


## Example

For a complex example of a custom parser, refer to the [`@typescript-eslint/parser`](https://github.com/typescript-eslint/typescript-eslint/tree/main/packages/parser) source code.

A simple custom parser that logs `"foo"` to the console when it processes a node:
Member


What do we mean by "when it processes a node"?

Member


Yeah, this description is inaccurate. What this is doing is providing a context.parserServices.foo() for rules.


```javascript
// awesome-custom-parser.js
var espree = require("espree");
exports.parseForESLint = function(code, options) {
Contributor Author


question for reviewers: this syntax is a little different from what i'm used to seeing with Node.js.

to my understanding, this is equivalent to the following:

function parseForESLint(code, options){
// function definition
}

module.exports = { parseForESLint }

i think this module.exports = ... pattern is more common.

would we be open to adjusting this example to use the module.exports = ... pattern?

Member


You're correct, the two are equivalent. No objections to using the pattern you suggest.

Contributor Author


adding

return {
ast: espree.parse(code, options),
services: {
foo: function() {
console.log("foo");
}
},
scopeManager: null,
visitorKeys: null
};
};

```

Include the custom parser in an ESLint configuration file:

```js
// .eslintrc.json
{
"parser": "./path/to/awesome-custom-parser.js"
}
```