tree-sitter-hy

Tree-sitter grammar for Hy, a Lisp-ification of Python.

Quick disclaimer: this is my first attempt at writing a tree-sitter grammar, so some things may be done incorrectly or in non-standard ways. Please help me polish this project up by opening issues/PRs; thanks!

Syntax highlighting screenshots/examples

Advent of Code 2016 day 1 solution in Hy

Installation and usage with nvim-treesitter

Hy is a new-ish language, to the point where Neovim doesn't automatically recognize the .hy extension yet. On top of that, this grammar is still in an alpha stage (very incomplete, missing lots of features, under active development, etc.), so it hasn't been officially added to the nvim-treesitter repository yet. As a result, using this grammar takes a few extra lines of configuration.

  1. First, register the .hy extension so that Neovim detects it as the hy filetype:

    • Example using Neovim's Lua filetype API (requires at least Neovim 0.7):
      vim.filetype.add {
        extension = {
          hy = "hy",
        },
      }
    • Example using old-style autocommand (Vimscript):
      autocmd BufNewFile,BufRead *.hy setfiletype hy
  2. Assuming you already have nvim-treesitter installed, and have already enabled syntax highlighting with nvim-treesitter, register/declare this grammar repository in nvim-treesitter (Lua configuration; adapted from example in nvim-treesitter documentation):

    local parser_config = require "nvim-treesitter.parsers".get_parser_configs()
    parser_config.hy = {
      install_info = {
        url = "https://github.com/kwshi/tree-sitter-hy",
        files = {"src/parser.c"},
        branch = "main",
        generate_requires_npm = false,
        requires_generate_from_grammar = false,
      },
      filetype = "hy",
    }
  3. Restart Neovim, then run :TSInstall hy.

    You'll also need to manually add syntax-highlighting queries to your local installation of Neovim, since :TSInstall only installs the grammar itself, but not the queries. To do so, download the queries/ folder in this repository, and save it to a queries/ folder anywhere in your Neovim runtimepath, e.g., ~/.config/nvim. Example folder structure:

    • ~/.config/nvim
      • init.lua or init.vim, etc.
      • queries
        • hy
          • highlights.scm
  4. Open a Hy file, and syntax highlighting should (mostly) work!
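
If highlighting doesn't show up, one quick sanity check is to ask Neovim whether it can actually find the parser and the query file on its runtimepath. The snippet below is only a sketch: it assumes nvim-treesitter placed the compiled parser in a parser/ directory on the runtimepath (its usual behavior, but worth verifying on your setup), and report is just a helper name made up for this example. It uses the built-in vim.api.nvim_get_runtime_file function, and can be run with :lua or saved to a file and run with :luafile.

```lua
-- Look up files matching a pattern on Neovim's runtimepath and print the result.
-- `report` is a hypothetical helper for this example, not part of any plugin.
local function report(label, pattern)
  -- The second argument `true` asks for every match, not just the first one.
  local found = vim.api.nvim_get_runtime_file(pattern, true)
  if #found == 0 then
    print(label .. ": not found (" .. pattern .. ")")
  else
    print(label .. ": " .. table.concat(found, ", "))
  end
end

-- Assumption: the compiled parser is installed as parser/hy.<extension>.
report("hy parser", "parser/hy.*")
-- This path matches the example folder structure in step 3.
report("hy highlight queries", "queries/hy/highlights.scm")
```

If either line reports "not found", the corresponding step above (:TSInstall hy for the parser, or copying the queries/ folder for the highlights) is probably the one to revisit.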

WIP checklist

Currently developed against the Hy version 0.27.0 documentation.

The grammar and highlighting definitions are an active work in progress. They're good enough for basic syntax highlighting, but many features are missing. Based on the Hy syntax reference, here is the progress on implementing all Hy syntax features:

  • non-form syntactic elements:
    • shebang
    • whitespace (automatically, using tree-sitter default settings)
    • comments
    • discard prefix
  • identifiers:
    • numeric literals
      • integer
      • float
      • complex
    • keywords
    • dotted identifiers
      • partial support: dotted identifiers are parsed correctly, but dots-only identifiers such as . and ... are currently parsed as dotted identifiers when instead they should be special-cased as symbols.
    • symbols
      • partial support: I think the current definition in the grammar is mostly correct, but a few special cases such as dots-only identifiers are missing.
  • string literals:
    • plain strings (defined, but extremely basic and incomplete; doesn't support escapes)
    • raw strings
    • f-strings
    • bytes strings
    • bracket strings
  • sequential forms:
    • expressions
    • container literals:
      • lists
      • tuples
      • sets
      • dicts
  • additional sugar:
    • quoting/unquoting forms
  • reader macros

Syntax highlighting is determined not only by the grammar, but also by specific patterns occurring in the syntax tree. For example, (return 3) and (print 3) have the same syntactic structure but should be highlighted differently since return is a language keyword/macro, whereas print is just a built-in run-time function. These patterns are extracted/differentiated using tree-sitter queries formulated according to the Hy builtin macros API. Here is a rough progress checklist on support for these special forms/macros:

  • core macros:
    • annotate
    • let
    • variables: setv setx
      • basic support, but missing support for type-annotated setv
    • conditionals: if when cond match
    • loops: for while
    • comprehensions: gfor lfor sfor dfor
    • function definitions: fn defn
      • basic support, but defn query is missing support for optional decorator and annotation components
      • also missing support for async equivalents fn/a, defn/a, etc.
    • class definitions: defclass
      • minimal/basic support
    • operators:
      • boolean: and or not
      • indexing/key access: get in cut
      • arithmetic: + - * / // += -=
      • comparison: < <= > >= = !=
      • (lots of missing operators)
    • control-flow keywords: return yield
    • chainc
    • (and lots more)
  • Python builtins:
    • functions:
      • sum map abs range len
      • (several missing)
    • types:
      • int str bool
      • (several missing)
    • constants:
      • True False None
      • (others?)
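
To see for yourself that forms like (return 3) and (print 3) share the same syntactic structure, and therefore need queries to tell them apart, you can print the syntax tree for each one from inside Neovim. This is a rough sketch, assuming the hy parser from the installation steps above is already installed. It uses Neovim's built-in vim.treesitter.get_string_parser; tree_of is a helper name invented for this example.

```lua
-- Parse a snippet with the installed hy parser and return its syntax tree
-- as an S-expression string. `tree_of` is a hypothetical helper name.
local function tree_of(source)
  -- Assumption: a parser registered under the language name "hy" is installed.
  local parser = vim.treesitter.get_string_parser(source, "hy")
  local tree = parser:parse()[1]
  return tree:root():sexpr()
end

-- Both forms should print trees of the same shape; only the text of the
-- first symbol differs, which is what the highlight queries match on.
print(tree_of("(return 3)"))
print(tree_of("(print 3)"))
```

The node names printed here are also the ones to use when writing or extending patterns in highlights.scm.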

Lots of other things are also still missing from a standard tree-sitter grammar, e.g., other query definitions such as locals.scm, various bindings (?), etc. Please feel free to contribute and help me get this project going!