Web tool to count LLM tokens (GPT, Claude, Llama, ...)
Updated Jun 12, 2024 - TypeScript
A grammar describes the syntax of a programming language and may be defined in Backus-Naur form (BNF). A lexer performs lexical analysis, turning text into tokens. A parser takes tokens and builds a data structure such as an abstract syntax tree (AST); the parser is concerned with context: does the sequence of tokens fit the grammar? A compiler combines a lexer and parser (together with later stages such as code generation), built for a specific grammar.
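The lexer/parser split described above can be sketched in a few lines. This is a minimal illustration (all names are hypothetical): a lexer turns an arithmetic expression into tokens, and a recursive-descent parser builds a nested-tuple AST for the grammar `expr := term (('+'|'-') term)*`, `term := NUM (('*'|'/') NUM)*`.

```python
import re

TOKEN_RE = re.compile(r"\s*(?:(\d+)|(.))")

def lex(text):
    """Lexical analysis: text -> list of (kind, value) tokens."""
    tokens = []
    for num, op in TOKEN_RE.findall(text):
        tokens.append(("NUM", int(num)) if num else ("OP", op))
    return tokens

def parse(tokens):
    """Parsing: tokens -> AST of nested tuples like ('+', 1, ('*', 2, 3))."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else (None, None)

    def term():
        nonlocal pos
        _, value = tokens[pos]; pos += 1
        node = value
        while peek() in (("OP", "*"), ("OP", "/")):
            op = tokens[pos][1]; pos += 1
            _, rhs = tokens[pos]; pos += 1
            node = (op, node, rhs)
        return node

    def expr():
        nonlocal pos
        node = term()
        while peek() in (("OP", "+"), ("OP", "-")):
            op = tokens[pos][1]; pos += 1
            node = (op, node, term())
        return node

    return expr()

print(parse(lex("1 + 2 * 3")))  # ('+', 1, ('*', 2, 3))
```

Because `term` is called before the `+`/`-` loop, multiplication binds tighter than addition, which is how precedence falls out of the grammar rather than out of special-case code.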
Parser Building Toolkit for JavaScript
Chartokenizer is a Python package for basic character-level tokenization. It provides functionality to generate a character-to-index mapping for tokenizing strings at the character level. This can be useful in various natural language processing (NLP) tasks where text data needs to be preprocessed for analysis or modeling. 🚀
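Character-level tokenization with a character-to-index mapping, as described above, can be sketched as follows (illustrative only; Chartokenizer's actual API may differ):

```python
def build_char_index(text):
    """Build a character-to-index mapping from the characters seen in text."""
    return {ch: i for i, ch in enumerate(sorted(set(text)))}

def encode(text, index):
    """Turn a string into a list of integer token ids, one per character."""
    return [index[ch] for ch in text]

vocab = build_char_index("hello")
print(vocab)                   # {'e': 0, 'h': 1, 'l': 2, 'o': 3}
print(encode("hello", vocab))  # [1, 0, 2, 2, 3]
```

Such per-character vocabularies are tiny and never hit out-of-vocabulary words, at the cost of much longer token sequences than subword schemes like BPE.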
An Integrated Corpus Tool With Multilingual Support for the Study of Language, Literature, and Translation
Persian NLP Toolkit
⛄ Possibly the smallest Lua compiler ever
Implementation of LLM ✨from scratch✨
Tools and resources for the computational processing of Nheengatu (Modern Tupi)
DOM-aware tokenization for Hugging Face language models
An elegant Math Parser written in Lua, featuring support for adding custom operators and functions
Oxide is a hybrid database and streaming messaging system (think Kafka + MySQL), supporting data access via REST and SQL.
DadmaTools is a Persian NLP toolkit developed by Dadmatech Co.
A multilingual morphological analysis library.
OpenShield is a firewall designed for AI models.
A Python package to train your own BPE tokenizer for LLMs (supports regex patterns and special tokens)
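The core of the BPE algorithm that the package above refers to can be sketched briefly (illustrative only, not that package's actual API): start from raw bytes and repeatedly merge the most frequent adjacent pair of tokens into a new token id.

```python
from collections import Counter

def most_frequent_pair(ids):
    """Return the most common adjacent pair of token ids."""
    return Counter(zip(ids, ids[1:])).most_common(1)[0][0]

def merge(ids, pair, new_id):
    """Replace every occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out

def train_bpe(text, num_merges):
    """Learn `num_merges` BPE merge rules over the UTF-8 bytes of text."""
    ids = list(text.encode("utf-8"))   # byte values occupy ids 0..255
    merges = {}
    for step in range(num_merges):
        pair = most_frequent_pair(ids)
        new_id = 256 + step            # new tokens start above the byte range
        ids = merge(ids, pair, new_id)
        merges[pair] = new_id
    return ids, merges
```

For example, one merge over `"aaabdaaabac"` fuses the most frequent pair `(97, 97)` (the bytes for `"aa"`) into token 256. Production trainers add the regex pre-splitting and special-token handling mentioned in the description; this sketch omits both.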