CST - Concrete Syntax Tree. Generated by the tree-sitter parser when parsing an essence file.
AST - Abstract Syntax Tree. Generated when the CST is parsed and Model struct is created.
Syntax Errors
Surface level mistakes in the Essence format.
Lead to an invalid CST. They are caught by the tree-sitter when an Essence expression does not match any rules in the current grammar and cause the CST to have ERROR nodes.
Detected by inspecting and contextualising the CST.
1. Missing Token
Description
Expected token is absent.
Detection
MISSING node inserted or zero range node in the CST.
Example 1: Missing identifier(s)
find: bool
Example 2: Missing right operand
find x: int
such that x /
Example 3: Missing domain in tuple
find x: tuple()
2. Unexpected Token
Description
Symbol(s) that are not supposed to be there according to the grammar rule tree-sitter is matching. Can be tokens belonging to the grammar or any foreign symbols.
Detection
ERROR node wrapping the unexpected symbol(s).
Example 1: Extra ) at the end
find x: int(1..2))
Example 2: Unexpected % in implication
find x: int(1..3)
such that x -> %9
Example 3: Unexpected & inside matrix domain
find x: matrix indexed by [int, &] of int
3. Malformed Top Level Statement
Description
A line that cannot be parsed by the tree-sitter using any of the grammar rules.
Detection
ERROR node that is a direct child of the root node program and precedes the nodes if the parsed line.
Example 1: Invalid print x
find x: int(1..5)
print x
Example 2: Malformed find statement
More of an edge case. Even though the keyword find is present, tree-sitter does not parse the line using the find_statement rule since the first symbol after find cannot be parsed an identifier.
find +,a,b: int(1..3)
4. General Syntax Error
Cases that cause an error in the CST but did not fall into any of the previous categories.
Semantic Errors
These errors concern the logical context of an expression rather than its surface-level formatting. As a result, the expression may conform to the grammar and produce a valid CST, but still fail to parse into a valid AST. Such errors are detected during the phase that converts the CST into a Model struct.
1. Keywords as Identifiers
Description
Using grammar keywords (e.g find, bool) as a variable name.
Detection
Compare the variable names against the set of keywords that should not be allowed.
Example 1: Keyword bool as an identifier
find bool: bool
Example 2: Keyword letting used as an identifier
find letting,b,c: int(1..3)
2. Omitted Declaration
Description
Variable used but not declared.
Detection
Checking that the identifier used in a constraint was previously declared in a find or letting_statement.
Example: x was not declared before
find y: int (1..4)
such that x = 5
3. Invalid Domain
(not yet supported)
Description
Logically or mathematically invalid bounds.
Example: a bigger value before smaller
find x: int(10..5)
4. Type Mismatch
Description
Attempt to do an operation on illegal types.
Detection
Enforcing types on operations and catching a mismatch between expected types and node kind that is being parsed.
Example: Adding an integer and boolean
letting y be true
find x: int (5..10)
such that 5 + y = 6
5. Unsafe division
(not yet implemented)
Description
Attempt to divide/modulo by zero.
Example: Divide by zero
find x: int(5..10)
such that x/0 = 3
At the moment says no solutions found. Would want to explicitly disallow it.
6. Invalid indexing
Description
Tuples and matrices are indexed from 1. Negative, zero or index out of bounds is invalid.
Example: s tuple index of out bounds
letting s be tuple(0,1,1,0)
letting t be tuple(0,0,0,1)
find a : bool such that a = (s[5] = t[1]) $ true