Semantic Error Detection
Overview
Semantic errors are not picked up by the tree-sitter parser: the input produces a valid CST but an invalid AST. They are detected when the CST is parsed into a Rust Model object representing the Essence problem. The errors raised by the Essence parser are extracted and reported via the Diagnostics API.
The semantic errors detected include type errors, omitted declarations, invalid indexing, and the use of keywords as identifiers.
Implementation
pub fn detect_semantic_errors(source: &str) -> Vec<Diagnostic>
Calls parse_essence_with_context from parse_model.rs, which parses Essence into a Model using the CST generated by the tree-sitter parser. Any errors it returns (EssenceParseError) are converted into Diagnostics and accumulated in a Vec<Diagnostic>.
fn error_to_diagnostic(err: &crate::errors::EssenceParseError) -> Diagnostic
Converts an EssenceParseError into a Diagnostic by pattern-matching on the error variant. When the error carries a message and an optional source range, the Option<tree_sitter::Range> is converted into a pair of Position structures using range_to_position(range), and a Diagnostic is created with this range, Error severity, "semantic error detection" as the source, and the error message. For all other error variants, a fallback Diagnostic is produced.
fn range_to_position(range: &Option<tree_sitter::Range>)
Maps an Option<tree_sitter::Range> to a (Position, Position) pair of start and end positions. If the range is None, it returns the empty range (0:0 - 0:0).
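The two conversion helpers described above can be sketched roughly as follows. This is an illustrative sketch and not the crate's actual code: Point and Range are local stand-ins that mirror the field names of tree_sitter::Point and tree_sitter::Range so the block compiles without dependencies, while Position, Severity, Diagnostic, and the ParseErrorSketch enum (including its variant names and the shape of the fallback diagnostic) are assumptions made for illustration.

```rust
// Stand-ins mirroring tree_sitter::Point / tree_sitter::Range field names.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Point { pub row: usize, pub column: usize }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Range {
    pub start_byte: usize,
    pub end_byte: usize,
    pub start_point: Point,
    pub end_point: Point,
}

// Hypothetical diagnostic types (assumed shapes, not the real Diagnostics API).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Position { pub line: u32, pub character: u32 }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity { Error }

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Diagnostic {
    pub range: (Position, Position),
    pub severity: Severity,
    pub source: &'static str,
    pub message: String,
}

// Hypothetical stand-in for EssenceParseError; variant names are assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ParseErrorSketch {
    Semantic { msg: String, range: Option<Range> },
    Other(String),
}

/// Maps an optional range to (start, end); None becomes (0:0 - 0:0).
pub fn range_to_position(range: &Option<Range>) -> (Position, Position) {
    match range {
        Some(r) => (
            Position { line: r.start_point.row as u32, character: r.start_point.column as u32 },
            Position { line: r.end_point.row as u32, character: r.end_point.column as u32 },
        ),
        None => (
            Position { line: 0, character: 0 },
            Position { line: 0, character: 0 },
        ),
    }
}

/// Converts an error into a diagnostic by matching on its variant.
pub fn error_to_diagnostic(err: &ParseErrorSketch) -> Diagnostic {
    match err {
        ParseErrorSketch::Semantic { msg, range } => Diagnostic {
            range: range_to_position(range),
            severity: Severity::Error,
            source: "semantic error detection",
            message: format!("Semantic Error: {msg}"),
        },
        // Assumed fallback: no range information, message passed through as-is.
        ParseErrorSketch::Other(text) => Diagnostic {
            range: range_to_position(&None),
            severity: Severity::Error,
            source: "semantic error detection",
            message: text.clone(),
        },
    }
}
```

Keeping the None case inside range_to_position means every variant can go through the same helper, so a missing range degrades to a diagnostic at the start of the file rather than to a dropped error.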
Keywords as Identifiers Check
Tree-sitter does not report an error when an Essence keyword (e.g. find, bool) is used as an identifier, because such keywords are valid character sequences according to the grammar’s identifier token. As a result, the generated CST is valid, and this cannot be detected during syntactic error checking. To address this, an additional check is implemented during the CST-to-Model parsing stage.
const KEYWORDS: [&str; 21] = [
    "forall", "exists", "such", "that", "letting", "find", "minimise", "maximise", "subject", "to",
    "where", "and", "or", "not", "if", "then", "else", "in", "sum", "product", "bool",
];
fn keyword_as_identifier(root: tree_sitter::Node, src: &str) -> Result<(), EssenceParseError>
Called at the end of parse_model_with_context. It validates identifiers by traversing the CST depth-first: starting from the root node, it visits all child nodes and inspects those of kind variable, identifier, or parameter. For each such node, the corresponding source text is extracted and compared against the list of reserved Essence keywords above. If a match is found, the function returns an EssenceParseError carrying an error message and a source range derived from the CST node.
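The traversal described above can be illustrated with a minimal sketch. It does not use the real tree-sitter API: Node here is a hypothetical owned tree with a kind, a byte span, and children, and KeywordError is a hypothetical stand-in for EssenceParseError. The real function operates on tree_sitter::Node and may walk the tree differently (for example recursively or with a cursor). Only the keyword list and the three inspected node kinds are taken from the text.

```rust
const KEYWORDS: [&str; 21] = [
    "forall", "exists", "such", "that", "letting", "find", "minimise", "maximise", "subject", "to",
    "where", "and", "or", "not", "if", "then", "else", "in", "sum", "product", "bool",
];

// Hypothetical stand-in for a CST node (the real code uses tree_sitter::Node).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Node {
    pub kind: &'static str,
    pub start_byte: usize,
    pub end_byte: usize,
    pub children: Vec<Node>,
}

// Hypothetical stand-in for EssenceParseError.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct KeywordError {
    pub keyword: String,
    pub start_byte: usize,
    pub end_byte: usize,
}

/// Iterative depth-first search; returns the first keyword used as an identifier.
pub fn keyword_as_identifier(root: &Node, src: &str) -> Result<(), KeywordError> {
    let mut stack = vec![root];
    while let Some(node) = stack.pop() {
        if matches!(node.kind, "variable" | "identifier" | "parameter") {
            // Extract the source text covered by this node.
            let text = &src[node.start_byte..node.end_byte];
            if KEYWORDS.contains(&text) {
                return Err(KeywordError {
                    keyword: text.to_string(),
                    start_byte: node.start_byte,
                    end_byte: node.end_byte,
                });
            }
        }
        // Push children in reverse so they are popped in source order.
        stack.extend(node.children.iter().rev());
    }
    Ok(())
}
```

An explicit stack avoids deep recursion on large models, and returning on the first match mirrors the Result<(), EssenceParseError> signature, which can carry only one error.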
How To Test
cargo test -p conjure-cp-essence-parser --test semantic_test
cargo test -p conjure-cp-essence-parser --test keyword_as_ident
Examples
Example: Keyword find as an identifier
Input
find find,b,c: int(1..3)
Diagnostic
Range: (0:5 - 0:9)
Severity: Error
Message: Semantic Error: Keyword 'find' used as identifier
Source: "semantic error detection"
Example: Keyword bool as an identifier
Input
find bool: bool
Diagnostic
Range: (0:5 - 0:9)
Severity: Error
Message: Semantic Error: Keyword 'bool' used as identifier
Source: "semantic error detection"
Example: Omitted Declaration
Input
find x: int(1..10)
such that x = y
Diagnostic
Range: (1:14 - 1:15)
Severity: Error
Message: Semantic Error: Undefined variable: 'y'
Source: "semantic error detection"