- All Implemented Interfaces:
Lexer
public class SimpleRegexLexer
extends java.lang.Object
implements Lexer
This is a "dynamic" Lexer that uses regex patterns to parse any document.
It is NOT as fast as the JFlex-generated lexers.
The current implementation is about 20x slower than a JFlex lexer
(5000 lines in about 100 ms, versus 5 ms for a JFlex lexer).
It is still usable for a few hundred lines: 500 lines parse in about 10 ms.
Performance also depends on how complex the regular expressions are and how
many of them actually produce a match.
Since the KEYWORD TokenType is ordered before IDENTIFIER, the KEYWORD token
takes precedence even if the same text also matches as an IDENTIFIER.
This is a neat side-effect of the ordering of the TokenTypes.
We now just need to add any non-overlapping matches. And since longer matches
are found first, we will properly match the longer identifiers which start with
a keyword.
This behaviour can easily be modified by overriding the compareTo method.
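
The ordering described above can be illustrated with a small standalone sketch. It is not the actual SimpleRegexLexer code: the names RegexLexerSketch, Kind, Tok and lex are hypothetical, and the comparison order (start offset, then longer match first, then token type order) is an assumption drawn from the description above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexLexerSketch {
    // Hypothetical token types; declaration order gives KEYWORD precedence over IDENTIFIER.
    enum Kind { KEYWORD, IDENTIFIER }

    // Hypothetical token: sorted by start offset, then longer match first, then Kind order.
    static final class Tok implements Comparable<Tok> {
        final Kind kind;
        final int start;
        final int length;

        Tok(Kind kind, int start, int length) {
            this.kind = kind;
            this.start = start;
            this.length = length;
        }

        int end() {
            return start + length;
        }

        @Override
        public int compareTo(Tok o) {
            if (start != o.start) {
                return Integer.compare(start, o.start);
            }
            if (length != o.length) {
                return Integer.compare(o.length, length); // longer match sorts first
            }
            return kind.compareTo(o.kind); // enum declaration order breaks the tie
        }

        @Override
        public String toString() {
            return kind + "@" + start + "+" + length;
        }
    }

    // Collect every match of every pattern, then keep only non-overlapping tokens.
    static List<Tok> lex(String text, Map<Kind, Pattern> patterns) {
        TreeSet<Tok> all = new TreeSet<>();
        for (Map.Entry<Kind, Pattern> e : patterns.entrySet()) {
            Matcher m = e.getValue().matcher(text);
            while (m.find()) {
                all.add(new Tok(e.getKey(), m.start(), m.end() - m.start()));
            }
        }
        List<Tok> result = new ArrayList<>();
        int lastEnd = 0;
        for (Tok t : all) {
            if (t.start >= lastEnd) { // skip anything overlapping an accepted token
                result.add(t);
                lastEnd = t.end();
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Kind, Pattern> patterns = new LinkedHashMap<>();
        patterns.put(Kind.KEYWORD, Pattern.compile("if|else"));
        patterns.put(Kind.IDENTIFIER, Pattern.compile("[A-Za-z_]\\w*"));
        // prints [KEYWORD@0+2, IDENTIFIER@3+4, IDENTIFIER@8+1]
        System.out.println(lex("if iffy x", patterns));
    }
}
```

In this sketch "if" at offset 0 matches both patterns with equal length, so the type order picks KEYWORD. At offset 3 the IDENTIFIER match "iffy" is longer than the KEYWORD match "if", so it sorts first and the shorter overlapping match is dropped. Changing compareTo in Tok changes which of the competing matches wins.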