• Previously, the first pattern whose regex matched the line was processed into tokens. This turned out to be incorrect: the pattern whose match offset is closest to the current offset wins. The changes reflect this.
• Added a flag for begin patterns
• Improved handling of begin/end patterns. Begin patterns should not remove themselves from the stack automatically; their corresponding end pattern should remove them instead.
• Lines are now converted to UTF-32 while tokenizing so that byte
offsets may be cleanly converted to character offsets
• When grammars are parsed into Grammar objects, begin and end matches
are now converted to regular matches: each end match is added to its
pattern's pattern list, which simplifies tokenization.
• Highlight::withFile and Highlight::withString now accept an encoding
parameter which defaults to UTF-8.
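
The "closest offset wins" rule from the first item can be sketched roughly as follows. This is only an illustration, not the project's actual code: `Pattern`, `Winner`, and `closestMatch` are hypothetical names, `std::regex` stands in for whatever regex engine the tokenizer really uses, and breaking ties in favour of the pattern listed first is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical pattern type: a scope name plus the regex that triggers it.
struct Pattern {
    std::string name;
    std::regex regex;
};

// Hypothetical result type: which pattern won and where its match lies.
struct Winner {
    bool found;                // false when no pattern matches
    std::size_t patternIndex;  // index into the pattern list
    std::size_t offset;        // match start, measured from the start of the line
    std::size_t length;        // match length
};

// Search every pattern from `start` and keep the match that begins
// earliest. Ties go to the pattern listed first (an assumption).
Winner closestMatch(const std::vector<Pattern>& patterns,
                    const std::string& line,
                    std::size_t start) {
    assert(start <= line.size());
    Winner best = {false, 0, 0, 0};
    const std::string::const_iterator from =
        line.begin() + static_cast<std::ptrdiff_t>(start);

    for (std::size_t i = 0; i < patterns.size(); ++i) {
        std::smatch m;
        if (!std::regex_search(from, line.end(), m, patterns[i].regex)) {
            continue;  // this pattern does not match the rest of the line
        }
        // position(0) is relative to `from`, so add `start` back.
        const std::size_t offset =
            start + static_cast<std::size_t>(m.position(0));
        // Strict comparison: an earlier pattern keeps the win on a tie.
        if (!best.found || offset < best.offset) {
            best.found = true;
            best.patternIndex = i;
            best.offset = offset;
            best.length = static_cast<std::size_t>(m.length(0));
        }
    }
    return best;
}
```

With a string pattern listed before a number pattern, the line `x = 42 + "hi"` searched from offset 0 selects the number pattern, because its match starts earlier in the line even though it comes second in the list.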