• Fixed a bug where nonexistent grammars would cause the tokenizer to fail.
• Added mensbeam/html as a dependency, removed docopt/docopt and
ext-mbstring.
• Discovered a bug where injections are removed from the stack while
tokenizing; investigating.
• When calculating the offset after handling overlapping tokens, the tokenizer is now aware of invalid capture offsets (meaning the capture matched nothing).
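The invalid-offset case above can be illustrated with a small sketch (in Python; the project itself is PHP, where PREG_OFFSET_CAPTURE behaves the same way): a capture group that participated in no match reports a sentinel offset of -1, which must be skipped rather than treated as a real position.

```python
import re

# A group that matched nothing reports a span of (-1, -1); offset
# arithmetic after handling overlapping tokens must skip such captures.
m = re.match(r"(a)(b)?(c)", "ac")
valid_spans = []
for i in range(1, 4):
    start, end = m.span(i)
    if start == -1:
        continue  # capture matched nothing; its offset is invalid
    valid_spans.append((i, start, end))
print(valid_spans)  # group 2 is absent
```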
• Tokenizer::tokenizeLine now correctly stops looking for new matches once the newly tokenized pattern is an end pattern.
• Grammars no longer have beginCaptures incorrectly applied to end patterns.
• Added pattern match anchor support.
• Data is now an instantiable class that supports only string input.
• Data now has firstLine, lastLine, and lastLineBeforeFinalNewLine properties to facilitate anchoring.
• Highlight now has a static toDOM method for highlighting into a DOM tree, replacing the withFile and withString methods that accepted different kinds of input.
• Tokenizer now outputs a newline token only if the line is not the last line.
• Tokenizer now throws out pattern match regexes if their anchors are invalid for the current line.
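The anchor screening described above can be sketched roughly as follows (Python for illustration; the project is PHP, and anchor_is_valid is a hypothetical helper, not the project's API): a regex anchored to the start of the input can only ever match on the first line, so running it against later lines is wasted work.

```python
# Hypothetical sketch of discarding pattern regexes whose anchors are
# invalid for the current line: \A only matches on the first line of the
# input and \z only on the last, so anchored patterns are skipped elsewhere.
def anchor_is_valid(regex: str, is_first_line: bool, is_last_line: bool) -> bool:
    if r"\A" in regex and not is_first_line:
        return False
    if r"\z" in regex and not is_last_line:
        return False
    return True

print(anchor_is_valid(r"\A#!", is_first_line=True, is_last_line=False))   # True
print(anchor_is_valid(r"\A#!", is_first_line=False, is_last_line=False))  # False
```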
• Tokenizer now won't mistakenly emit empty string tokens.
• Previously, the first pattern whose regex matched the line was processed into tokens. This turned out to be incorrect: the pattern whose match offset is closest to the current offset should win instead. Changes reflect this.
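The closest-offset-wins rule can be sketched like this (Python for illustration; best_match is a hypothetical helper standing in for the project's PHP internals):

```python
import re

# Among all patterns that match at or after the current offset, the one
# whose match starts closest to that offset wins; earlier-listed patterns
# win ties because of the strict < comparison.
def best_match(patterns, line, offset):
    best = None
    for pattern in patterns:
        m = re.compile(pattern).search(line, offset)
        if m is not None and (best is None or m.start() < best[1].start()):
            best = (pattern, m)
    return best

# "world" is listed first, but "hello" matches closer to offset 0 and wins.
print(best_match([r"world", r"hello"], "hello world", 0)[0])
```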
• Added a flag for begin patterns.
• Trying to handle begin/end patterns better: begin patterns shouldn't automatically remove themselves from the stack; their corresponding end pattern should remove them instead.
• Lines are now converted to UTF-32 while tokenizing so that byte
offsets can be cleanly converted to character offsets.
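Why UTF-32 makes this clean can be shown in a few lines (Python for illustration; the project is PHP): every code point occupies exactly four bytes, so any byte offset a byte-oriented regex engine reports divides evenly into a character offset.

```python
# UTF-32 is fixed-width (4 bytes per code point), so a byte offset into
# the encoded line converts to a character offset by dividing by 4.
line = "naïve — tokenizer"
encoded = line.encode("utf-32-le")  # -le avoids a byte-order mark

byte_offset = encoded.find("tokenizer".encode("utf-32-le"))
char_offset = byte_offset // 4
print(char_offset, line[char_offset:])
```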
• When grammars are parsed into Grammar objects, begin and end matches
are now converted to regular matches by adding the end match to the
pattern's pattern list, simplifying tokenization.
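The grammar transformation above might look roughly like this (a Python sketch of the idea only; flatten_begin_end and the isEnd flag are hypothetical names, not the project's API): the end regex becomes an ordinary pattern prepended to the rule's own pattern list, so the tokenizer only ever deals with plain matches.

```python
# Hypothetical sketch of converting a begin/end rule into regular matches:
# the end regex joins the rule's pattern list as an ordinary pattern,
# flagged so the tokenizer knows that matching it pops the stack.
def flatten_begin_end(rule: dict) -> dict:
    if "begin" in rule and "end" in rule:
        end_pattern = {"match": rule.pop("end"), "isEnd": True}
        rule["match"] = rule.pop("begin")
        rule["patterns"] = [end_pattern] + rule.get("patterns", [])
    return rule

string_rule = {"begin": r'"', "end": r'"', "patterns": [{"match": r"\\."}]}
print(flatten_begin_end(string_rule))
```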
• Highlight::withFile and Highlight::withString now accept an encoding
parameter which defaults to UTF-8.