J. King
34eee5fcc3
Rename test case file
6 years ago
J. King
b32b1ec038
Style fixes
6 years ago
J. King
6fd50f0681
Ensure char and byte positions never go beyond the end of the string
6 years ago
J. King
9fba89ebda
Tested seeking
6 years ago
J. King
a99702d4ab
More robust self-synchronization
6 years ago
J. King
c11da3ac6b
Remove now unnecessary data generator
6 years ago
J. King
b871c4f2fd
Implement seeking backward through a string
6 years ago
J. King
ac5e91f843
Restore deleted portion of functional interface
Also added comparative performance measurement
6 years ago
J. King
1ed3c36a65
Start on alternate object-based interface
This is both simpler and slightly faster, yielding a performance improvement of between 2% and 5%
6 years ago
J. King
69a194ecf8
More useful performance test output
7 years ago
J. King
a441fc6a95
CS fix
7 years ago
J. King
3698aa8d8d
Tweaks and cleanup
7 years ago
J. King
84d103269f
30% improvement in performance for multibyte characters
7 years ago
J. King
e755699dd7
Changed performance test data
7 years ago
J. King
3aaaae0c74
More performance improvements, and a regression fix
7 years ago
J. King
3cb49bbc77
Further performance improvements
7 years ago
J. King
6a97da7435
Reduced number of performance tests
7 years ago
J. King
cd68883d07
Add a performance profiling script
7 years ago
J. King
aa58f619d7
Optimize for ASCII characters in ord()
This yields a 60% performance improvement on a typical HTML document
7 years ago
J. King
434e41cc2c
Initial round of decoding tests, with one fix
7 years ago
J. King
b725fddc6c
Clean up Robofile
7 years ago
J. King
9062f4e6a6
Add infrastructure required for tests
7 years ago
J. King
aa0d6ce20e
Split off UTF-8 tools from URL parser
7 years ago
J. King
30162e8525
Correct deficiencies in UTF-8 handling
Function now operates as defined by the WHATWG encoding standard; the practical implications of this are that:
- More invalid sequences are correctly identified as invalid
- Overlong encodings are normalized
- ord() and chr() functions have been added as a consequence of this work
7 years ago
J. King
7d13a6c3b7
Four more states
7 years ago
J. King
80975d595e
Implement relative state; slight refactor
7 years ago
J. King
fd8c333a68
Split off UTF-8 processing into its own class, greatly expanded
Also simplified some parts of the algorithm implementation
Part of this simplification involves the use of goto statements
7 years ago
J. King
42dfd0171f
Process UTF-8 characters rather than single bytes
7 years ago
J. King
23fd5872f6
Minor clarifications
7 years ago
J. King
9786d25aa5
Initial commit with a few states; not yet tested
7 years ago