J. King
32f4cca039
Type hints for tree builder properties
4 years ago
J. King
5f1f02b552
Skip tests requiring unimplemented logic
4 years ago
J. King
8e7a0f6284
Clean up foreign content case normalization
4 years ago
J. King
6798c128e4
Correct unknown DOCTYPE checking
4 years ago
J. King
29fd5e2ccb
Fix invalid property accesses
4 years ago
J. King
1dc3d9c23e
Emit whitespace-only character tokens
This makes tree building simpler in certain circumstances
4 years ago
J. King
504731cba0
Bring coverage backend up to date
4 years ago
J. King
9c3764da65
Stub of adoption agency
4 years ago
J. King
a8ff431370
Corrective pass over existing insertion modes
4 years ago
J. King
f8b9cf2c2b
Avoid implicit looping and switching
The while loop has been replaced with gotos where appropriate, and
switching has been replaced with a series of if-blocks in line with the
same logic in the tokenizer.
4 years ago
J. King
e3a271f06b
Fix first failure in tree builder
4 years ago
J. King
1fa2f701cb
Update section of tokenization in spec comments
4 years ago
J. King
4e5fd35775
Fix a few tree tests
4 years ago
J. King
bb4002abcb
Stub the tree builder properly
4 years ago
J. King
eea70eccd8
Test harness for tree construction
4 years ago
J. King
a35e8c8ae5
Update character decoders
4 years ago
J. King
ad0a8ae27a
Replace Content-Type parser with proper version
4 years ago
J. King
596a58eff1
Update tooling
4 years ago
J. King
0056e6cbc6
Support PCOV for coverage
4 years ago
J. King
4e79f378a8
Fix bug uncovered by new tests
4 years ago
J. King
269d0ecc64
Patch tests based on input, not unstable identifier
4 years ago
J. King
37aecf97ba
Remove scripted encoding test workaround
The test has been segregated, making the workaround unnecessary
4 years ago
J. King
f72809d621
Relax dependence on ctype
5 years ago
J. King
28f0bbfe72
Suppress only one scripting test
5 years ago
J. King
1f3c33ad9e
Better coverage of BOM-based detection
5 years ago
J. King
21c9377b3a
Docblock for BOM detection
5 years ago
J. King
06e43504d0
Tweaks
5 years ago
J. King
164e5ff1e8
Add standard charset detection tests
- Various new tests needed for full coverage, noted in comment
- Byte Order Mark detection method added
- Japanese encodings not yet supported, so tests marked incomplete
- Tests requiring scripting suppressed
5 years ago
J. King
a7e1083681
Prototype character encoding detection
5 years ago
J. King
c1162f962f
Add missing test
5 years ago
J. King
2aa6bb2dea
Remove unnecessary test abstraction
5 years ago
J. King
49f31015ac
Start on character encoding detection
5 years ago
J. King
318d7bd7ad
Patch remaining test failures away
5 years ago
J. King
00bf9974c5
Fix up most error reporting positions
5 years ago
J. King
58a1177888
Address errors and omissions in error emission
One test still fails, though it is arguably immaterial. This does not
account for line and column number, which are known to be mostly
off by one.
5 years ago
J. King
ec199f4f11
Report input stream errors
5 years ago
J. King
9560358021
Character consumption cleanup
- Newline normalization now done on-the-fly
- Consequently, original input string is used as-is
- Byte order mark is not supposed to be skipped
- Use more straightforward method of tracking column position
- Simplify backtracking when spanning
- Genericize character interpretation: this will be expanded to emit
illegal-character parse errors when appropriate
5 years ago
J. King
1ed679c50d
Pass through surrogate characters
This fixes the last four failing tests
5 years ago
J. King
5a12fa8ad7
Tidying
5 years ago
J. King
ff4447e986
Include pending spec changes tests
5 years ago
J. King
e8b3c76046
Fix most failures
Also removed assertions
5 years ago
J. King
59456b078f
Fix consuming of overlong entity
5 years ago
J. King
e8f35e92fb
Character reference fixes
One test in the "entities.test" file is still failing
5 years ago
J. King
b9b892e6a6
Remove obsolete character reference consumer
5 years ago
J. King
19fb541806
New from-scratch character reference consumer
5 years ago
J. King
67c7f382e2
Prep for character references
- Add missing state constants
- Break all existing deviations for character refs
- Add assertions before use of $attribute
- Also fix DOCTYPE state
5 years ago
J. King
d4a7280405
Renumber states to match specification sections
5 years ago
J. King
4759f94771
Trim whitespace
5 years ago
J. King
cf41984e88
Fix comment end state
5 years ago
J. King
43f380c1f9
Fix EOF and end tags
- End tags now emit errors if they have attributes
- End tags now emit errors if they are self-closing
- The last character before EOF is now correctly reconsumed
Also changed the tokenizer debug log to be zero-cost
5 years ago