J. King
0cb64ed689
Add more insertion modes
A bug remains which will require major reorganization of code to fix
3 years ago
J. King
68317d838e
Fill in more insertion modes
3 years ago
J. King
fdb63ebedf
Fix tokenizer bug
3 years ago
J. King
dc9dc9953a
Partial adoption agency implementation
3 years ago
J. King
752ab05464
Implement rest of in-body insertion mode
3 years ago
J. King
0cfaced6b3
Fix CDATA tokenizer bugs
3 years ago
J. King
a8d2ee4174
Fill out more of the "in body" insertion mode
This only passes a few more tests because handling of end tags
is still mostly missing
3 years ago
J. King
baaa00e544
Implement "a" in body
Adoption agency will be handled later
3 years ago
J. King
9f19763cd0
Rewrite active formatting reconstruction
3 years ago
J. King
400940af36
Fix pushing to the list of active formatting elements
3 years ago
J. King
9758c08da2
Various minor corrections
3 years ago
J. King
1228ecca17
Corrective pass over foreign content stuff
3 years ago
J. King
065f9c97d6
Handle non-foreign fragment cases
3 years ago
J. King
aeb08b5f5d
Fix remaining failures
Fragment-case tests still need to be harnessed to test all functionality
3 years ago
J. King
979cec628e
Overhaul open elements stack
3 years ago
J. King
ab972a838c
Fix DOCTYPE serialization
Also patch all top-level comments out
3 years ago
J. King
d35e4f909e
Correct error in test harness related to comments
3 years ago
J. King
32f4cca039
Type hints for tree builder properties
3 years ago
J. King
5f1f02b552
Skip tests requiring unimplemented logic
3 years ago
J. King
8e7a0f6284
Clean up foreign content case normalization
3 years ago
J. King
6798c128e4
Correct unknown DOCTYPE checking
3 years ago
J. King
29fd5e2ccb
Fix invalid property accesses
3 years ago
J. King
1dc3d9c23e
Emit whitespace-only character tokens
This makes tree building simpler in certain circumstances
3 years ago
J. King
9c3764da65
Stub of adoption agency
3 years ago
J. King
a8ff431370
Corrective pass over existing insertion modes
3 years ago
J. King
f8b9cf2c2b
Avoid implicit looping and switching
The while loop has been replaced with gotos where appropriate, and
switching has been replaced with a series of if-blocks in line with the
same logic in the tokenizer.
3 years ago
J. King
e3a271f06b
Fix first failure in tree builder
3 years ago
J. King
1fa2f701cb
Update section of tokenization in spec comments
3 years ago
J. King
4e5fd35775
Fix a few tree tests
3 years ago
J. King
ad0a8ae27a
Replace Content-Type parser with proper version
4 years ago
J. King
4e79f378a8
Fix bug uncovered by new tests
4 years ago
J. King
f72809d621
Relax dependence on ctype
4 years ago
J. King
1f3c33ad9e
Better coverage of BOM-based detection
5 years ago
J. King
21c9377b3a
Docblock for BOM detection
5 years ago
J. King
06e43504d0
Tweaks
5 years ago
J. King
164e5ff1e8
Add standard charset detection tests
- Various new tests needed for full coverage, noted in comment
- Byte Order Mark detection method added
- Japanese encodings not yet supported, so tests marked incomplete
- Tests requiring scripting suppressed
5 years ago
J. King
a7e1083681
Prototype character encoding detection
5 years ago
J. King
49f31015ac
Start on character encoding detection
5 years ago
J. King
00bf9974c5
Fix up most error reporting positions
5 years ago
J. King
58a1177888
Address errors and omissions in error emission
One test still fails, though it is arguably immaterial. This does not
account for line and column number, which are known to be mostly
off by one.
5 years ago
J. King
ec199f4f11
Report input stream errors
5 years ago
J. King
9560358021
Character consumption cleanup
- Newline normalization now done on-the-fly
- Consequently, original input string is used as-is
- Byte order mark is not supposed to be skipped
- Use more straightforward method of tracking column position
- Simplify backtracking when spanning
- Genericize character interpretation: this will be expanded to emit
illegal-character parse errors when appropriate
5 years ago
J. King
1ed679c50d
Pass through surrogate characters
This fixes the last four failing tests
5 years ago
J. King
5a12fa8ad7
Tidying
5 years ago
J. King
e8b3c76046
Fix most failures
Also removed assertions
5 years ago
J. King
59456b078f
Fix consuming of overlong entity
5 years ago
J. King
e8f35e92fb
Character reference fixes
One test in the "entities.test" file is still failing
5 years ago
J. King
b9b892e6a6
Remove obsolete character reference consumer
5 years ago
J. King
19fb541806
New from-scratch character reference consumer
5 years ago
J. King
67c7f382e2
Prep for character references
- Add missing state constants
- Break all existing deviations for character refs
- Add assertions before use of $attribute
- Also fix DOCTYPE state
5 years ago