Commit graph

33 commits

Author SHA1 Message Date
e3a271f06b Fix first failure in tree builder 2021-02-13 10:29:07 -05:00
bb4002abcb Stub the tree builder properly 2021-02-12 22:46:10 -05:00
eea70eccd8 Test harness for tree construction 2021-02-12 21:05:48 -05:00
a35e8c8ae5 Update character decoders 2021-02-12 09:51:30 -05:00
269d0ecc64 Patch tests based on input not unstable identifier 2020-09-17 09:10:32 -04:00
37aecf97ba Remove scripted encoding test workaround
The test has been segregated, making the workaround unnecessary.
2020-09-16 18:23:39 -04:00
28f0bbfe72 Suppress only one scripting test 2019-12-23 10:17:07 -05:00
1f3c33ad9e Better coverage of BOM-based detection 2019-12-23 09:13:08 -05:00
06e43504d0 Tweaks 2019-12-22 23:38:15 -05:00
164e5ff1e8 Add standard charset detection tests
- Various new tests needed for full coverage, noted in comment
- Byte Order Mark detection method added
- Japanese encodings not yet supported, so tests marked incomplete
- Tests requiring scripting suppressed
2019-12-22 22:51:18 -05:00
a7e1083681 Prototype character encoding detection 2019-12-22 13:36:59 -05:00
c1162f962f Add missing test 2019-12-21 19:28:48 -05:00
2aa6bb2dea Remove unnecessary test abstraction 2019-12-21 15:05:42 -05:00
49f31015ac Start on character encoding detection 2019-12-21 14:53:51 -05:00
318d7bd7ad Patch remaining test failures away 2019-12-20 11:48:14 -05:00
00bf9974c5 Fix up most error reporting positions 2019-12-19 22:28:11 -05:00
58a1177888 Address errors and omissions in error emission
One test still fails, though it is arguably immaterial. This does not
account for line and column number, which are known to be mostly
off by one.
2019-12-19 15:13:20 -05:00
5a12fa8ad7 Tidying 2019-12-17 17:08:19 -05:00
ff4447e986 Include pending spec changes tests 2019-12-17 13:58:54 -05:00
e8f35e92fb Character reference fixes
One test in the "entities.test" file is still failing
2019-12-16 23:41:44 -05:00
67c7f382e2 Prep for character references
- Add missing state constants
- Break all existing deviations for character refs
- Add assertions before use of $attribute
- Also fix DOCTYPE state
2019-12-15 22:20:20 -05:00
43f380c1f9 Fix EOF and end tags
- End tags now emit errors if they have attributes
- End tags now emit errors if they are self-closing
- The last character before EOF is now correctly reconsumed

Also changed the tokenizer debug log to be zero-cost
2019-12-15 19:45:59 -05:00
4e4aee2edd Update intl dependency 2019-12-13 12:13:44 -05:00
6b42f08fbc Change some if-then-exception blocks to assertions
This has only been done in some parts of the code that are internal
to the parser at large.
2019-12-12 17:35:24 -05:00
af57117c23 Silence parse errors for now 2019-12-12 15:43:16 -05:00
bb2a7b5a95 Rewrite how parse errors are handled
Everything which can emit a parse error should have the error handler
and data stream as properties and use the ParseErrorEmitter trait to
avoid complicating the task of actually producing an error.

Normally the Parser would be expected to set the error handler before it
begins (this commit does not do this) and unset it after it's done.
Alternatively, the entire means of reporting errors can now be easily
replaced.
2019-12-12 15:23:15 -05:00
d93fe25e58 Combine character tokens in test harness 2019-12-12 10:48:11 -05:00
1beb934789 Add more tests 2019-12-11 20:49:24 -05:00
f360206a34 Basic endless loop helper 2019-12-10 23:20:50 -05:00
1386eb103c Fix test transformer 2019-12-10 21:35:27 -05:00
1971892635 Basic skeleton of test suite 2019-12-10 18:00:08 -05:00
9df201f663 Remove erroneously added sub-repository 2018-08-02 11:31:59 -04:00
0d7a0a3367 Change tokenizer constant references to self::
Changing static:: to self:: makes constant de-referencing a
compile-time operation rather than run-time, potentially improving
performance. As constants cannot be overridden by extending classes,
there is no advantage to using static:: for these constants.
2018-08-02 11:30:12 -04:00