HTML-DOM

Commit Graph

Author	SHA1	Message	Date
J. King	d33929f4a1	Change namespace; add copyright info	4 years ago
J. King	aaf85387be	Remove uses of is_null for consistency	4 years ago
J. King	d319be477e	Improve coverage	4 years ago
J. King	82621a11e3	Sort out namespaced attributes	4 years ago
J. King	329a86b082	Use separate class for null character tokens	4 years ago
J. King	01361efdb8	Various fixes	4 years ago
J. King	6798c128e4	Correct unknown DOCTYPE checking	4 years ago
J. King	1dc3d9c23e	Emit whitespace-only character tokens. This makes tree building simpler in certain circumstances.	4 years ago
J. King	a8ff431370	Corrective pass over existing insertion modes	4 years ago
J. King	e3a271f06b	Fix first failure in tree builder	4 years ago
J. King	d08438052a	Baseline pass over tokenizer - Implemented missing states (except entity and char ref states) - Re-copied and reformatted most text from the specification - Emitted parse errors per spec (except invalid characters) - Properly handled null characters - Passed through invalid characters (these do not yet emit errors) - Added assertions before manipulation of tokens and temporary buffers - Removed problematic optimizations - Removed explicit continue statements - Allowed end tags to have attributes - Simplified duplicate attribute detection - Corrected DOCTYPE properties not being "missing" - Skipped BOM in encoding-neutral way I may have introduced regressions, and the assertions are mostly serving to mask undefined-variable errors rather than helping to fix them, but at least warnings and notices are not being spammed this way. Work still needs to be done in emitting errors for invalid characters (and invalid character sequences), as well as in consuming character references and entities correctly, not to mention general debugging.	5 years ago
Dustin Wilson	85894ed1ea	Fixed an issue with tokenization of attributes	6 years ago
Dustin Wilson	027e5b9f58	Moved tokenizer to its own class • Changed the name of the parser instance variable from Parser::$self to Parser::$instance • Added parse errors for entities into ParseError. • Moved Parser::fixDOM to DOM::fixIdAttributes. • Added an exception for when the tokenizer enters an invalid state (infinite looping). • Made ParseError use Parser::$instance->data instead of a passed around DataStream object.	6 years ago
Dustin Wilson	1fc65f85bd	Started HTML content tree building • Removed html5.php; shouldn't have been there to begin with. • Fixed bug where when feeding ParseError::trigger the wrong number of parameters it wouldn't have the correct exception to throw.	6 years ago
Dustin Wilson	de7cc7cbfa	Fixing foreign content stuff • Changes to the spec since the last edit required a rewrite of the tree building algorithm. • Searching the stack should search in reverse by default because the spec works that way. • Rewrote StartTagToken because the token attributes need to be easily editable; as per the spec, foreign attributes are edited before the token goes through the element creation process and not after. • Yes, there's a goto. Sue me.	6 years ago
Dustin Wilson	6f74630c98	Begin Implementation of Tree Builder • Added parsing instructions for tokens in foreign content	7 years ago
Dustin Wilson	a89f6c9f09	Beginning Rewrite	7 years ago

17 Commits (d53b9237c4649d178b6c9a8b56df31710002ddc7)