HTML DOM serialization tests
============================

The format of these tests is essentially the format of html5lib's tree construction tests in reverse. There are, however, important differences, so the format is documented in full here.

Each file containing serialization tests consists of any number of tests separated by two newlines (LF), with a single newline before the end of the file. For instance:

    [TEST]LF
    LF
    [TEST]LF
    LF
    [TEST]LF

where [TEST] has the following format:

Each test begins with a line reading `#document` or `#fragment`; subsequent lines represent the document or document fragment (respectively) used as input, until a line is encountered which reads `#output`, `#script-on`, or `#script-off`.

Each DOM node in the input is written on its own line beginning with the characters "| " (a vertical bar followed by a single space); lines which begin with other characters are a continuation of the previous line. Attributes are treated as distinct nodes and have their own entries. There is no escape mechanism: all input is literal, including newlines and quotation marks. Two spaces are used to denote each level of nesting. For example:

    | node
    |   child node
    continuation of child node
    |     grandchild node
    |   child node
    |     attribute node of child
    |     grandchild node

The different types of nodes are:

- Element nodes in the form `` for an element in the HTML namespace, or `` for an element in a foreign namespace. Qualified names are written as usual e.g. ``, though such elements are not produced by the parser
- Attribute nodes in the form `id="value"` or e.g.
  `xml xml:id="value"`, with a quotation mark immediately followed by a newline marking the end of the attribute value (in other words, attribute values may contain literal quotation marks)
- Text nodes in the form `"text data"`; like attributes, only a quotation mark followed by a newline marks the end of text data
- Comment nodes of the form ``; the space characters are padding and are not part of the comment data
- Document type nodes in the form ``, or `` or simply `` depending on its contents
- Processing instructions in the form ``. Processing instructions are not generated by the HTML parser, but may appear in documents by other means

Namespaces are represented by the following short names:

| Name  | URL                                  |
|-------|--------------------------------------|
| xml   | http://www.w3.org/XML/1998/namespace |
| xmlns | http://www.w3.org/2000/xmlns/        |
| xlink | http://www.w3.org/1999/xlink         |
| math  | http://www.w3.org/1998/Math/MathML   |
| svg   | http://www.w3.org/2000/svg           |

Other namespaces may also appear; these should be interpreted as literal URLs.

After the input block either `#script-on` or `#script-off` may appear. These signal that the test should be run with scripting on or off, respectively. If neither line is present, the test should be run in both modes.

Finally, `#output` marks the beginning of output. All subsequent text is literal characters until two consecutive newlines followed by either `#document` or `#fragment` are seen.

Below is a complete example:

#document | | | | lang="en" | | | style="font-family: "Times New Roman"" | | xml xml:id="image" |
| "This is a text node. It has an embedded newline. It is in fact pretty "busy" and has multiple newlines. And even a blank line." |