<%@ page language="java" %><%@ taglib uri="/WEB-INF/struts-i18n.tld" prefix="i18n" %>
Jericho HTML Parser Test Document

Test HTML Document

This document contains many elements with optional end tags, server-side tags and some common illegal HTML constructs to demonstrate how they are interpreted by the parser.

Table Example
First Column | Second Column

Cell 1

This is a table within the table
Second row of inner table
Third row of inner table

Note that the parser does not consider this text to be a part of the paragraph started before the table because according to the HTML specification a TABLE, being a block-level element, must terminate the P element. See the documentation in HTMLElementName.P for more information, including instructions on how to make this parser compatible with the default behaviour of all major browsers in HTML transitional mode.
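The rule described above can be illustrated with a minimal sketch that uses only the Java standard library. It is not the parser's implementation: the class name `ImplicitParagraphEnd`, the method `paragraphs`, and the abbreviated `TERMINATES_P` set are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ImplicitParagraphEnd {
    // Hypothetical, abbreviated set of start tags that terminate an open P element.
    private static final Set<String> TERMINATES_P =
            Set.of("p", "table", "ul", "ol", "div", "pre", "h1", "h2", "h3");

    // Simplified tag pattern: optional '/', a name, then anything up to the next '>'.
    private static final Pattern TAG =
            Pattern.compile("<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>");

    /**
     * Returns the raw content of each P element. A P ends at an explicit
     * end tag, at the next terminating start tag, or at the end of the input.
     */
    static List<String> paragraphs(String html) {
        List<String> result = new ArrayList<>();
        Matcher m = TAG.matcher(html);
        int openAt = -1; // index just past the open <p> start tag, or -1 if none is open
        while (m.find()) {
            boolean isEndTag = !m.group(1).isEmpty();
            String name = m.group(2).toLowerCase();
            boolean closesP = isEndTag ? name.equals("p") : TERMINATES_P.contains(name);
            if (openAt >= 0 && closesP) {
                result.add(html.substring(openAt, m.start()).trim());
                openAt = -1;
            }
            if (!isEndTag && name.equals("p")) {
                openAt = m.end();
            }
        }
        if (openAt >= 0) {
            result.add(html.substring(openAt).trim());
        }
        return result;
    }
}
```

For the input `<p>Cell 1<table><tr><td>inner</table>Note text`, the sketch reports a single paragraph containing `Cell 1`; the text after the table is not attributed to it.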


The following text demonstrates the use of a CDATA section, which has limited browser compatibility:

<![CDATA[ example of markup that is not to write with < and such. ]]>
This is preformatted text
whose formatting should not be altered.

This paragraph contains a comment. This text is a continuation of the paragraph before the one that is commented out.

This text contains incorrectly nested formatting tags, which are quite commonly generated by HTML editors.

This section demonstrates the consequences of illegally nesting block-level elements inside inline-level elements, which is a very common situation caused by the misuse of FONT elements by HTML editors.

This paragraph starts inside the Arial FONT element. This text occurs after the Arial FONT end tag, but is still considered to be part of the same paragraph.

  • This entire list is surrounded
  • by an Arial font element.

Limitations when dealing with tags located inside other tags

This section demonstrates the limitation of the library in distinguishing whether a tag is located inside another tag without a full sequential parse. When a full sequential parse hasn't been performed, the H2 element in the following button's onclick attribute erroneously terminates the current paragraph, and is also returned by tag search methods. See parsing rule 2(i) in the documentation of the Tag class for an explanation.
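The difference between a context-free tag search and a sequential scan can be sketched in plain Java as follows. This is an illustration of the idea only, not the library's algorithm; `TagInsideTag`, `naiveCount`, and `sequentialCount` are hypothetical names, and the matching is deliberately simplistic (case-sensitive prefix match on `<name`).

```java
final class TagInsideTag {
    /** Naive search: counts every occurrence of "<name", regardless of context. */
    static int naiveCount(String html, String name) {
        String needle = "<" + name;
        int count = 0;
        for (int i = html.indexOf(needle); i >= 0; i = html.indexOf(needle, i + 1)) {
            count++;
        }
        return count;
    }

    /**
     * Sequential scan: walks the text from the start and, on reaching a tag,
     * skips to its closing '>' while ignoring anything inside quoted
     * attribute values. Tags embedded in attribute values are never visited.
     */
    static int sequentialCount(String html, String name) {
        String needle = "<" + name;
        int count = 0;
        int i = 0;
        while (i < html.length()) {
            if (html.charAt(i) != '<') {
                i++;
                continue;
            }
            if (html.startsWith(needle, i)) {
                count++;
            }
            char quote = 0; // 0 means "not inside a quoted attribute value"
            i++;
            while (i < html.length()) {
                char c = html.charAt(i++);
                if (quote != 0) {
                    if (c == quote) quote = 0;
                } else if (c == '"' || c == '\'') {
                    quote = c;
                } else if (c == '>') {
                    break;
                }
            }
        }
        return count;
    }
}
```

With a button whose `onclick` attribute contains `<h2>` markup, `naiveCount` reports one `h2` tag while `sequentialCount` reports none, which mirrors the behaviour described above.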

This anchor element demonstrates that a tag ending in /> is not considered an empty element tag if it has a name that requires an end tag. In this case the final '/' is included in the href attribute value instead of being interpreted as the end of the tag.
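A minimal sketch of this rule, assuming a hypothetical `END_TAG_EXPECTED` set and an unquoted attribute value, might look like the following. The class and method names are invented for illustration, and the URL is a placeholder; the real parser's logic is more involved.

```java
import java.util.Set;

final class SlashBeforeEnd {
    // Hypothetical subset of names whose elements have a required or optional end tag.
    private static final Set<String> END_TAG_EXPECTED = Set.of("a", "p", "div", "span");

    /** Extracts the lower-cased tag name from start tag source such as "<a href=x>". */
    static String name(String startTag) {
        int i = 1;
        while (i < startTag.length() && Character.isLetterOrDigit(startTag.charAt(i))) {
            i++;
        }
        return startTag.substring(1, i).toLowerCase();
    }

    /** A trailing "/>" only marks an empty-element tag if the name does not expect an end tag. */
    static boolean isEmptyElementTag(String startTag) {
        return startTag.endsWith("/>") && !END_TAG_EXPECTED.contains(name(startTag));
    }

    /**
     * Returns the value of an unquoted attribute, or null if it is absent.
     * The value runs to the next whitespace or to the tag's closing
     * delimiter, which is "/>" for an empty-element tag and ">" otherwise.
     * Assumes startTag ends with '>'.
     */
    static String unquotedValue(String startTag, String attr) {
        int start = startTag.indexOf(attr + "=");
        if (start < 0) {
            return null;
        }
        start += attr.length() + 1;
        int limit = startTag.length() - (isEmptyElementTag(startTag) ? 2 : 1);
        int end = start;
        while (end < limit && !Character.isWhitespace(startTag.charAt(end))) {
            end++;
        }
        return startTag.substring(start, end);
    }
}
```

In this sketch, `unquotedValue("<a href=http://example.com/>", "href")` keeps the trailing slash as part of the value, whereas for `<img src=pic.png/>` the slash is treated as part of the tag's closing delimiter.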

The same goes for tags that have an optional end tag like this paragraph, which has a grey background despite the fact that the p element is syntactically an empty element tag.

Microsoft Conditional Comments

This paragraph is inside a non-validating downlevel-revealed conditional comment which only appears in browsers other than IE because they ignore the invalid tags surrounding it.

This is an example of a validating downlevel-revealed conditional comment, which hides the invalid conditional tags inside HTML comments. This form must be used if the condition can be true in some IE browsers.

This is an example of a slightly simplified validating downlevel-revealed conditional comment that can be used only for the condition !IE (to display in any browser except IE).

This demonstrates the use of nested downlevel-revealed conditional comments.

Microsoft pseudo-HTML generated by Word

This section was generated by MS-Word and contains messy and invalid HTML.

 

Server Tag Examples:

This paragraph is ignored during a full sequential parse

'; ?>
<%= $variable %>
<%=var%>
<%abc=def%>
<% for (int i=0; i<10; i++) { document.write("This is indented server code"); } %>
<%@ include file="relativeFragment.jsp" %>

This paragraph has a dynamic id attribute

These checkboxes have dynamic code determining whether they are checked: checked="checked"<% } %>/> />

The following is Mason server code sampled from the Mason book, chapter 2, section 3.4.9:

<& menu &>
<&| /i18n/itext, lang => $lang &>
%# The bits in here will be available from $m->content in the /i18/text
Hello, <% $name %>. These words are in English.
Bonjour, <% $name %>, ces mots sont français.
Ellohay <% substr($name,2) . substr($name,0,1) . 'ay' %>, esethay ordsway areyay inyay Igpay Atinlay.
<%def .make_a_link>
<% $text %>
<%args>
$path
%query => ( )
$text
<%init>
my $url = ... ...
<*abc def="ghi"> This is an example of an element from a hypothetical server language whose tag formats have not been registered with the TagTypeRegister class.