By default, the configuration fields, which you can edit as required, contain comma-separated lists
of regular expressions representing HTML tags or attributes. Unless otherwise specified,
closing tags are assumed to be the same as the opening tag prefixed by a forward slash, as in
<tag> … </tag>, but tags that differ from this
pattern can be entered as opening and closing tags separated by a tilde ~;
for example, the comment tag would be entered as !--~--. Note that for tags
the enclosing angle brackets < and > will be added
automatically and should not be entered anywhere here.
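The entry syntax described above can be illustrated with a minimal Python sketch. This is not the tool's own code: the function name parse_tag_field is hypothetical, and the naive split on commas is an assumption that would break for a regular expression that itself contains a comma.

```python
def parse_tag_field(field: str) -> list[tuple[str, str]]:
    """Split a configuration field into (opening, closing) pattern pairs.

    Hypothetical helper: entries are separated by commas, and an entry
    containing a tilde supplies its own closing pattern.
    """
    pairs = []
    for entry in field.split(","):
        entry = entry.strip()
        if not entry:
            continue  # ignore empty entries such as a trailing comma
        if "~" in entry:
            # Explicit closing pattern, e.g. the comment tag !--~--
            opening, closing = entry.split("~", 1)
        else:
            # Default rule: closing tag is the opening tag prefixed by a slash
            opening, closing = entry, "/" + entry
        pairs.append((opening, closing))
    return pairs


pairs = parse_tag_field("font, span, !--~--")
```

Here the comment entry yields the pair !-- and --, to which the angle brackets would then be added to give <!-- and -->.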
The configuration fields that you can set are as follows …
First, there is a field controlling HTML tags to be excised entirely, that is to say the entire
HTML including and between the opening tag and the closing tag will be removed. By default,
this field contains regular expressions to remove non-compliant tags often inserted by Microsoft
products:
O\:[^>]+ and !-*\[IF[^\]]*\]~!\[ENDIF\]-*.
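To see how the two default patterns might behave, the following Python sketch wraps an opening and a closing pattern in angle brackets and deletes everything from the opening tag to the closing tag. The function name excise, the case-insensitive matching, and the non-greedy match across line breaks are all assumptions made for illustration; the tool itself may work differently.

```python
import re


def excise(html: str, opening: str, closing: str) -> str:
    """Remove everything from a matching opening tag to its closing tag.

    Hypothetical sketch: angle brackets are added around both patterns,
    matching ignores case, and '.' is allowed to span line breaks.
    """
    pattern = re.compile(
        f"<(?:{opening})>.*?<(?:{closing})>",
        re.IGNORECASE | re.DOTALL,
    )
    return pattern.sub("", html)


# Office namespace tag, closing pattern derived by prefixing a slash
office = excise("<p>a<o:p>x</o:p>b</p>", r"O\:[^>]+", r"/O\:[^>]+")

# Conditional comment, closing pattern given explicitly after the tilde
conditional = excise(
    "a<!--[if gte mso 9]><xml>z</xml><![endif]-->b",
    r"!-*\[IF[^\]]*\]",
    r"!\[ENDIF\]-*",
)
```

Under these assumptions the first call leaves <p>ab</p> and the second leaves ab. A self-closing tag with no separate closing tag is not handled by this sketch.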
Next, there is a field controlling HTML tags to be removed, that is to say both opening and
closing tags will be removed, but intervening HTML will be left for further processing.
By default, this contains the following tags:
font and span.
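The difference from excising is that only the tags themselves disappear. A rough Python sketch, assuming a hypothetical helper named strip_tags_keep_content and case-insensitive matching, could look like this:

```python
import re


def strip_tags_keep_content(html: str, tags: list[str]) -> str:
    """Delete the opening and closing tags named in `tags`, keeping inner HTML.

    Hypothetical sketch: each entry in `tags` is treated as a regular
    expression for the tag name; attributes on the opening tag go with it.
    """
    names = "|".join(tags)
    pattern = re.compile(rf"</?(?:{names})(?:\s[^>]*)?>", re.IGNORECASE)
    return pattern.sub("", html)


sample = '<p><span style="x">hi <font size="2">there</font></span></p>'
result = strip_tags_keep_content(sample, ["font", "span"])
```

With the default list, the sample reduces to <p>hi there</p>: the text inside the font and span tags survives for further processing.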
Next, there is a field controlling HTML tags to be cleaned entirely by removing all attributes.
By default, this contains the following tags: html,
body, table, and tr.
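As an illustration only, attribute removal for the listed tags might be sketched in Python as follows. The name strip_attributes is hypothetical, and the sketch assumes that no attribute value contains a > character.

```python
import re


def strip_attributes(html: str, tags: list[str]) -> str:
    """Rewrite the opening tags named in `tags` without any attributes.

    Hypothetical sketch: other tags, and all closing tags, are untouched.
    """
    names = "|".join(tags)
    pattern = re.compile(rf"<({names})(?:\s[^>]*)?>", re.IGNORECASE)
    return pattern.sub(lambda m: f"<{m.group(1)}>", html)


sample = '<table border="1"><tr height=20><td width="5">x</td></tr></table>'
result = strip_attributes(sample, ["html", "body", "table", "tr"])
```

In this example table and tr lose their attributes, while td, which is not in the list, keeps its width.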
Next, there is a field controlling HTML tags to be left unchanged, that is to say the single
tag or opening and closing tags will be normalised as to case and quoting but otherwise left
unchanged. By default, this contains the following tags:
!DOCTYPE, iframe, link,
meta, script, style, and comment.
Next, there is a field controlling tag pairs that may meaningfully be empty, and therefore will
not be removed if empty. By default, this contains the following tags:
applet, iframe, menuitem,
object, output, script, textarea,
td, th, and tr.
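The effect of this list can be sketched in Python as below. The sketch assumes that a pair containing only whitespace counts as empty and that removal repeats until nothing changes, so that a pair emptied by an earlier removal is removed too; neither detail is confirmed by the description above, and the names used are hypothetical.

```python
import re

# Default list of tags that may meaningfully be empty
KEEP_EMPTY = {
    "applet", "iframe", "menuitem", "object", "output",
    "script", "textarea", "td", "th", "tr",
}

# An opening tag immediately followed (ignoring whitespace) by its closing tag
EMPTY_PAIR = re.compile(r"<([a-z][a-z0-9]*)(?:\s[^>]*)?>\s*</\1>", re.IGNORECASE)


def remove_empty_pairs(html: str, keep: set[str] = KEEP_EMPTY) -> str:
    """Remove empty tag pairs unless the tag name is in `keep`."""

    def replace(match: re.Match) -> str:
        return match.group(0) if match.group(1).lower() in keep else ""

    previous = None
    while previous != html:  # repeat until a pass changes nothing
        previous = html
        html = EMPTY_PAIR.sub(replace, html)
    return html


result = remove_empty_pairs("<p></p><td></td><div>x</div>")
```

Here the empty p pair is dropped, while the empty td pair is kept because an empty table cell still affects layout.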
Next, there is a field controlling tag attributes to be left unchanged. By default, this
contains the following attributes:
accesskey, class, dir,
id, lang, style, tabindex,
title, content, href, http-equiv,
name, on.*, src, target,
type, value, colspan, rowspan,
cols, rows, height, and width.
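Because the entries are regular expressions, a single entry such as on.* can cover a whole family of attributes (onclick, onload and so on). The Python sketch below filters a dictionary of attributes against the default list. It assumes each pattern must match the whole attribute name, ignoring case; the function name keep_attributes is hypothetical.

```python
import re

# Default list of attribute patterns to be left unchanged
DEFAULT_KEEP = [
    "accesskey", "class", "dir", "id", "lang", "style", "tabindex",
    "title", "content", "href", "http-equiv", "name", "on.*", "src",
    "target", "type", "value", "colspan", "rowspan", "cols", "rows",
    "height", "width",
]


def keep_attributes(attrs: dict[str, str], patterns: list[str]) -> dict[str, str]:
    """Return only the attributes whose name fully matches one of the patterns."""
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    return {
        name: value
        for name, value in attrs.items()
        if any(rx.fullmatch(name) for rx in compiled)
    }


kept = keep_attributes(
    {"class": "a", "onclick": "f()", "bgcolor": "red", "content-x": "1"},
    DEFAULT_KEEP,
)
```

Whole-name matching is a deliberate choice in this sketch: with a partial match, the entry id would also match width, and content would also match content-x.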
Next, there is a field controlling the indentation style of the HTML output, which sets
the number of spaces or tabs to insert for each indent level. A positive
integer inserts the given number of spaces, 0 gives no indentation, and a negative
integer inserts a single tab, which for the purposes of line-length calculations is taken to
represent the equivalent positive number of spaces. By default, this contains
-4, a tab representing 4 spaces.
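The three cases can be summarised in a short Python sketch. The function names indent_unit and visual_length are hypothetical, and the way line length is computed here (indent depth times nominal width, plus the text length) is an assumption based on the description above.

```python
def indent_unit(setting: int) -> tuple[str, int]:
    """Return (string inserted per indent level, width counted per level)."""
    if setting > 0:
        return " " * setting, setting  # that many spaces
    if setting < 0:
        return "\t", -setting  # one tab, counted as |setting| spaces
    return "", 0  # no indentation


def visual_length(text: str, depth: int, setting: int) -> int:
    """Length of an indented line for line-wrap purposes (assumed formula)."""
    _, width = indent_unit(setting)
    return depth * width + len(text)


unit, width = indent_unit(-4)
length = visual_length("<p>", 3, -4)
```

With the default of -4, each indent level inserts one tab but counts as 4 columns, so a 3-character tag at depth 3 is treated as 15 columns long.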
Next, there is a field controlling the maximum length of a line before forcing line wrap in the
HTML output. By default, this is set to 80
characters. As the name suggests, the inner HTML between the opening and closing tags in
the leave-unchanged list is not wrapped.
Next, there is a field controlling whether tags may be linewrapped internally. By default,
this is set to false.
Next, there is a field controlling opening and closing tag pairs that will be linewrapped after
the closing tag. By default, this contains the following tags:
caption, heading h[1-6], label,
legend, option, and title.
Next, there is a field controlling tag pairs that must not be linewrapped. By default, this
contains the following tags: a, abbr,
acronym, b, big, center,
del, em, font, i,
ins, q, s, samp,
small, span, strike, strong,
sub, sup, tt, and u.
Next, there is a field controlling which end-of-line (EOL) characters to use.
By default, this is set to:
Browser/OS.
Next, there is a field controlling whether to display the cleaned HTML in an internal frame.
By default, this is set to true.
An editable comment can be added to the HTML source, or omitted altogether by blanking it.
Its default value is:
.