Allan Odgaard 1EDF4D33-D1B1-4C97-A393-3D2B4EE5E095+Markdown at
Sun May 2 08:23:28 EDT 2010

On 2 May 2010, at 14:01, Aristotle Pagaltzis wrote:

>> […] you want to filter out HTML tags […]

> […] And it’s not impossible to write a 100% solid filter if you use
> a *white*list applied to a real HTML parser.

Not sure what you mean by “real HTML parser”.

One thing to watch out for is malformed HTML that results when users
type a literal ‘<’. I had a lot of users lose part of their comments
because everything after a standalone ‘<’ was incorrectly filtered out.
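
A minimal sketch of that failure mode, in Python rather than PHP. The
filter here is hypothetical (it is not the actual WordPress or
PHPMarkdown code); it simply treats every ‘<’ as the start of a tag.
The `escape_stray_lt` helper is likewise an assumption, showing one way
to protect a standalone ‘<’ before such a filter runs:

```python
import re

def naive_strip(text):
    # Hypothetical filter: every '<' is assumed to open a tag, and
    # everything up to the next '>' (or the end of the text) is dropped.
    return re.sub(r"<[^>]*(>|$)", "", text)

# A '<' that is NOT followed by something tag-shaped:
# optional '/', a letter, then anything up to a closing '>'.
STRAY_LT = re.compile(r"<(?!/?[A-Za-z][^<>]*>)")

def escape_stray_lt(text):
    # Turn standalone '<' into an entity so the filter leaves it alone.
    return STRAY_LT.sub("&lt;", text)

comment = "I think 1 < 2, but what do I know"

print(naive_strip(comment))
# I think 1

print(naive_strip(escape_stray_lt(comment)))
# I think 1 &lt; 2, but what do I know

print(naive_strip(escape_stray_lt("a <b>bold</b> c")))
# a bold c
```

The pre-escaping step only decides whether a ‘<’ looks like a tag; it
does not decide which tags are allowed, so it complements a whitelist
rather than replacing one.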

This was with WordPress + PHPMarkdown (blog comments). What made it
worse was that it was the filtered content that went into the
database, so once filtered, the original text was gone.
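
One way to avoid that kind of permanent loss is to store what the user
typed and apply the filter only when rendering. The sketch below
assumes an in-memory SQLite table and a hypothetical `lossy_filter`
standing in for the comment filter; none of these names come from
WordPress:

```python
import re
import sqlite3

def lossy_filter(text):
    # Hypothetical stand-in for a filter that eats everything after a
    # standalone '<'.
    return re.sub(r"<[^>]*(>|$)", "", text)

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, raw TEXT)")

def save_comment(raw):
    # Store the comment exactly as typed; no filtering at write time.
    cur = db.execute("INSERT INTO comments (raw) VALUES (?)", (raw,))
    return cur.lastrowid

def load_raw(comment_id):
    row = db.execute(
        "SELECT raw FROM comments WHERE id = ?", (comment_id,)
    ).fetchone()
    return row[0]

def render_comment(comment_id, filter_fn=lossy_filter):
    # Filter on the way out, so a buggy filter can be fixed later
    # without having destroyed the stored text.
    return filter_fn(load_raw(comment_id))

cid = save_comment("1 < 2 is true")
print(render_comment(cid))  # the filtered (damaged) view
print(load_raw(cid))        # the original is still intact
```

With this layout a filter bug still mangles the displayed comment, but
replacing the filter later restores it, since the raw text was never
overwritten.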

