using markdown in a forum?

Louis-David Mitterrand vindex+lists-markdown-discuss at apartia.org
Thu May 6 07:24:38 EDT 2010


On Wed, May 05, 2010 at 07:49:31PM +0200, Aristotle Pagaltzis wrote:

> * Louis-David Mitterrand <vindex+lists-markdown-discuss at apartia.org> [2010-05-05 16:05]:
> > What would be a "reasonable defaults" whitelist for html tags
> > in a forum context?
>
> All the tags Markdown has syntax for:
>
> em strong a img code br
> p ul ol li blockquote pre h1 h2 h3 h4 h5 h6
>
> Plus a few very reasonable extras:
>
> i b cite del ins
> dl dd dt
>
> Attributes that should be allowed:
>
> a: href title
> img: src alt title
> ol: start
> blockquote: cite
>
> That's the minimal reasonable set, I think.
>
> You may or may not want to also whitelist the table-related tags:
>
> table tr td th
> tbody tfoot thead caption
>
> Most of their possible attributes should be allowed in that case.
>
> For those, you'll need to tidy the HTML, not just scrub it, else
> people will be able to break your layout in malicious ways.
>
> You ***DON'T*** want to whitelist the `style` attribute under any
> circumstances, unless you also have a very very very careful CSS
> scrubber, because otherwise it's possible to inject Javascript
> that way.
>
> You'll also want to validate `a@href` values to keep people from
> putting `javascript:` URIs or similar foolishness in there. If in
> doubt, allow too little.
>


Thank you, Aristotle, for the detailed and informative answer. Very useful
indeed.
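
To make the minimal whitelist concrete, here is a rough sketch of it written
down as data, with a small filter built on Python's standard html.parser
module rather than HTML::Scrubber. The names ALLOWED_TAGS, ALLOWED_ATTRS,
WhitelistScrubber and scrub are made up for illustration, and the table tags
are left out on purpose.

```python
from html import escape
from html.parser import HTMLParser

# Tags and attributes transcribed from the minimal set quoted above.
ALLOWED_TAGS = set(
    "em strong a img code br "
    "p ul ol li blockquote pre h1 h2 h3 h4 h5 h6 "
    "i b cite del ins dl dd dt".split()
)
ALLOWED_ATTRS = {
    "a": {"href", "title"},
    "img": {"src", "alt", "title"},
    "ol": {"start"},
    "blockquote": {"cite"},
}
VOID_TAGS = {"br", "img"}  # elements that never get a closing tag


class WhitelistScrubber(HTMLParser):
    """Hypothetical helper: drops tags and attributes not on the whitelist."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            return  # drop the tag itself; its text content is escaped and kept
        allowed = ALLOWED_ATTRS.get(tag, set())
        kept = "".join(
            f' {name}="{escape(value or "", quote=True)}"'
            for name, value in attrs
            if name in allowed
        )
        self.out.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Character references were decoded by the parser, so re-escape text.
        self.out.append(escape(data, quote=False))


def scrub(html_text):
    parser = WhitelistScrubber()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


print(scrub('<p style="color:red" onclick="x()">hi <b>there</b></p>'))
# prints: <p>hi <b>there</b></p>
```

This sketch covers only the whitelist part of the advice: it does not
validate href or src values, and it does not rebalance or tidy unclosed tags.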

Fortunately, HTML::Scrubber can check attribute values against a regexp,
which makes it possible to reject specific ones:

'href' => qr{^(?!(?:java)?script)}i,
'src' => qr{^(?!(?:java)?script)}i,
etc.
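
As a sanity check on what that pattern lets through, here is a small Python
sketch (the lookahead syntax is the same as in Perl). It compares the
deny-style pattern with a scheme allow-list, in the spirit of "if in doubt,
allow too little". The helper names and the SAFE_SCHEMES set are assumptions
made for illustration, not part of HTML::Scrubber.

```python
import re
from urllib.parse import urlsplit

# Same pattern as the qr{...}i rules above, in Python syntax.
DENY_SCRIPT = re.compile(r"^(?!(?:java)?script)", re.IGNORECASE)

# Assumption: a conservative default for a forum; adjust to taste.
SAFE_SCHEMES = {"http", "https", "mailto"}

# Browsers ignore leading/trailing control characters and spaces in URLs,
# and drop tabs and newlines anywhere, so the check has to do the same.
_C0_AND_SPACE = "".join(chr(c) for c in range(0x21))


def passes_denylist(url):
    """True if the value would be accepted by the deny-style pattern."""
    return DENY_SCRIPT.search(url) is not None


def passes_allowlist(url):
    """Hypothetical stricter check: relative URLs or a known-safe scheme."""
    cleaned = re.sub(r"[\t\r\n]", "", url).strip(_C0_AND_SPACE)
    try:
        scheme = urlsplit(cleaned).scheme.lower()
    except ValueError:
        return False  # unparsable input is rejected outright
    return scheme == "" or scheme in SAFE_SCHEMES


for candidate in ("http://example.com/", "JavaScript:alert(1)",
                  " javascript:alert(1)", "vbscript:msgbox(1)"):
    print(repr(candidate), passes_denylist(candidate), passes_allowlist(candidate))
```

The deny pattern rejects "JavaScript:alert(1)" but accepts the variant with a
leading space as well as "vbscript:", while the allow-list rejects all three.
That suggests the regexp rules are a useful first line of defence rather than
a complete one.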


--
http://www.cruisefish.net

