Optional features (was: Markdown Extra Specification (First Draft))

Yuri Takhteyev qaramazov at gmail.com
Sat May 24 15:34:04 EDT 2008



> It seems to me that filtering is a red herring in your case. If
> you want to allow users to enter literal tags, you will have this
> problem whether you filter the ultimate output or not.


If I want to allow them, then yes, but this is not the case I was
considering. Suppose I do _not_ want to allow them to enter HTML
tags. This is easy to implement as an option in a Markdown converter.
However, if the converter doesn't do that, then I have a much harder
task: the user's tags are now mixed with Markdown's tags, and I have
to figure out how to sort them out. There _is_ a difference between
the <em> inserted by Markdown and the <em> inserted by the user. I
know Markdown's <em> will be balanced. I am not sure that the user's
will be. At this point the only way to be sure that the HTML is valid
is to parse it.
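
To illustrate what "parse it" means for the converter's output, here is
a minimal sketch in Python. It assumes the output is meant to be XHTML,
so a strict XML parser can serve as the check. The function name
`is_well_formed` and the wrapping <div> are hypothetical, chosen only
for this example. Note that a plain XML parser rejects HTML named
entities such as &nbsp;, so a real check would need more care.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the HTML fragment parses as well-formed XML.

    The fragment is wrapped in a single root element, because an XML
    document must have exactly one root.
    """
    try:
        ET.fromstring(f"<div>{fragment}</div>")
    except ET.ParseError:
        return False
    return True


# Markdown's own <em> is balanced; a user's stray <em> may not be.
balanced = is_well_formed("<p>Some <em>emphasis</em> here.</p>")
unbalanced = is_well_formed("<p>Some <em>emphasis here.</p>")
```

The check only says whether the mixed output is well-formed. It cannot
say which tags came from the user and which from Markdown.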


> If your XHTML parser has a streaming input mode, you can couple
> your Markdown converter directly to the XHTML parser and feed the
> HTML output to it as you go. If the XHTML parser throws a
> well-formedness error, you can then relate it to the vicinity of the
> last Markdown chunk you converted to HTML and passed into the
> XHTML parser.


I am not quite sure what you mean, but Markdown documents can't always
be processed on a chunk-by-chunk basis. Consider:

    Here is a [link][id].

    ... 100KB of text...

    [id]: http://example.com/ "Optional Title Here"

This document cannot be processed correctly unless the whole of it is
available at once: the link in the first line cannot be resolved until
the definition at the end has been read.
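
The reference-link example above can be sketched as a two-pass process
in Python. This is a simplified illustration, not how any particular
Markdown implementation works: the regular expressions cover only the
basic `[text][id]` and `[id]: url "title"` forms, and the name
`resolve_references` is made up for this example.

```python
import html
import re

# Simplified patterns; real Markdown allows more variations.
REF_DEF = re.compile(
    r'^\[([^\]]+)\]:\s*(\S+)(?:\s+"([^"]*)")?\s*$', re.MULTILINE
)
REF_USE = re.compile(r"\[([^\]]+)\]\[([^\]]+)\]")


def resolve_references(source: str) -> str:
    # Pass 1: collect every definition, wherever it appears.
    defs = {
        m.group(1).lower(): (m.group(2), m.group(3))
        for m in REF_DEF.finditer(source)
    }
    body = REF_DEF.sub("", source)

    # Pass 2: replace each use with a link, now that all ids are known.
    def link(m: re.Match) -> str:
        text, ref = m.group(1), m.group(2).lower()
        if ref not in defs:
            return m.group(0)  # unknown id: leave the text untouched
        url, title = defs[ref]
        title_attr = f' title="{html.escape(title)}"' if title else ""
        return f'<a href="{html.escape(url)}"{title_attr}>{text}</a>'

    return REF_USE.sub(link, body)


whole = (
    "Here is a [link][id].\n\n"
    '[id]: http://example.com/ "Optional Title Here"\n'
)
first_chunk_only = "Here is a [link][id].\n"
```

Calling `resolve_references(whole)` produces the link, while
`resolve_references(first_chunk_only)` has to leave `[link][id]` as it
is, because the definition has not been seen yet.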


> If you don't want to couple the Markdown converter with an XHTML
> parser that closely, it's still possible to do this, but the
> Markdown converter will have to be able to accept streaming input
> itself and will need to generate output sufficiently frequently
> that you can track the correlation of input and output with a
> useful amount of precision.


Sure, if you want to drop support for references, footnotes, etc. But
it's much simpler to implement a "safe mode" that escapes or validates
all HTML submitted by the user.
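
As a rough illustration of the "escape" variant, here is a minimal
Python sketch that neutralizes anything that looks like the start of a
tag before the text reaches the converter. The function name
`escape_raw_html` is hypothetical. The sketch is deliberately crude: it
would also escape autolinks such as <http://example.com/> and would
double-escape tags inside code spans, which is part of why the option
is easier to get right inside the converter itself.

```python
import re

# Anything that looks like the start of a tag or comment:
# "<em", "</em", "<!--". A bare "<" as in "1 < 2" is left alone.
_TAG_START = re.compile(r"<(?=[A-Za-z/!])")


def escape_raw_html(markdown_source: str) -> str:
    """Escape the opening "<" of anything resembling an HTML tag."""
    return _TAG_START.sub("&lt;", markdown_source)


escaped = escape_raw_html("Some <em>user</em> text with *emphasis*.")
```

After this step every <em> in the output comes from Markdown, so it is
known to be balanced.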

- yuri

--
http://sputnik.freewisdom.org/
