evolving the spec (was: forking Markdown.pl?)

Michel Fortin michel.fortin at michelf.com
Tue Mar 4 23:02:05 EST 2008


On 2008-03-04 at 0:49, Allan Odgaard wrote:

> On 3 Mar 2008, at 13:30, Michel Fortin wrote:
>
>> [...]
>>> 1. A regexp that makes the parser enter the context the rule
>>> represents (e.g. block quote, list, raw, etc.).
>>>
>>> 2. A list of which rules are allowed in the context of this rule.
>>>
>>> 3. A regexp for leaving the context of this rule.
>>>
>>> 4. A regexp which is pushed onto a stack when entering the context
>>> of this rule, and popped again when leaving this rule.
>>>
>>> The fourth item here is really the interesting part, because it is
>>> what made Markdown nesting work (99% of the time) despite this being
>>> 100% rule-driven.
>>
>> I'm not sure what the regular expression in 4 does, besides being
>> pushed and popped from the stack
>
> Yeah, I accidentally sent the letter w/o noticing I forgot to
> explain the fourth rule.
>
> [big explanation]


So you're basically using a line-by-line approach. I was thinking
about that as a possibility for parsing blocks, but I don't think I'll
do that because I need backtracking to be able to rewind beyond the
current line. Or can you do it?

I'm particularly curious about how you can handle headers of this form:

Header
======


> Now take the rule for block quote:
>
>    BQ[1] = /\g {,3}> {,3}/      # We start it for lines with > allowing
>                                 # up to 3 spaces before/after.
>
>    BQ[2] = [ BQ, RAW, PAR, … ]  # Basically all block elements
>                                 # can go inside block quote.
>
>    BQ[3] = /\g( *$|«hr»)/       # We leave block quote at empty lines or
>                                 # horizontal rulers¹. The actual pattern
>                                 # for «hr» is something like:
>                                 # [ ]{,3}(?<M>[-*_])([ ]{,2}\k<M>){2,}[ \t]*+$
>
>    BQ[4] = /\g( {,3}> ?)?/      # While in BQ eat leading quote characters.
>
> ¹ I am actually not sure if this is “the spec” or just a bug. But
> placing a horizontal ruler just below a block quoted paragraph does
> not give the expected “lazy mode”, which would place the <hr> inside
> the block quote; instead it leaves the block quote.


I'm not sure what the problem is with horizontal rules in blockquotes.
I've tried many variations of:

> test
>
> ***
>
> test

and couldn't make it end the blockquote prematurely. If it did, I'd
say it'd be a bug because I see no way the user would expect the
horizontal rule to break the blockquote and no reason for the parser
to do so either.


> [...]
>
> Okay, enough writing; I hope the above gives a better understanding
> of how the rules are used.


Indeed, it was quite insightful. Thank you.
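
For reference, the four-part block quote rule quoted above could be sketched in Python roughly as follows. This is only an illustration under assumptions, not Allan's implementation: the `Rule` class, its field names and the `strip_context` helper are hypothetical, `\g` is read as "match at the current position" (emulated here with `Pattern.match(line, pos)`), and the possessive `*+` in the «hr» pattern is replaced by a plain `*`.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    enter: re.Pattern    # 1. regexp that enters the context of this rule
    allowed: list        # 2. names of the rules allowed inside this context
    leave: re.Pattern    # 3. regexp that leaves the context of this rule
    stacked: re.Pattern  # 4. regexp pushed on the stack while inside


# Approximation of the quoted «hr» pattern, in Python named-group syntax.
HR = r"[ ]{,3}(?P<M>[-*_])(?:[ ]{,2}(?P=M)){2,}[ \t]*$"

BQ = Rule(
    name="BQ",
    enter=re.compile(r" {,3}> {,3}"),
    allowed=["BQ", "RAW", "PAR"],  # the original list is longer (elided)
    leave=re.compile(r"(?: *$|" + HR + ")"),
    stacked=re.compile(r"(?: {,3}> ?)?"),
)


def strip_context(line, stack):
    """Consume one stacked prefix per open context; return the rest."""
    pos = 0
    for rule in stack:
        # The stacked pattern is optional, so match() always succeeds.
        pos = rule.stacked.match(line, pos).end()
    return line[pos:]
```

With two nested block quotes on the stack, `strip_context("> > text", [BQ, BQ])` leaves `text`, while a line without any quote marker passes through unchanged because the stacked pattern is allowed to match nothing.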


Michel Fortin
michel.fortin at michelf.com
http://michelf.com/




More information about the Markdown-Discuss mailing list