evolving the spec (was: forking Markdown.pl?)
    Michel Fortin 
    michel.fortin at michelf.com
       
    Tue Mar  4 23:02:05 EST 2008
    
    
  
On 2008-03-04 at 0:49, Allan Odgaard wrote:
> On 3 Mar 2008, at 13:30, Michel Fortin wrote:
>
>> [...]
>>> 1. A regexp that makes the parser enter the context the rule
>>> represents (e.g. block quote, list, raw, etc.).
>>>
>>> 2. A list of which rules are allowed in the context of this rule.
>>>
>>> 3. A regexp for leaving the context of this rule.
>>>
>>> 4. A regexp which is pushed onto a stack when entering the context
>>> of this rule, and popped again when leaving this rule.
>>>
>>> The fourth item here is really the interesting part, because it is
>>> what made Markdown nesting work (99% of the time) despite this being
>>> 100% rule-driven.
>>
>> I'm not sure what the regular expression in 4 does, besides being
>> pushed and popped from the stack
>
> Yeah, I accidentally sent the letter w/o noticing I forgot to  
> explain the fourth rule.
>
> [big explanation]
So you're basically using a line-by-line approach. I was thinking  
about that as a possibility for parsing blocks, but I don't think I'll  
do that because I need backtracking to be able to rewind beyond the  
current line. Or can your approach do that?
I'm particularly curious about how you can handle headers of this form:
     Header
     ======
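One way a line-based parser could cope with this without rewinding is to peek at the next line before committing to the current one. Below is a minimal Python sketch of that idea. The names `SETEXT_UNDERLINE` and `split_blocks` are hypothetical and not taken from either parser, and it only illustrates one-line lookahead, not general backtracking.

```python
import re

# Hypothetical underline pattern: a run of '=' (level 1) or '-' (level 2).
SETEXT_UNDERLINE = re.compile(r'^(=+|-+)[ \t]*$')

def split_blocks(lines):
    """Yield (kind, content) pairs, scanning line by line with a single
    line of lookahead and no rewinding."""
    i = 0
    while i < len(lines):
        line = lines[i]
        nxt = lines[i + 1] if i + 1 < len(lines) else None
        m = SETEXT_UNDERLINE.match(nxt) if nxt is not None else None
        if line.strip() and m:
            # The next line is an underline, so this line is a header.
            level = 'h1' if m.group(1)[0] == '=' else 'h2'
            yield (level, line.strip())
            i += 2  # consume the header text and its underline
        else:
            yield ('text', line)
            i += 1

blocks = list(split_blocks(["Header", "======", "body"]))
```

This covers the two-line header form, but anything that needs to reinterpret more than one earlier line would still require real backtracking.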
> Now take the rule for block quote:
>
>    BQ[1] = /\g {,3}> {,3}/    # We start it for lines with > allowing
>                               # up to 3 spaces before/after.
>
>    BQ[2] = [ BQ, RAW, PAR, … ] # Basically all block elements
>                                # can go inside block quote.
>
>    BQ[3] = /\g( *$|«hr»)/     # We leave block quote at empty lines or
>                               # horizontal rulers¹. The actual pattern for
>                               # «hr» is something like:
>                               #     [ ]{,3}(?<M>[-*_])([ ]{,2}\k<M>){2,}[ \t]*+$
>
>    BQ[4] = /\g( {,3}> ?)?/    # While in BQ eat leading quote characters.
>
> ¹ I am actually not sure if this is “the spec” or just a bug. But
> placing a horizontal ruler just below a block quoted paragraph does
> not trigger the expected “lazy mode”, which would place the <hr>
> inside the block quote; instead it leaves the block quote.
I'm not sure what the problem is with horizontal rules in blockquotes.  
I've tried many variations of:
     > test
     >
     > ***
     >
     > test
and couldn't make it end the blockquote prematurely. If it did, I'd  
say it'd be a bug because I see no way the user would expect the  
horizontal rule to break the blockquote and no reason for the parser  
to do so either.
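To make the two cases concrete, here is a minimal Python sketch of the quoted block quote rule, limited to a single nesting level. Several things in it are assumptions rather than either parser's actual behaviour: `\g` is approximated by an anchored `match`, the possessive `*+` is written as `*`, the named group uses Python's `(?P<M>...)` syntax, and the leave pattern is assumed to be tried on the raw line before the quote marker is eaten. The function name `classify` and the labels are hypothetical.

```python
import re

# Python translations of the quoted patterns (see assumptions above).
HR = re.compile(r'[ ]{,3}(?P<M>[-*_])([ ]{,2}(?P=M)){2,}[ \t]*$')
BQ_ENTER = re.compile(r' {,3}> {,3}')                # BQ[1]
BQ_LEAVE = re.compile(r'( *$|' + HR.pattern + ')')   # BQ[3]
BQ_EAT = re.compile(r'( {,3}> ?)?')                  # BQ[4]

def classify(lines):
    """Label each line as text, blank or hr, prefixed with 'quote:'
    while the block quote context is open."""
    in_quote = False
    labels = []
    for line in lines:
        # Assumption: the leave pattern sees the line before this
        # context's own quote marker has been eaten.
        if in_quote and BQ_LEAVE.match(line):
            in_quote = False
        if not in_quote and BQ_ENTER.match(line):
            in_quote = True
        # While in the quote, eat the (optional) leading marker.
        rest = line[BQ_EAT.match(line).end():] if in_quote else line
        if HR.match(rest):
            kind = 'hr'
        elif not rest.strip():
            kind = 'blank'
        else:
            kind = 'text'
        labels.append(('quote:' if in_quote else '') + kind)
    return labels

quoted_hr = classify(["> test", ">", "> ***", ">", "> test"])
bare_hr = classify(["> test", "***"])
```

Under these assumptions, a rule written as `> ***` stays inside the block quote, while a bare `***` directly after a quoted paragraph closes it, which seems to be the case the footnote describes.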
> [...]
>
> Okay, enough writing — I hope the above gives a better understanding  
> of how the rules are used.
Indeed, it was quite insightful. Thank you.
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/
    
    