evolving the spec (was: forking Markdown.pl?)
    John Fraser 
    john at attacklab.net
       
    Mon Mar  3 14:37:14 EST 2008
    
    
  
On Mar 3, 2008, at 7:30 AM, Michel Fortin wrote:
> Allan Odgaard wrote:
>> 4. A regexp which is pushed onto a stack when entering the context of
>> this rule, and popped again when leaving this rule.
>>
>> The fourth item here is really the interesting part, because it is
>> what made Markdown nesting work (99% of the time) despite this being
>> 100% rule-driven.
>
> I'm not sure what the regular expression in 4 does, besides being
> pushed and popped from the stack (perhaps it's the end-of-block
> expression), but overall it looks pretty good, and is pretty similar
> to how I'm currently approaching the problem. There are a couple of
> subtleties I'm not sure these rules can catch, though.

I assume Allan let the grammar refer back to this stack as if it were  
an ordinary rule, so you could use the stack to collect levels of  
indentation.  It's like a limited kind of parameterization.  I'd been  
planning to use recursive transformation to handle nesting, since it  
makes memoization easier and ought to be a little more readable.  But  
I'll try Allan's idea if mine gets hairy.
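A minimal sketch of how a stack of regexps might track nesting. This is a guess at the idea, not Allan's actual design: it assumes each container rule pushes the prefix that every continuation line must match while inside it. The names BLOCKQUOTE, INDENT, strip_prefixes and nesting_depth are hypothetical.

```python
import re

# Hypothetical prefixes (assumptions for illustration only): one regexp
# is pushed when a container rule is entered and popped when it is left.
BLOCKQUOTE = re.compile(r"> ?")
INDENT = re.compile(r" {4}")


def strip_prefixes(line, stack):
    """Consume every regexp on the stack, outermost first.

    Returns the rest of the line, or None if some level fails to match,
    which means the line has left that context.
    """
    pos = 0
    for rx in stack:
        m = rx.match(line, pos)
        if m is None:
            return None
        pos = m.end()
    return line[pos:]


def nesting_depth(line, rx=BLOCKQUOTE):
    """Count how many copies of rx can be pushed before the line stops matching."""
    stack = []
    while strip_prefixes(line, stack + [rx]) is not None:
        stack.append(rx)
    return len(stack)
```

With the stack [BLOCKQUOTE, INDENT], a line has to carry a blockquote marker and then four spaces of indentation to stay inside both contexts, so the stack acts as a rule whose meaning depends on how deeply the parser is nested.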

I like the direction you're both going, and I'm hoping we can come up
with a definition that doesn't use any English at all.  Admittedly,
that'll be a lot easier for a version that does change some behavior
at the edges -- like ditching Markdown's undocumented 'precedence'
rules
(<http://six.pairlist.net/pipermail/markdown-discuss/2007-August/000746.html>).

I'm going to build my own little prototype to experiment with this
stuff
(<http://six.pairlist.net/pipermail/markdown-discuss/2008-February/001042.html>).
My goal is to come up with a formal grammar that doubles as a
(slow) reference implementation.  You'll feed a grammar and an input
file into a generic text-munging tool, which will spit out either the
transformed output or an AST.  The tool will be small, easy to port,
and completely general -- you could use it to implement html2txt or
smartypants or an HTML sanitizer, for example.  That's the plan,
anyway; we'll see how the first iteration turns out.
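To make the "grammar in, output out" shape concrete, here is a deliberately tiny stand-in. It is not the prototype described above: it assumes the "grammar" is just an ordered list of (regexp, replacement) pairs, which is far weaker than a formal grammar and produces no AST. The names SMARTY_RULES and transform are hypothetical.

```python
import re

# Hypothetical "grammar": plain data, separate from the tool that runs it.
SMARTY_RULES = [
    (r"\.\.\.", "\u2026"),   # three dots -> ellipsis character
    (r"\(c\)", "\u00a9"),    # (c) -> copyright sign
]


def transform(grammar, text):
    """Generic tool: apply each rule of the grammar to the text, in order."""
    for pattern, replacement in grammar:
        text = re.sub(pattern, replacement, text)
    return text
```

The point of the split is only that transform knows nothing about any particular syntax; swapping in a different rule list changes what the tool does without touching its code.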

> The way I see it, rules need to be parametrized so the above can be
> changed without having to define 2^(number of syntax elements)
> rules, such as EmphasisWithinLink, LinkWithinEmphasis,
> CodeSpanWithinLinkWithinEmphasis, and so on.

Since I'm doing something packrat-ish, I'm hoping I can use lookahead  
to keep the rules from exploding.
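A minimal sketch of the packrat-plus-lookahead idea, under stated assumptions: a toy grammar with a single emphasis rule, written by hand rather than generated from a grammar file. It is not the prototype mentioned in this message, and Packrat, star, plain_char, emph and tokenize are hypothetical names. The negative lookahead in plain_char is what lets one character rule serve inside emphasis, instead of a separate per-context rule.

```python
class Packrat:
    """Memoizes every (rule, position) result, as a packrat parser does."""

    def __init__(self, text):
        self.text = text
        self.memo = {}

    def apply(self, rule, pos):
        key = (rule, pos)
        if key not in self.memo:
            self.memo[key] = rule(self, pos)
        return self.memo[key]


def star(p, pos):
    # Literal '*'; returns the position after it, or None on failure.
    return pos + 1 if p.text.startswith("*", pos) else None


def plain_char(p, pos):
    # !'*' .   (negative lookahead, then consume any one character)
    if pos < len(p.text) and p.apply(star, pos) is None:
        return pos + 1
    return None


def emph(p, pos):
    # '*' plain_char+ '*'
    start = p.apply(star, pos)
    if start is None:
        return None
    end = start
    while (nxt := p.apply(plain_char, end)) is not None:
        end = nxt
    if end == start:
        return None  # empty body: not emphasis
    return p.apply(star, end)


def tokenize(text):
    # Inline <- Emph / .
    p, pos, out = Packrat(text), 0, []
    while pos < len(text):
        end = p.apply(emph, pos)
        if end is not None:
            out.append(("em", text[pos + 1:end - 1]))
            pos = end
        else:
            out.append(("char", text[pos]))
            pos += 1
    return out
```

For example, tokenize("a*b*") yields [("char", "a"), ("em", "b")], while an unclosed "*a" falls back to two plain characters. Because results are cached per rule and position, retrying a rule after a failed alternative costs a dictionary lookup rather than a re-parse.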
John Fraser
    
    