Markdown Extra Spec: Parsing Section
    Michel Fortin 
    michel.fortin at michelf.com
       
    Sun May 11 22:26:33 EDT 2008
    
    
  
On 2008-05-11 at 20:55, Jacob Rus wrote:
> You should write it in something closer to a BNF-like format.  The  
> current version is about 10x more verbose than necessary, and it  
> makes reading the spec considerably more difficult.
The reason I'm doing it like this is that I doubt everything will be
expressible in a BNF format. Using plain English descriptions allows
me to not bother about fitting things to a specific grammar and just
write what I feel is the most natural and the easiest to understand.
Shopping for a more formal and less verbose grammar, if we need one,  
will be much easier once we know what we need, once we can compare  
existing grammars against a checklist of what is necessary to  
implement the given parsing algorithm.
If you remember the timetable I've given, you'll see that I've booked
about half a year for polishing things up. This includes rephrasing
sentences, refactoring the syntax, and reformatting the spec to make
it easier to understand. This *could* include switching to a new
grammar format if it makes things more intuitive and readable.
> Also, you're still going to have quite a few sticky edge cases with  
> your current parsing model.  What happens when we have a `<>`- 
> delimited URL inside a blockquote?  For instance:
>
> > what about this <http://
> > google.com/> case?
Well, currently newlines aren't allowed inside automatic links in
Markdown.pl, PHP Markdown and some others. Implementations that do see
an automatic link there treat it as a link to either
"http:// google.com/" (notice the space) or "http://" (notice what's
missing).
  <http://babelmark.bobtfish.net/?markdown=%0D%0A%3E+what+about+this+%3Chttp%3A%2F%2F%0D%0A%3E+google.com%2F%3E+case%3F&normalize=on&src=1&dest=2>
Anyway, with the three-pass parsing model I'm currently defining,
it's pretty trivial to handle correctly: the block element pass
extracts the text of the blockquote, leaving this for the span element
pass to parse:
     what about this <http://
     google.com/> case?
The span element pass would then see an autolink and just ignore any  
newline it finds in the URL.
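
To make the two passes concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not code from the spec or from any existing implementation: the function names, the autolink regular expression, and the choice to also drop whitespace around the newline are all hypothetical, and HTML escaping is left out.

```python
import re

# Hypothetical autolink pattern: "<", a URL scheme, anything except angle
# brackets (newlines included), then ">". Not taken from the spec.
AUTOLINK = re.compile(r"<((?:https?|ftp)://[^<>]*)>")


def extract_blockquote_text(block):
    """Block element pass (sketch): strip one level of '>' markers."""
    lines = block.split("\n")
    return "\n".join(re.sub(r"^ {0,3}> ?", "", line) for line in lines)


def render_autolinks(text):
    """Span element pass (sketch): turn <url> into a link, ignoring newlines."""
    def replace(match):
        # Assumption: drop each newline together with the whitespace around it.
        url = re.sub(r"\s*\n\s*", "", match.group(1))
        return '<a href="{0}">{0}</a>'.format(url)
    return AUTOLINK.sub(replace, text)


source = "> what about this <http://\n> google.com/> case?"
inner = extract_blockquote_text(source)  # the span pass never sees the '>' markers
html = render_autolinks(inner)
print(html)
```

Because the blockquote markers are already gone when the span pass runs, the second `>` at the start of the line cannot be mistaken for the end of the URL.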
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/
    
    