Moving Markdown towards a standard syntax

John MacFarlane jgm at berkeley.edu
Fri Aug 15 01:10:06 EDT 2014


Waylan,

Thanks for your comments!  I'm glad it looks generally useful.

>" Tabs in lines are expanded to spaces, with a tab stop of 4 characters." We
>strictly enforce this rule in Python-Markdown and I like it, but we have
>received bug reports from time to time that certain languages require tabs
>(spaces would be a syntax error). I think makefiles would be the most
>well-known example. For example, how would you expect someone to be able to
>copy from a code block in a blog post and paste into a makefile without
>needing to go back and edit all the whitespace? If they are copying and
>pasting, they are not likely to be an advanced user and significant
>whitespace is already one of the most non-obvious gotchas. Just an
>observation here. The answer isn't clear to me either.

Yes, this is something that has always bothered me about the markdown
rules.  In pandoc I have an option to leave tabs intact, and the parser
knows how to handle tabs, so it can be done.  But it would certainly
complicate the spec.
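
For illustration, the tab rule quoted above can be sketched in a few
lines of Python.  This is only a sketch of the rule as stated, not code
taken from pandoc or Python-Markdown; the name expand_tabs is made up
for the example.

```python
def expand_tabs(line: str, tab_stop: int = 4) -> str:
    """Replace each tab with spaces up to the next tab stop.

    Hypothetical helper: assumes `line` holds a single line with no
    newline characters, and that tab stops fall every `tab_stop` columns.
    """
    out = []
    col = 0  # current output column
    for ch in line:
        if ch == "\t":
            # Pad to the next multiple of tab_stop (always at least one space).
            pad = tab_stop - (col % tab_stop)
            out.append(" " * pad)
            col += pad
        else:
            out.append(ch)
            col += 1
    return "".join(out)
```

Applied to a makefile recipe line such as "\tgcc -o foo foo.c", this
returns the same text with four leading spaces instead of the tab,
which is exactly the copy-and-paste problem described above.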

>I notice that you state that an HTML block ends with a blank line. I have
>always wished that it worked that way for the very reasons you cited. As you
>observe, things like raw `<pre>` blocks can't have blank lines (might want
>to add comments, processing instructions and CDATA to that list along with
>workarounds??). Either way is a compromise and it is not clear to me which
>is the better way to go.

Agreed.  There are pros and cons either way.  I could be persuaded to
go with something more standard (looking for matching closing tags),
but this approach has an appealing simplicity and flexibility.
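
A minimal sketch of the "ends at a blank line" rule, in Python, may
make the trade-off concrete.  It assumes the caller has already decided
that an HTML block starts at index `start`; how that start is detected
is not covered, and split_html_block is a hypothetical name.

```python
def split_html_block(lines, start):
    """Split `lines` at the first blank line at or after `start`.

    Returns (block, rest): `block` is the HTML block under the
    blank-line rule, `rest` is everything from the blank line onward.
    No attempt is made to match opening and closing tags.
    """
    end = start
    while end < len(lines) and lines[end].strip() != "":
        end += 1
    return lines[start:end], lines[end:]
```

With the input ["<pre>", "a", "", "b", "</pre>"] the block stops after
"a", so the rest of the `<pre>` content is left to be parsed as
ordinary Markdown.  That is the cost of the simpler rule.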

>" A blank line always separates block quotes" Brilliant!
>
>I absolutely love what you did with how much indentation indicates nesting
>within a list (for all block elements, not just nested lists). However, I
>expect most people will have trouble getting it right in practice. And I
>wouldn't want to write a parser for that. But I sure would enjoy writing
>lists with it.

I think the way the rules for lists are written is currently pretty hard
to understand, but this is a writing issue that can be improved.  For
most people, the kind of informal presentation given in John Gruber's
syntax document should be enough to get them up and running.  I designed
these rules so that lists that look natural should be normally parsed in
the way their authors expect, so authors shouldn't need to think too
hard about the rules.

As for writing a parser:  I believe the algorithm used in my JavaScript
implementation could be easily ported over to other dynamic languages.

>" Two blank lines will end a list." Really? What about a code block nested
>within a list item that contains multiple blank lines? If it wasn't for that
>corner case, I would love this too. Or is there an exception for that?
>Example 198 seems to indicate so, but I don't see it explicitly stated
>anywhere. Is it for fenced code blocks only (because you can look for the
>closing fence -- if so, makes sense to me) or does it work with indented
>code blocks also?

This is something that needs clarification, thanks.  The C
implementation allows multiple blank lines in fenced code blocks, and
the js implementation should too (but currently doesn't).  But I see
there's nothing about it in the spec.

It would probably not be good to make indented code blocks behave the
same way, since one of the reasons for the two-blanks rule is to deal
with cases involving indented code.

I'd also be open to the idea of dropping the two-blanks rule, which
adds additional complexity.  But it is something that seems to come up
a lot, and without it you sometimes need artifices like HTML comments
to split up lists or separate lists from indented code blocks.
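
As a rough sketch of how the two-blanks rule and the fenced-code
exception could interact (in Python, with heavy simplifications: only
backtick fences are recognized, indentation is ignored, and list_end is
a hypothetical name, not a function from either implementation):

```python
FENCE = "`" * 3  # three backticks; other fence styles are ignored here


def list_end(lines):
    """Return the index of the first line after the list.

    Simplified rule: two consecutive blank lines end the list, unless
    they occur inside a fenced code block.
    """
    in_fence = False
    blanks = 0  # consecutive blank lines seen outside any fence
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith(FENCE):
            in_fence = not in_fence
            blanks = 0
        elif stripped == "" and not in_fence:
            blanks += 1
            if blanks == 2:
                return i + 1
        else:
            # Non-blank line, or any line inside a fence.
            blanks = 0
    return len(lines)
```

Because an indented code block has no closing marker to look for, the
same trick is not available there, which matches the distinction drawn
above.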

>" Changing the bullet or ordered list delimiter starts a new list." As it
>should. Also like the start number being set on ordered lists.
>
>" A backslash at the end of the line is a hard line break." Brilliant! I see
>you preserved the 'two spaces' rule. You changed some other things (like
>list nesting) enough that backward compatibility with existing docs
>shouldn't be a concern. Therefore, I'm not sure we need both.

Backwards compatibility was actually a big concern for me.  With list
nesting, it is impossible to be fully backwards compatible with every
existing implementation, because they are incompatible with each other.
But the rules I've given should make most normal-looking lists (that is,
lists that aren't TRYING to break things) work in a large range of
different implementations.

For that reason, I favor keeping the two-spaces line break, which is
also nice in documents that you want to look pretty in plain text (a
big goal of markdown).
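
Keeping both forms costs very little in a parser.  A minimal sketch in
Python, assuming the line is passed in without its trailing newline and
that another line of the same paragraph follows (is_hard_break is a
hypothetical name):

```python
def is_hard_break(line: str) -> bool:
    """True if `line` ends with either hard-line-break marker.

    Recognizes a trailing backslash or two or more trailing spaces.
    Simplified: does not check for code spans or escaped backslashes.
    """
    return line.endswith("\\") or line.endswith("  ")
```

A line with a single trailing space is not a break under this check, so
stray whitespace at the end of a line stays harmless.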

>Every implementation should follow your strong/emphasis spec. All
>implementers go change your implementations now... oh wait, that means more
>work for me...
>
>If I understand you correctly, all autolinks must be surrounded by angle
>brackets (the right call btw). Perhaps you should include a URL **without**
>angle brackets in your list of "not autolinks" to make that clear.

Yes, good idea.
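
The distinction can be shown with a small Python sketch.  The pattern
below is a deliberate simplification (any scheme followed by a colon,
no whitespace or angle brackets inside) and is an assumption for the
example, not the wording of the spec; find_autolinks is a hypothetical
name.

```python
import re

# "<scheme:rest>" with no whitespace or angle brackets inside.
AUTOLINK = re.compile(r"<([A-Za-z][A-Za-z0-9+.-]*:[^\s<>]*)>")


def find_autolinks(text: str) -> list:
    """Return the URLs that appear inside angle brackets in `text`."""
    return AUTOLINK.findall(text)
```

For the text "see <http://example.com> and http://example.org", only
the first URL is found; the bare one is left as plain text.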

Best,
John

