Universal syntax for Markdown

David Chambers david.chambers.05 at gmail.com
Sat Aug 13 18:43:57 EDT 2011

Great post, John. It's full of interesting thoughts, so it may take some time to digest.

You mentioned Babelmark, which caught my attention. I was not aware of its existence, and the discovery excited me greatly. A service of this nature will be invaluable for those who undertake the challenge of documenting the differences that exist between the many implementations.

The idea that translations could be retrieved via API calls is appealing. It would distribute the significant burden of keeping the source of the various "Markdowns", and their respective dependencies, up to date.

Perhaps Babelmark could switch to making API calls to obtain translations as developers make endpoints available. What are your thoughts, Michel? I'd love to contribute to this effort if you consider it worthwhile.
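
To make the API idea a little more concrete, here is a rough sketch in Python of how Babelmark might fan a document out to per-implementation endpoints. Everything specific in it is an assumption on my part: the endpoint URLs, the `text` and `key` form fields, and the JSON reply shape (`html`, `version`) are all hypothetical, since as far as I know no implementation exposes such a service yet.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoints, for illustration only.
ENDPOINTS = {
    "pandoc": "http://example.org/pandoc/convert",
    "php-markdown-extra": "http://example.com/pme/convert",
}


def http_fetch(url, text, api_key):
    """POST the Markdown source and decode a JSON reply.

    Assumed reply shape: {"html": "...", "version": "..."}.
    """
    data = urllib.parse.urlencode({"text": text, "key": api_key}).encode("utf-8")
    with urllib.request.urlopen(url, data=data, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def collect(endpoints, text, api_key, fetch=http_fetch):
    """Query every endpoint and gather the replies by implementation name.

    A failing endpoint is recorded as {"error": ...} so that one broken
    service does not abort the whole comparison.
    """
    results = {}
    for name, url in sorted(endpoints.items()):
        try:
            results[name] = fetch(url, text, api_key)
        except Exception as exc:
            results[name] = {"error": str(exc)}
    return results
```

The `fetch` parameter is there so the comparison logic can be exercised with a stub instead of real HTTP calls. A call such as `collect(ENDPOINTS, "*hello*", "secret")` would return one entry per implementation, ready to be displayed side by side.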


On Aug 13, 2011, at 3:00 PM, John MacFarlane wrote:

> I'll chime in too. In developing pandoc, lunamark, and peg-markdown, I've

> thought a lot about markdown extensions, and also about how to resolve some of

> the ambiguities in the markdown syntax description.


> I agree that it would be good if the various implementations could

> converge as much as possible on these things. In some cases, they have

> converged: I think most implementations do footnotes the same way, for

> example, and as a result of a discussion on this list, PHP markdown extra

> and pandoc have the same syntax for fenced code blocks. In other cases, they

> haven't converged, even after discussion. Some of the divergences are pretty

> basic -- e.g. whether nested lists have to be indented by 4 spaces.


> Of course, going forward, implementors have to worry about backwards

> compatibility. I don't want to make changes that are going to break

> old pandoc documents. On the other hand, there's no reason why an

> implementation has to accept just one syntax for (say) definition

> lists. So if we could agree on a standard syntax that diverged from

> pandoc's old syntax, I might be able to modify pandoc to accept them

> both.


> I can think of three tools that would help a lot in discussions about

> extensions and edge cases:


> * An updated version of [babelmark](http://babelmark.bobtfish.net/),

> which allows you to compare the output of many implementations on the same

> input. This was really a useful tool in its time -- any chance it could be

> updated? The version of pandoc there, for example, is about three years

> old. [I know it's a burden to keep versions of so many implementations

> up to date. Perhaps each maintainer could set up a web app that returns

> output and metadata (version number, author, website) for a single

> implementation. There could be an API key so that only the main babelmark

> site could use these apps. babelmark could then just send out a bunch of

> HTTP queries and display the output. It wouldn't even need its own server.]


> * A wiki with a page for each syntax feature or extension, allowing

> comparison of different versions and discussion of pros and cons,

> with links to user's guides etc.


> * A test suite articulated into many very small tests, each testing

> something very specific. We could try to separate these into "agreed" and

> "disputed." There are many tests that extend the standard Markdown

> test suite that would be agreed on by virtually everyone.

> Michel Fortin's MDTest is a major step in this direction.

> I have a nice little test-runner that allows each test to be in

> a single file, with input and expected output. This makes it easy

> to add little tests.
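
I haven't seen the test-runner mentioned here, so the following is only a guess at what a one-file-per-test setup could look like. The `---expected---` separator, the `.test` extension, and the function names are my own inventions, and `convert` stands in for whichever implementation is being tested.

```python
import os

# Hypothetical delimiter between the Markdown input and the expected output.
SEPARATOR = "\n---expected---\n"


def parse_test(source):
    """Split one test file into (markdown_input, expected_output)."""
    markdown, sep, expected = source.partition(SEPARATOR)
    if not sep:
        raise ValueError("test file has no '---expected---' separator")
    return markdown, expected


def run_test(source, convert):
    """Return True if convert(input) matches the expected output.

    Leading and trailing whitespace is ignored on both sides.
    """
    markdown, expected = parse_test(source)
    return convert(markdown).strip() == expected.strip()


def run_directory(directory, convert):
    """Run every *.test file in a directory; return {filename: passed}."""
    results = {}
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".test"):
            continue
        path = os.path.join(directory, name)
        with open(path, encoding="utf-8") as handle:
            results[name] = run_test(handle.read(), convert)
    return results
```

With this layout, adding a test means dropping one small file into the directory, and sorting tests into "agreed" and "disputed" could be as simple as keeping two directories.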


> A few more thoughts:


> 1. Many people have mentioned "rule #1" - markdown should look natural and

> readable just by itself. I strongly agree. In my own tinkerings, I've also

> insisted on another principle, which Fletcher also articulated, but which I

> think would not be accepted by everyone on this list:


> Format-independence: Markdown is not just for writing HTML.


> I've seen people on this list say, "why do you need extension X, when you

> can just include raw HTML?" To which I reply: "Because I want to be

> able to convert my document to LaTeX, where the raw HTML won't do much good."

> It's true that John Gruber presented markdown primarily as a readable shortcut

> to HTML. But I don't see why we should keep thinking about it this way, when

> tools like pandoc and multimarkdown can easily convert markdown to a variety

> of formats. Indeed, one of the main reasons I write in markdown whenever I

> can is that I'm not tied to a single output format. I can have a canonical

> document that can be converted reliably to just about any text format.


> 2. We really need to clarify the rules for indented lists. As I've

> argued before on this list, the markdown documentation at least strongly

> implies that sublists need to be indented by four spaces, but many

> implementations (including Markdown.pl) don't insist on this.


> 3. I think most people agree that changing from ordered to bulleted

> list markers should start a new list (discussed earlier on this mailing

> list).


> 4. I think the opening number of an ordered list should be significant.


> 5. My own preference would be to require a blank line before a heading

> or blockquote, to avoid unexpected results.


> 5. Tables -- here there's a significant divergence between pandoc, PHP

> markdown extra, and multimarkdown. A limitation of pandoc's tables is

> that they require a monospaced font, since they rely on column

> alignment. The advantage is that they look exactly like tables. In

> addition, they allow table cells that contain whole paragraphs, and

> even arbitrary block-level content -- whereas, if I understand the

> documentation correctly, PHP markdown extra only allows simple tables

> with one-line cells. The philosophical differences here may be too

> deep for convergence.


> 6. Metadata -- multimarkdown's system is simple, flexible, and

> readable. One reservation I have about it is that it is

> English-centric -- nobody wants to write 'Title' at the beginning

> of a Swedish document -- but that could be solved by localization.

> It also seems a bit pedantic to have to say 'Title' if that's all you have.

> Pandoc's system is convenient and doesn't use English keywords, but it's not

> flexible enough, and I've been thinking about alternatives.


> 7. Image/link attributes -- the difficulty here is respecting

> format-independence. Saying that an image is 200px is not going

> to be helpful if you're targeting both HTML and LaTeX.


> 8. Citations -- I think multimarkdown's citation system is a step

> in the right direction, but too unambitious to make part of a standard.

> We put a lot of thought into a good markdown citation format on

> pandoc-discuss, and came up with this:

> http://johnmacfarlane.net/pandoc/README#citations

> This gives you automatic bibliographies and citations, with configurable

> styles -- you can even move between footnote styles and parenthesized

> inline references -- and still looks pretty natural.


> 9. Definition lists -- Pandoc is pretty similar to PHP Markdown Extra,

> but only supports one term per definition. HTML definition lists support

> multiple terms, but this doesn't make sense in many other output

> formats, and I don't think it's necessary.


> 10. Nesting/precedence -- this is probably less of a concern in

> practice, but there seems to be no standard for parsing nested

> inline elements. For example, consider the input

> '[hi `there] friend`](/url)'. Markdown.pl parses this as a link,

> and discount doesn't. I don't see anything in the Markdown syntax

> description that resolves the ambiguities here. Similarly for

> nested emph and strong -- Michel Fortin's MDTest suite contains

> some opinionated tests for these, but I'm not sure what the principle

> behind them is.
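
The ambiguity in that example can be shown with a toy scanner. This is only an illustration of two possible precedence orders, not a description of how Markdown.pl or discount actually work; the regular expressions are deliberately simplified and the function names are made up.

```python
import re

# Simplified patterns: an inline link with no nested brackets, and a
# single-backtick code span.
LINK = re.compile(r"\[([^\[\]]*)\]\(([^()\s]*)\)")
CODE = re.compile(r"`([^`]*)`")


def links_code_first(text):
    """Hide code spans behind placeholders, then look for links."""
    spans = []

    def stash(match):
        spans.append(match.group(0))
        return "\x00%d\x00" % (len(spans) - 1)

    protected = CODE.sub(stash, text)

    def restore(fragment):
        return re.sub(r"\x00(\d+)\x00",
                      lambda m: spans[int(m.group(1))], fragment)

    return [(restore(m.group(1)), m.group(2))
            for m in LINK.finditer(protected)]


def links_brackets_first(text):
    """Look for links on the raw text, treating backticks as ordinary."""
    return [(m.group(1), m.group(2)) for m in LINK.finditer(text)]


sample = "[hi `there] friend`](/url)"
print(links_code_first(sample))      # [('hi `there] friend`', '/url')]
print(links_brackets_first(sample))  # []
```

If code spans bind more tightly, the `]` inside the backticks is hidden and the whole thing is a link; if brackets are matched first, the first `]` closes the bracket early and no link is found. A standard would need to say which order wins.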


> John



> +++ Fletcher T. Penney [Aug 10 11 21:40 ]:

>> A few caveats:


>> 1) I am responding, at least in part, since I (or at least my software) was mentioned


>> 2) I've had a very nice meal, a nice relaxing evening in the mountains on vacation, and a few glasses of wine


>> 3) I can only speak for myself, not the authors of other Markdown derivatives/forks



>> I agree with some other points that have been made by others --- Gruber seems to be quite content with the current feature set and performance of Markdown, and not inclined to pursue it further. If he's happy, then I don't see any need for him to put further effort into development.


>> After being introduced to Markdown, it took me about 2 seconds to realize the beauty and elegance that it offered. It took me a little bit longer, but not that long, to realize that it had not been taken as far as it could go. To my knowledge, I was the first person to apply the idea of the Markdown syntax to an output format other than HTML. I then also tried to tie together the improvements made by Michel Fortin in terms of syntax additions. For me, MultiMarkdown offered the ultimate blend of syntax features and output format flexibility.


>> This is not the first time that the call has gone out for "one Markdown variant to rule them all" to be developed. I've even written, and then deleted, such a call myself. IMHO, the fatal flaw is that those of us capable and inclined to create a derivative of Markdown to scratch our own itch are happy with the variant we have created. We don't see a problem. We added what we needed, and we're content.


>> In the final analysis, it doesn't matter to me if the other authors of Markdown variants follow my syntax or not. They have their own goals, needs, and opinions that don't necessarily match mine. If you think that Markdown works best for you - great, stick with it. If MultiMarkdown offers features that you find useful, use it. If something else is better, by all means go with it.


>> That said, I am perfectly willing to tweak the syntax of MMD to mesh with some consensus if it were to exist. But, there is a limit to the features I would be interested in incorporating. I've been asked to include many syntax additions that I have said no to, because I thought they would end up detracting, rather than contributing to, the overall success of MMD. Some may agree with what I've done. Many others will disagree. That's fine.


>> Where I do think consensus would be helpful is in the features that are *almost* identical across implementations. Early on, I made changes to my footnote syntax to match what others were doing. There is value in such changes to improve compatibility across implementations. That said, I don't want to edge towards the "everything but the kitchen sink" mentality that plagues Word, for example. Gruber has made it pretty clear in the past that he is not a big fan of the syntax additions that I have made for MMD (though, strangely he seems supportive of PHP Markdown Extra.... ;)



>> My proposal, then, is to develop a "standards body" to create a core set of syntax additions, edge case resolution, and definitive test files to define "Markdown 2.0". Obviously, it would need a different name, but I am too lazy to think of one right now. My personal opinion is that this new standard would include fewer, rather than more, extensions to the core Markdown standard. I think it should be defined in a fairly rigorous manner, to avoid some of the ambiguity present in the canonical Markdown.pl (I think John MacFarlane's peg-markdown work was pretty good in this regard). I think some of the core features would include:


>> * metadata

>> * footnotes

>> * tables

>> * complete test cases/tools


>> secondary features could include:


>> * citations

>> * definition lists

>> * automatic cross-references/labels

>> * math extension

>> * image/link attributes



>> All this said, however, I think an important consideration for this discussion is:



>> What benefit do the authors of current Markdown variants gain from the effort required to agree on a standard?



>> Being realistic, I'm pretty busy with my day job. I'm even busier once I throw in maintaining MMD and trying to release a new application. I've put in countless hours on a project that has in total provided me with the equivalent of a weekend or two working at my day job in donations, thanks to the generosity of those who have themselves saved countless hours of their own time. Clearly I'm not doing this for the money. My guess is that other Markdown authors aren't doing it for the money either.


>> I think we all do it because we care. We see the beauty and utility in this approach to writing, whether it be for the web (Markdown) or other document formats (MultiMarkdown). For progress to be made on an official "next version" of Markdown, it's going to take a cause that offers some benefit to those of us who have worked so hard during the past few years to contribute our own changes and additions.



>> Again - my own $.02, and may not even be worth that much....


>> F-



>> --

>> Fletcher T. Penney

>> fletcher at fletcherpenney.net





>> _______________________________________________

>> Markdown-Discuss mailing list

>> Markdown-Discuss at six.pairlist.net

>> http://six.pairlist.net/mailman/listinfo/markdown-discuss

