Universal syntax for Markdown

Seumas Mac Uilleachan seumas at idirect.ca
Sat Aug 13 22:20:53 EDT 2011


My $0.02

On 08/13/2011 06:00 PM, John MacFarlane wrote:

> I'll chime in too. In developing pandoc, lunamark, and peg-markdown, I've

> thought a lot about markdown extensions, and also about how to resolve some of

> the ambiguities in the markdown syntax description.

>

> I agree that it would be good if the various implementations could

> converge as much as possible on these things. In some cases, they have

> converged: I think most implementations do footnotes the same way, for

> example, and as a result of a discussion on this list, PHP markdown extra

> and pandoc have the same syntax for fenced code blocks. In other cases, they

> haven't converged, even after discussion. Some of the divergences are pretty

> basic -- e.g. whether nested lists have to be indented by 4 spaces.

>

> Of course, going forward, implementors have to worry about backwards

> compatibility. I don't want to make changes that are going to break

> old pandoc documents. On the other hand, there's no reason why an

> implementation has to accept just one syntax for (say) definition

> lists. So if we could agree on a standard syntax that diverged from

> pandoc's old syntax, I might be able to modify pandoc to accept them

> both.

>

> I can think of two tools that would help a lot in discussions about

> extensions and edge cases:

>

> * An updated version of [babelmark](http://babelmark.bobtfish.net/),

> which allows you to compare the output of many implementations on the same

> input. This was really a useful tool in its time -- any chance it could be

> updated? The version of pandoc there, for example, is about three years

> old. [I know it's a burden to keep versions of so many implementations

> up to date. Perhaps each maintainer could set up a web app that returns

> output and metadata (version number, author, website) for a single

> implementation. There could be an API key so that only the main babelmark

> site could use these apps. babelmark could then just send out a bunch of

> HTTP queries and display the output. It wouldn't even need its own server.]

>

> * A wiki with a page for each syntax feature or extension, allowing

> comparison of different versions and discussion of pros and cons,

> with links to user's guides etc.

>

> * A test suite articulated into many very small tests, each testing

> something very specific. We could try to separate these into "agreed" and

> "disputed." There are many tests that extend the standard Markdown

> test suite that would be agreed on by virtually everyone.

> Michel Fortin's MDTest is a major step in this direction.

> I have a nice little test-runner that allows each test to be in

> a single file, with input and expected output. This makes it easy

> to add little tests.

>

> A few more thoughts:

>

> 1. Many people have mentioned "rule #1" - markdown should look natural and

> readable just by itself. I strongly agree. In my own tinkerings, I've also

> insisted on another principle, which Fletcher also articulated, but which I

> think would not be accepted by everyone on this list:

Ditto, this should be the primary rule for Markdown.

>

> Format-independence: Markdown is not just for writing HTML.

>

> I've seen people on this list say, "why do you need extension X, when you

> can just include raw HTML?" To which I reply: "Because I want to be

> able to convert my document to LaTeX, where the raw HTML won't do much good."

> It's true that John Gruber presented markdown primarily as a readable shortcut

> to HTML. But I don't see why we should keep thinking about it this way, when

> tools like pandoc and multimarkdown can easily convert markdown to a variety

> of formats. Indeed, one of the main reasons I write in markdown whenever I

> can is that I'm not tied to a single output format. I can have a canonical

> document that can be converted reliably to just about any text format.

Markdown is for adding implied formatting to text-based documents.
Whatever the end-result conversion is does not matter. That is the
problem of the conversion program, not the text writer. When writing a
markdown-formatted text file I am not primarily worried about what the
final output format is; I am worried about "this is a header", "this is
emphasized", "this is a list", etc. How it gets from A to B (or A to B1,
B2, etc.) should not concern me as a user. Still, the primary focus needs
to be that I can read it as text and still get what formatting is
implied, even if I don't really know Markdown. That of course is the
hard part of all this. Everyone has their own interpretation of the
markup. Then again, if I am converting to LaTeX and using HTML markup, I
would expect the program to be able to convert the HTML as well as the
markdown markup, or at least have a 2-step conversion process (markdown
-> LaTeX, HTML -> LaTeX). Since it is always implied that HTML is a
given in markdown-formatted documents, that should be expected.

>

> 2. We really need to clarify the rules for indented lists. As I've

> argued before on this list, the markdown documentation at least strongly

> implies that sublists need to be indented by four spaces, but many

> implementations (including Markdown.pl) don't insist on this.

I for one always indent by 4 spaces, but it seems to be a given that as
long as the indent is more than the previous level's, you move to the
next level.
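
To make that "more indent means next level" reading concrete, here is a
minimal sketch in Python. It is only an illustration of the rule as I
described it, not how Markdown.pl or any other implementation actually
parses lists. The function name `nesting_levels` is made up for the
example, and it assumes every input line is a list item indented with
spaces.

```python
def nesting_levels(lines):
    """Return a nesting level (0 = top level) for each list-item line.

    Assumed rule: a line indented further than the current level opens
    a new, deeper level; a line indented less closes levels until its
    indent fits again. No fixed 4-space step is required.
    """
    stack = []   # indent widths of the currently open list levels
    levels = []
    for line in lines:
        indent = len(line) - len(line.lstrip(" "))
        # Close every open level that is indented deeper than this line.
        while stack and indent < stack[-1]:
            stack.pop()
        # Open a new level if this line is indented further than the
        # innermost level that is still open.
        if not stack or indent > stack[-1]:
            stack.append(indent)
        levels.append(len(stack) - 1)
    return levels


if __name__ == "__main__":
    sample = ["* a", "  * b", "      * c", "  * d", "* e"]
    print(nesting_levels(sample))
```

Under this rule a 2-space and a 6-space indent both open a new level,
which is exactly where it parts ways with a strict "four spaces per
level" reading of the documentation.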

>

> 3. I think most people agree that changing from ordered to bulleted

> list markers should start a new list (discussed earlier on this mailing

> list).

>

> 4. I think the opening number of an ordered list should be significant.

This I agree with: the initial number should be significant.

>

> 5. My own preference would be to require a blank line before a heading

> or blockquote, to avoid unexpected results.

>

> 5. Tables -- here there's a significant divergence between pandoc, PHP

> markdown extra, and multimarkdown. A limitation of pandoc's tables is

> that they require a monospaced font, since they rely on column

> alignment. The advantage is that they look exactly like tables. In

> addition, they allow table cells that contain whole paragraphs, and

> even arbitrary block-level content -- whereas, if I understand the

> documentation correctly, PHP markdown extra only allows simple tables

> with one-line cells. The philosophical differences here may be too

> deep for convergence.

Personally I hate monospaced fonts, so anything that relies on them for
table alignment is not going to please me. Monospaced fonts are just too
tiring on the eyes for me to read. My tables would be completely out of
whack in my documents as text. I also tend to think that anything beyond
simple tables should not be a concern for markdown and should be left to
HTML (or whatever).

>

> 6. Metadata -- multimarkdown's system is simple, flexible, and

> readable. One reservation I have about it is that it is

> English-centric -- nobody wants to write 'Title' at the beginning

> of a Swedish document -- but that could be solved by localization.

> It also seems a bit pedantic to have to say 'Title' if that's all you have.

> Pandoc's system is convenient and doesn't use English keywords, but it's not

> flexible enough, and I've been thinking about alternatives.

What about a file that defines the metadata keys and is read by the
converter program? i.e. "Title" -> Title or whatever
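
A rough sketch of what I mean, in Python. Everything here is
hypothetical: the "local key -> canonical key" mapping format, the
names `load_key_map` and `read_metadata`, and the Swedish sample keys
are made up for illustration. It assumes "Key: value" metadata lines at
the top of the document, ended by a blank line.

```python
# Hypothetical contents of a mapping file: "local key -> canonical key".
KEY_MAP_TEXT = """\
Titel -> title
Datum -> date
"""


def load_key_map(text):
    """Parse 'local -> canonical' lines into a lowercase lookup dict."""
    mapping = {}
    for line in text.splitlines():
        if "->" not in line:
            continue  # skip blank or malformed lines
        local, canonical = line.split("->", 1)
        mapping[local.strip().lower()] = canonical.strip().lower()
    return mapping


def read_metadata(doc, key_map):
    """Read leading 'Key: value' lines and translate keys via key_map.

    Keys not found in key_map are kept as written (lowercased).
    """
    meta = {}
    for line in doc.splitlines():
        if not line.strip():
            break  # a blank line ends the metadata block
        key, sep, value = line.partition(":")
        if not sep:
            break  # not a 'Key: value' line, so the metadata is over
        name = key.strip().lower()
        meta[key_map.get(name, name)] = value.strip()
    return meta


if __name__ == "__main__":
    document = "Titel: Min bok\nDatum: 2011-08-13\n\nBody text here."
    print(read_metadata(document, load_key_map(KEY_MAP_TEXT)))
```

The writer keeps keys in their own language and only the converter
needs to know the canonical names.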

Of course, when markdown was originally conceived the whole concept of
metadata was considered extraneous to its requirements.

>

> 7. Image/link attributes -- the difficulty here is respecting

> format-independence. Saying that an image is 200px is not going

> to be helpful if you're targeting both HTML and LaTeX.

>

> 8. Citations -- I think multimarkdown's citation system is a step

> in the right direction, but too unambitious to make part of a standard.

> We put a lot of thought into a good markdown citation format on

> pandoc-discuss, and came up with this:

> http://johnmacfarlane.net/pandoc/README#citations

> This gives you automatic bibliographies and citations, with configurable

> styles -- you can even move between footnote styles and parenthesized

> inline references -- and still looks pretty natural.

>

> 9. Definition lists -- Pandoc is pretty similar to PHP Markdown Extra,

> but only supports one term per definition. HTML definition lists support

> multiple terms, but this doesn't make sense in many other output

> formats, and I don't think it's necessary.

>

> 10. Nesting/precedence -- this is probably less of a concern in

> practice, but there seems to be no standard for parsing nested

> inline elements. For example, consider the input

> '[hi `there] friend`](/url)'. Markdown.pl parses this as a link,

> and discount doesn't. I don't see anything in the Markdown syntax

> description that resolves the ambiguities here. Similarly for

> nested emph and strong -- Michel Fortin's MDTest suite contains

> some opinionated tests for these, but I'm not sure what the principle

> behind them is.

>

> John

A lot of this is dependent on what you expect markdown to be for you. If
you are using it for blogs or other online comments then the basic
markdown is probably more than sufficient. If you are creating a
text-based documentation system (which is what I use it for: a personal
information repository) then markdown with certain extensions (like
tables and/or definitions) added is probably more than enough for you.
If you are creating documents for publishing or dissertations etc. then
you need the above requirements and more, and you are possibly moving
beyond the expectations of what markdown was created for and/or
intended. While I agree that markdown is the most elegant and easiest
markup system to just go ahead and write with, it is not meant to do
anything and everything, and I for one do not expect it to. If I was
writing something that needed citations and/or tables of contents and/or
indices and/or footnotes and/or etc. etc. I would not be using markdown.
There are other options that are far better suited for that (e.g. DocBook
or reST). I would most likely choose reST over DocBook.

As a user of markdown I am quite happy with it as the basic (Gruber)
implementation plus a couple of extensions available through the version
I use (markdown.pl). I am not writing extensive documentation or the
next "Great American Novel" or a post-grad thesis. I like markdown for
the simplicity it offers; I simply have not found any other markup
system as easy to use and read/understand (and believe me, I am not so
enamoured of markdown that I would not change in a heartbeat if one was
found). I rather suspect that if a major overhaul of markdown was
attempted we would see the userbase split in two: those who like
markdown as is versus those who want to see it expand into a
DocBook-like publishing system. I think you can guess which camp I would
be in.

>

>

> +++ Fletcher T. Penney [Aug 10 11 21:40 ]:

>> A few caveats:

>>

>> 1) I am responding, at least in part, since I (or at least my software) was mentioned

>>

>> 2) I've had a very nice meal, a nice relaxing evening in the mountains on vacation, and a few glasses of wine

>>

>> 3) I can only speak for myself, not the authors of other Markdown derivatives/forks

>>

>>

>> I agree with some other points that have been made by others --- Gruber seems to be quite content with the current feature set and performance of Markdown, and not inclined to pursue it further. If he's happy, then I don't see any need for him to put further effort into development.

>>

>> After being introduced to Markdown, it took me about 2 seconds to realize the beauty and elegance that it offered. It took me a little bit longer, but not that long, to realize that it had not been taken as far as it could go. To my knowledge, I was the first person to apply the idea of the Markdown syntax to an output format other than HTML. I then also tried to tie together the improvements made by Michel Fortin in terms of syntax additions. For me, MultiMarkdown offered the ultimate blend of syntax features and output format flexibility.

>>

>> This is not the first time that the call has gone out for "one Markdown variant to rule them all" to be developed. I've even written, and then deleted, such a call myself. IMHO, the fatal flaw is that those of us capable and inclined to create a derivative of Markdown to scratch our own itch are happy with the variant we have created. We don't see a problem. We added what we needed, and we're content.

>>

>> In the final analysis, it doesn't matter to me if the other authors of Markdown variants follow my syntax or not. They have their own goals, needs, and opinions that don't necessarily match mine. If you think that Markdown works best for you - great, stick with it. If MultiMarkdown offers features that you find useful, use it. If something else is better, by all means go with it.

>>

>> That said, I am perfectly willing to tweak the syntax of MMD to mesh with some consensus if it were to exist. But, there is a limit to the features I would be interested in incorporating. I've been asked to include many syntax additions that I have said no to, because I thought they would end up detracting, rather than contributing to, the overall success of MMD. Some may agree with what I've done. Many others will disagree. That's fine.

>>

>> Where I do think consensus would be helpful is in the features that are *almost* identical across implementations. Early on, I made changes to my footnote syntax to match what others were doing. There is value in such changes to improve compatibility across implementations. That said, I don't want to edge towards the "everything but the kitchen sink" mentality that plagues Word, for example. Gruber has made it pretty clear in the past that he is not a big fan of the syntax additions that I have made for MMD (though, strangely he seems supportive of PHP Markdown Extra.... ;)

>>

>>

>> My proposal, then, is to develop a "standards body" to create a core set of syntax additions, edge case resolution, and definitive test files to define "Markdown 2.0". Obviously, it would need a different name, but I am too lazy to think of one right now. My personal opinion is that this new standard would include fewer, rather than more, extensions to the core Markdown standard. I think it should be defined in a fairly rigorous manner, to avoid some of the ambiguity present in the canonical Markdown.pl (I think John MacFarlane's peg-markdown work was pretty good in this regard). I think some of the core features would include:

>>

>> * metadata

>> * footnotes

>> * tables

>> * complete test cases/tools

>>

>> secondary features could include:

>>

>> * citations

>> * definition lists

>> * automatic cross-references/labels

>> * math extension

>> * image/link attributes

>>

>>

>> All this said, however, I think an important consideration for this discussion is:

>>

>>

>> What benefit do the authors of current Markdown variants gain from the effort required to agree on a standard?

>>

>>

>> Being realistic, I'm pretty busy with my day job. I'm even busier throwing in maintaining MMD and now trying to release a new application. I've put in countless hours on a project that has in total provided me with the equivalent of a weekend or two working at my day job in donations from the generosity from those who have themselves saved countless hours of their own time. Clearly I'm not doing this for the money. My guess is that other Markdown authors aren't doing it for the money either.

>>

>> I think we all do it because we care. We see the beauty and utility in this approach to writing, whether it be for the web (Markdown) or other document formats (MultiMarkdown). For progress to be made on an official "next version" of Markdown, it's going to take a cause that offers some benefit to those of us who have worked so hard during the past few years to contribute our own changes and additions.

>>

>>

>> Again - my own $.02, and may not even be worth that much....

>>

>> F-

>>

>>

>> --

>> Fletcher T. Penney

>> fletcher at fletcherpenney.net

>>

>>

>>

>>

>> _______________________________________________

>> Markdown-Discuss mailing list

>> Markdown-Discuss at six.pairlist.net

>> http://six.pairlist.net/mailman/listinfo/markdown-discuss




