Metadata syntax (was Universal syntax for Markdown)
jgm at berkeley.edu
Tue Sep 20 11:30:26 EDT 2011
+++ Tao Klerks [Sep 20 11 10:34 ]:
> On Tue, Sep 20, 2011 at 9:56 AM, John MacFarlane <jgm at berkeley.edu> wrote:
> I think that the abstract is a fine case. Although one *could* do
> it the way you suggest, by having the metadata specify a section
> of the document to use as the abstract, I don't see the advantage of
> that. It is natural to distinguish between the body text, which is
> *always* part
> of the produced document, whether a fragment or a standalone
> document is being
> produced, and regardless of the format or template used, and the
> metadata, which sometimes appear in the produced document, depending on one's
> and which appear differently in different formats. Once you make this
> distinction, the abstract clearly falls on the side of the metadata.
> In that case, you're talking about metadata in the more general sense -
> like link definitions, footnotes, and other constructs that are
> currently treated as a special case in markdown. I'm all for having a
> special syntax for defining the abstract, as long as the author doesn't
> have to worry about any escaping conventions and can just write it like
> he/she would any other regular markdown content.
Yes, absolutely. There are two ways to approach this while keeping
'abstract' a metadata field:
(1) There could be a special syntax for designating metadata fields
as markdown (or alternatively markdown could be the default, and there
could be a special syntax for designating them plain strings).
I showed in my original post how lunamark implements this:
    abstract = m[[
      Here's the abstract. You can put anything you want
      here, including blank lines. No special escaping is
      needed. It can be flush left, but I've left a small
      indent because it looks nice.

      * item 1
      * item 2
    ]]

The 'm' indicates that the content is markdown. If you left it
out, you'd have a plain string.
(2) It could just be conventional that certain fields ('abstract',
'title', etc.) are interpreted as markdown.
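
A sketch of how option (1) could work if the metadata block is evaluated as ordinary Lua: a function called with a single long-string literal needs no parentheses, so `m` can be a plain function that tags its argument. This is only an illustration, not lunamark's actual code; the `markdown` and `text` field names and the `is_markdown` helper are hypothetical.

```lua
-- Hypothetical tagger: wraps a string so a later stage knows to parse it
-- as markdown instead of treating it as a plain string.
local function m(s)
  return { markdown = true, text = s }
end

-- Hypothetical helper: true only for values produced by m.
local function is_markdown(v)
  return type(v) == "table" and v.markdown == true
end

local meta = {
  title = "A plain string",
  -- m[[ ... ]] is the same call as m("...") with a long-string argument.
  abstract = m[[
  Here's the abstract.

  * item 1
  * item 2
]],
}

print(is_markdown(meta.abstract))  -- true
print(is_markdown(meta.title))     -- false
```

Under option (2) the tagger would be unnecessary: the consumer would decide from the field name alone whether to parse the value as markdown.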
> Other cases:
> * bibliographic data for the document itself, which you might want
> to print in some presentations but not others
> * revision history
> * tags
> * bibliography entries used in the document
> * settings for things like default stylesheets
> Point taken, most of these are good cases for supporting structured
> content, but not formattable/markdown content, right?
Right in most cases, but one might want a free-form revision history
that is just markdown, and bibliographic entries might include
> Currently you need to specify the bibliography database on the
> command line as well (it can be bibtex, endnote, or any number
> of other formats). Ideally, though, the document itself should
> specify where its bibliographical entries are coming from.
> This could just be a file path, but if you want the document to
> be truly portable, it would be nice to be able to include the
> bibliography entries themselves in metadata at the end of the
> This could be done easily with a data description language as
> powerful as lua/yaml/json.
> Absolutely - but the (possibly unattainable) ideal would be a situation
> where tools and experts can specify complex structured metadata, and
> regular joe can change his title, author, and other basic/simple values
> and lists, specifying values that contain apostrophes, commas and other
> natural punctuation, without blowing anything up in the process. As soon
> as he needs to specify/modify something that contains structure (or
> even something multi-line?) it seems fair that he should have to use a
> tool or do some research on the standard (esp. as most if not all of
> the structured-data use cases relate to tools already).
> My concern with a pure-lua/yaml/json metadata format is that it
> requires specialized knowledge (not related to the existing markdown
> standards/experience) on the part of the user for even the most trivial
> changes to the simplest fields - *especially* if structured/markdown
> content such as the abstract is placed in a metadata field!
I understand the concern. YAML is particularly bad this way, because you
get used to not quoting or escaping things, but then your document blows up
when you have a colon in the field. I think lua is a nice compromise--more
regular and predictable, but you don't have to quote the fields as in json,
and you have a really nice multiline string syntax that eliminates the need
for escaping.[^1] But my lua-based proposal is compatible with also having a
simpler way of specifying title, author, and date -- e.g. pandoc's, or Michael
Thompson's proposal involving centering, or MMD's (though I think the Hamlet
problem is serious).
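
To make the quoting point concrete, here is a small sketch of basic fields written as a Lua table. Keys are bare identifiers and values are always quoted strings, so colons, commas, and apostrophes inside a value need no special care. The field names and values are made up for illustration and are not part of any proposal in this thread.

```lua
-- Illustrative metadata table; every name and value here is hypothetical.
local meta = {
  title  = "Metadata syntax: a proposal",  -- a colon inside a value is just text
  author = "O'Brien, Pat",                  -- so are apostrophes and commas
  tags   = { "markdown", "metadata" },      -- a simple list
}

print(meta.title)
print(meta.author)
print(#meta.tags)
```

The cost relative to an unquoted format is that every string value must carry its quotes, which is the predictability trade-off described above.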
[^1]: What if your abstract contains `]]`, you might ask?
Well, then you just need to use another delimiter for the multiline
string, such as `[=[` and `]=]`.
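
A short sketch of that delimiter trick in plain Lua (variable names are arbitrary): a long string opened with `[=[` ends only at `]=]`, so a literal `]]` inside it is ordinary text, and backslashes are never treated as escapes.

```lua
-- A level-1 long string: opened by [=[ and closed only by ]=].
local abstract = [=[
An abstract that mentions ]] without ending the string.
]=]

-- Plain-text search (fourth argument true) for the two characters "]]".
print(abstract:find("]]", 1, true) ~= nil)  -- true

-- Long strings do no escape processing: this is a, backslash, n, b.
local raw = [[a\nb]]
print(#raw)  -- 4
```

If the content happened to contain `]=]` as well, a higher level such as `[==[` and `]==]` would work the same way.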