spaces and newlines before list markers (was: evolving the spec)

Thomas Nichols nichols7 at googlemail.com
Fri Feb 29 17:46:58 EST 2008


Joseph Lorenzo Hall wrote on 2008/02/29 17:14:

> As a slightly-OT aside, there's another view on this "spaces before a
> list item" issue that sees it as a bug.
>
> When I write a list of references in a academic paper, I do so with
> list items. I do a hanging indent where the rest of the reference is
> indented by two or three spaces, like so:
>
> * Aslam, J. A., Popa, R. A., & Rivest, R. L. (2007). On estimating the
>   size and confidence of a statistical audit, USENIX/ACCURATE
>   Electronic Voting Technology Workshop 2007. Retrieved February 24,
>   2008. from
>   <http://www.usenix.org/events/evt07/tech/full_papers/aslam/aslam.pdf>.
>
> Markdown sees that " 2008." as a list item.
>


Ok: that's what Markdown.pl sees. Do any of _us_ see it like that?

For a human, I think this is easy. Even removing the comma from the
preceding line, I think there's enough 'ASCII layout' information here
for an untrained reader to tell from a casual glance that '2008' is not
intended to be a list item. So do we need a full-on AI engine to be able
to build a parser to handle this? Or can we think up some simple
algorithm - there must be a change of indent for a sub-bullet, perhaps,
or a new bullet must be at the same level as the preceding one?
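
To make that concrete, here is a rough sketch in Python of the kind of
rule I have in mind. Everything in it is an assumption for the sake of
illustration: the function name, the marker pattern, and the rule itself
(a marker only starts an item if it opens a block, sits at the level of
an earlier bullet, or changes indent relative to the line before it).
It is not what Markdown.pl does, and it only looks at a single list.

```python
import re

# Candidate marker: optional indent, then *, + or -, or digits plus '.',
# followed by at least one space or tab.
MARKER = re.compile(r"^( *)([*+-]|\d+\.)[ \t]+")

def find_list_items(lines):
    """Return the indices of lines this heuristic treats as new list items."""
    items = []
    levels = set()      # indents at which an item has already been opened
    prev_indent = None  # indent of the previous non-blank line (None at block start)
    for i, line in enumerate(lines):
        if not line.strip():
            prev_indent = None  # a blank line means the next line opens a block
            continue
        indent = len(line) - len(line.lstrip(" "))
        if MARKER.match(line):
            starts_block = prev_indent is None
            sibling = indent in levels
            indent_changed = prev_indent is not None and indent != prev_indent
            if starts_block or sibling or indent_changed:
                items.append(i)
                levels.add(indent)
        prev_indent = indent
    return items

reference = [
    "* Aslam, J. A., Popa, R. A., & Rivest, R. L. (2007). On estimating the",
    "  size and confidence of a statistical audit, USENIX/ACCURATE",
    "  Electronic Voting Technology Workshop 2007. Retrieved February 24,",
    "  2008. from",
    "  <http://www.usenix.org/events/evt07/tech/full_papers/aslam/aslam.pdf>.",
]
print(find_list_items(reference))  # [0]
```

With the hanging indent intact, "2008." sits at the same indent as the
line above it and at no earlier bullet's level, so it stays part of the
reference. The sketch has an obvious weak spot: if the wrapped lines are
not indented at all, "2008." lands at the bullet's own level and would
still be taken for a sibling item.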

If "2008" were indented by a space, then I think it **should** be a new
item, a sub-bullet. Maybe that's because I'm used to working with
suboptimal tools (web browsers for blog comments have already been
mentioned) which don't automatically indent the paragraph beneath the
bullet point as in this example. In fact, I often see a blank line
separating bullets instead.

----

* Inertial-Electrodynamic Fusion (IEF) Device - Energy/Matter
Conversion Corporation (EMC2). The fusion process recommended by Dr.
Bussard takes boron-11 and fuses a proton to it, producing, in its
excited state, a carbon-12 atom. This excited carbon-12 atom decays to
beryllium-8 and helium-4.

* Bussard's website, asking for donations to fund further research

* American Scientist article mentioning the founding of EMC2
----

Again, I think each of us understands these are bullets. I'm writing
this in Thunderbird, just typing '*' instead of pressing the 'bullet'
button, so I'm getting word-wrap but no automatic indentation (though
MTAs etc. may reformat the message later). They follow the typographic
approach of indenting the first line of the para; compare with the
example Joe quoted which outdents the bullet points and then indents the
whole paragraph beneath it. Can we consider these to be two equally
valid approaches?



On a personal note, what inspired me about Markdown was John Gruber's
[Dive into Markdown][] article. Possibly relevant here:
----
In fact, I love writing email. Email is my favorite writing medium. I’ve
sent over 16,000 emails in the last five years. The conventions of plain
text email allow me to express myself clearly and precisely, without
ever getting in my way.

Thus, Markdown. Email-style writing for the web.
...
The typographic constraints of plain text — a single typeface, in a
single size, with no true italics or bold — are very much similar to the
constraints of a typewriter.
----

That's what I'm after ... to be able to use "the conventions of plain
text email" when creating content; and conventions are often pretty
woolly, so creating a formal ruleset from them is probably going to be
tough. In this instance, though, I haven't yet understood why Markdown
should not continue to support the following from the [syntax page][]:

"List markers typically start at the left margin, but may be indented by
up to three spaces. List markers must be followed by one or more spaces
or a tab."
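
Read literally, that rule is small enough to fit in one regular
expression. The Python below is only my reading of the sentence, not
code taken from any implementation; the names LIST_MARKER and
has_list_marker are made up for the example, and the marker characters
(asterisk, plus, hyphen, or a number followed by a period) come from the
same syntax page.

```python
import re

# Up to three leading spaces, then a bullet (*, + or -) or a number
# followed by a period, then one or more spaces or a tab.
LIST_MARKER = re.compile(r"^ {0,3}(?:[*+-]|\d+\.)(?: +|\t)")

def has_list_marker(line):
    """True if the line begins with a list marker under the quoted rule."""
    return LIST_MARKER.match(line) is not None

print(has_list_marker("* item"))         # True
print(has_list_marker("   * item"))      # True  (three spaces is allowed)
print(has_list_marker("    * item"))     # False (four spaces is too many)
print(has_list_marker("*item"))          # False (no space after the marker)
print(has_list_marker("   2008. from"))  # True  (the wrapped line from Joe's example)
```

The last case is Joe's problem in miniature: taken one line at a time,
the rule cannot tell a wrapped "2008." from an ordered list item, so any
fix has to come from looking at the surrounding lines.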

As a (weak) analogy, SQL has both a laboriously detailed specification
and a surprisingly loose query syntax, allowing noise words and using
intelligent defaults to capture intent wherever possible. TIMTOWTDI,
especially when writing an email.

So -- any reasons why we should need to "tighten the spec"? Or can we
simply document it formally, with a grammar, test suite and so on to
make sure that the expected behaviour is always known, and ideally is
consistent with "email convention"?

-- Thomas.

[Dive into Markdown]: http://daringfireball.net/2004/03/dive_into_markdown
[syntax page]: http://daringfireball.net/projects/markdown/syntax#list






