Mysterious MD5ification under very specific circumstances

Wolfgang Faust wolfgangmcq at gmail.com
Fri Jun 28 13:17:51 EDT 2013


I happened across
http://www.cforcoding.com/2010/01/markdown-musings-on-unintended.html while
looking for information on why Markdown.pl was doing this. The comments are
rather interesting:

> Fred Blasdel said...
>
> The PHP Markdown changelog should give you at least a hundred bugs in
> Markdown.pl to test against -- he started with a straight transliteration
> (much like MarkdownSharp), and gradually made it less shitty.
>
> Gruber's design 'escapes' blocks by replacing them with their hashcodes,
> but if the original input contains the same hashcodes -- welcome to XSS city!
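
To make the mechanism in that comment concrete, here is a minimal Python sketch of hash-placeholder escaping. It is not the code from Markdown.pl; the function names `hash_blocks` and `unhash_blocks` are made up for illustration, and it only stashes HTML comments. It shows two things: the bare hash is what ends up in the output if the swap-back step never runs, and a hash typed literally into the input is swapped for the stashed block as well.

```python
import hashlib
import re


def hash_blocks(text, table):
    """Replace each HTML comment with the MD5 hex digest of its text.

    The original comment is remembered in `table` so it can be restored.
    (Hypothetical helper, only an illustration of the general idea.)
    """
    def stash(match):
        block = match.group(0)
        key = hashlib.md5(block.encode("utf-8")).hexdigest()
        table[key] = block
        return key

    return re.sub(r"<!--.*?-->", stash, text, flags=re.DOTALL)


def unhash_blocks(text, table):
    """Swap every known placeholder back for the block it stands for."""
    for key, block in table.items():
        text = text.replace(key, block)
    return text


comment = "<!-- secret -->"
key = hashlib.md5(comment.encode("utf-8")).hexdigest()

# The input contains the comment AND, further down, the literal hash text.
doc = comment + "\n\nSomeone typed this hash literally: " + key

table = {}
hashed = hash_blocks(doc, table)        # comment is now just its hash
restored = unhash_blocks(hashed, table)

# If the restore step is skipped or fails, `hashed` is what gets rendered.
print(hashed)
# After restoring, the literal hash in the input was replaced too.
print(restored)
```

In the restored text the comment appears twice, because the placeholder and the user-supplied hash are indistinguishable once the block has been stashed.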

Lesson learned: Just because Markdown.pl is the implementation listed on
the 'official' markdown page, and just because I can do
`sudo apt-get install markdown` and get it, does *not* mean that it's the
best implementation! I've switched to markdown_py (next one down on the list)
and everything is working fine now.

On Fri, Jun 28, 2013 at 12:44 PM, Fletcher Penney <
fletcher at fletcherpenney.net> wrote:


> By "markdown v.1.0.1" I'm guessing he meant Gruber's Perl Markdown.pl
> 1.0.1.
>
> As for whom to report the bug, Gruber's Markdown.pl is presumably not
> going to be updated further, and has not been updated in years (also known
> as an eternity in internet time... ;). Certainly someone on this list may
> have an interest in finding the bug and posting a fix, but you may be
> better off switching to a variant of Markdown that is still undergoing
> active development. As Waylan hinted, there are lots, and I'm sure
> everyone on the list has their favorites. Heck, many people on this list
> have written their own (myself included).....
>
> When choosing a variant, some things to consider:
>
> * What languages (if any) are you comfortable with if you want to change
> anything? If you're not changing anything, this may not matter.
>
> * Are you using Markdown in a larger project where the choice of language
> will have a significant impact on ease of use?
>
> * How important is performance? There can be several orders of
> magnitude of difference between implementations.
>
> * Do you need extensions to the basic Markdown syntax (e.g. footnotes,
> tables, etc.)?
>

I appreciate the suggestions; frankly, I'm just looking for something that
will take basic markdown and HTMLify it so that I can make sure I got the
syntax right. I'm actually embedding python unit tests in a markdown
document via the doctest module, which is how I ran across this bug. I was
trying to comment out some initialization that didn't need to be shown in
the documentation, and suddenly my document went all funny.
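
For readers unfamiliar with that setup, the following is a minimal sketch of running doctest examples that live inside a markdown document. The document text and the helper name `run_embedded_doctests` are invented for illustration and are not the actual project; only the standard-library `doctest` module is assumed.

```python
import doctest

# A tiny markdown document. The examples are indented four spaces, so a
# markdown converter renders them as a code block while doctest still
# recognises the ">>>" prompts.
MARKDOWN_DOC = '''\
# Adding numbers #

Some prose explaining the examples.

    >>> 1 + 1
    2
    >>> "md".upper()
    'MD'
'''


def run_embedded_doctests(text, name="document.md"):
    """Extract and run every doctest example found in `text`.

    Returns the (failed, attempted) summary from the doctest runner.
    (Hypothetical helper name, used only for this sketch.)
    """
    parser = doctest.DocTestParser()
    test = parser.get_doctest(text, {}, name, name, 0)
    runner = doctest.DocTestRunner(verbose=False)
    runner.run(test)
    return runner.summarize(verbose=False)


results = run_embedded_doctests(MARKDOWN_DOC)
print(results)
```

For a document stored on disk, `doctest.testfile("document.md", module_relative=False)` does much the same thing in one call.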

> My own implementation, for example, is
> [MultiMarkdown](http://fletcherpenney.net/multimarkdown/). It is written in C and is
> designed to compile on pretty much anything. Once installed, you have a
> simple binary that is extremely fast and easy to use, and offers a few
> command line options. It's easily used in shell scripts, and most
> languages offer the equivalent of a "system()" call so you can use external
> utilities inside of Perl, ruby, etc. It offers a bunch of extra features
> that many believe were missing from the original Markdown, but you can turn
> those off with the compatibility mode to imitate the output from "standard
> markdown", minus most of the bugs. ;)
>
> Fletcher
>
> On Jun 28, 2013, at 12:32 PM, Waylan Limberg <waylan at gmail.com> wrote:
>
> > Wolfgang,
> >
> > Which implementation of the markdown parser are you using? Perl, php,
> > ruby, python, javascript, ... (and many more) and which version specifically?
> >
> > As to an explanation, some implementations of the parser use MD5
> > hashes as placeholders for the already parsed pieces of the document. My
> > guess is that you found an edge case which trips up the code that swaps
> > out the placeholders for the parsed html.
> >
> > Waylan Limberg
> > waylan at gmail.com
> >
> > On Jun 28, 2013 11:55 AM, "Wolfgang Faust" <wolfgangmcq at gmail.com> wrote:
> > I was building a markdown document today when my document suddenly went
> > blank. When I looked at the HTML source, I found that all my codeblocks had
> > been MD5ified. The following is a minimal document which reproduces the
> > error:
> >
> > # Header #
> > <!-- There can be text here, but there doesn't have to be.
> > 	>> -->
> >
> > 	This is a codeblock.
> >
> > **Bold text** <!-- Another comment -->
> >
> > In particular, there must be:
> > • A header
> > • A comment containing the sequence NEWLINE TAB followed by at
> >   least two greater-than signs
> > • At least one codeblock
> > • Bold text
> > • Another comment at the end of the document.
> > Changing even the smallest detail in the markdown results in a correct
> > HTML document, as expected.
> >
> > When I run this through markdown v.1.0.1, I get:
> > <h1>Header</h1>
> >
> > <!--
> >
> >
> > 702c6078df02d6d43aa6003f415a0408
> >
> >
> > </blockquote>
> >
> >
> >
> > 46815d21b36c42e3ef8dcf757dd5758a
> >
> >
> >
> > **Bold text** <!-- Another comment -->
> >
> > What on earth is going on here, and who do I report this bug to?
>

Thank you, Fletcher and Waylan, for your help!