Mysterious MD5ification under very specific circumstances
wolfgangmcq at gmail.com
Fri Jun 28 13:17:51 EDT 2013
I happened across
http://www.cforcoding.com/2010/01/markdown-musings-on-unintended.html as I
was looking for information on why it was doing this. The comments are
> Fred Blasdel said...
> The PHP Markdown changelog should give you at least a hundred bugs in
Markdown.pl to test against — he started with a straight transliteration
(much like MarkdownSharp), and gradually made it less shitty.
> Gruber's design 'escapes' blocks by replacing them with their hashcodes,
but if the original input contains the same hashcodes — welcome to XSS city!
Lesson learned: Just because markdown.pl is the implementation listed on
the 'official' markdown page, and just because I can do `sudo apt-get
install markdown` and get it, does *not* mean that it's the best
implementation! I've switched to markdown_py (next one down on the list)
and everything is working fine now.
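
The placeholder scheme described in the quoted comment can be sketched in a few lines of Python. This is a simplified illustration, not the actual Markdown.pl code: the function names `hide_blocks` and `restore_blocks` are hypothetical, and only HTML comments are hidden here. It shows why input that already contains a placeholder hash gets rewritten when the placeholders are swapped back.

```python
import hashlib
import re


def md5_hex(text):
    """Return the hex MD5 digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def hide_blocks(source, stash):
    """Replace each HTML comment with its MD5 hash and remember the original."""
    def _hide(match):
        key = md5_hex(match.group(0))
        stash[key] = match.group(0)
        return key

    return re.sub(r"<!--.*?-->", _hide, source, flags=re.DOTALL)


def restore_blocks(text, stash):
    """Swap every stored hash back for the block it stands for."""
    for key, original in stash.items():
        text = text.replace(key, original)
    return text


def round_trip(source):
    """Hide the blocks, do no other processing, then restore them."""
    stash = {}
    return restore_blocks(hide_blocks(source, stash), stash)


# Ordinary input survives the round trip unchanged.
plain = "# Header #\n<!-- note -->\nbody text\n"
print(round_trip(plain) == plain)

# Input that already contains the hash of one of its own blocks does not:
# the literal hash is indistinguishable from a placeholder.
tricky = "<!-- note --> literal: " + md5_hex("<!-- note -->")
print(round_trip(tricky) == tricky)
```

Any processing step that runs between hiding and restoring only ever sees the hashes, so a placeholder that is missed or mangled on the way back ends up in the output as a bare hash.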
On Fri, Jun 28, 2013 at 12:44 PM, Fletcher Penney <
fletcher at fletcherpenney.net> wrote:
> By "markdown v.1.0.1" I'm guessing he meant Gruber's Perl Markdown.pl
> As for whom to report the bug to, Gruber's Markdown.pl is presumably not
> going to be updated further, and has not been updated in years (also known
> as an eternity in internet time... ;). Certainly someone on this list may
> have an interest in finding the bug and posting a fix, but you may be
> better off switching to a variant of Markdown that is still undergoing
> active development. As Waylan hinted, there are lots, and I'm sure
> everyone on the list has their favorites. Heck, many people on this list
> have written their own (myself included).....
> When choosing a variant, some things to consider:
> * What languages (if any) are you comfortable with if you want to change
> anything? If you're not changing anything, this may not matter.
> * Are you using Markdown in a larger project where the choice of language
> will have a significant impact on ease of use?
> * How important is performance? There can be several orders of
> magnitude difference between implementations.
> * Do you need extensions to the basic Markdown syntax (e.g. footnotes,
> tables, etc.)?
I appreciate the suggestions; frankly, I'm just looking for something that
will take basic markdown and HTMLify it so that I can make sure I got the
syntax right. I'm actually embedding python unit tests in a markdown
document via the doctest module, which is how I ran across this bug. I was
trying to comment out some initialization that didn't need to be shown in
the documentation, and suddenly my document went all funny.
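
For readers unfamiliar with the setup, here is a minimal sketch of running doctests embedded in a markdown document, with initialization hidden inside an HTML comment. The document text, the variable `base`, and the helper name `run_embedded_doctests` are made up for illustration; only the standard-library `doctest` module is used.

```python
import doctest

# A hypothetical markdown document. The setup line sits inside an HTML
# comment so it is not rendered, but doctest still finds and runs it.
DOCUMENT = """\
# Header #

<!--
    >>> base = 40

-->

The example shown to the reader:

    >>> base + 2
    42
"""


def run_embedded_doctests(text, name="document"):
    """Parse every >>> example in `text`, run them, and return the results."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(text, {}, name, None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)


results = run_embedded_doctests(DOCUMENT)
print(results.failed, "failed of", results.attempted)
```

The blank line before the closing `-->` matters: without it doctest would treat `-->` as the expected output of the hidden example.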
> My own implementation, for example, is [MultiMarkdown](
> http://fletcherpenney.net/multimarkdown/). It is written in C and is
> designed to compile on pretty much anything. Once installed, you have a
> simple binary that is extremely fast and easy to use, and offers a few
> command line options. It's easily used in shell scripts, and most
> languages offer the equivalent of a "system()" call so you can use external
> utilities inside of Perl, ruby, etc. It offers a bunch of extra features
> that many believe were missing from the original Markdown, but you can turn
> those off with the compatibility mode to imitate the output from "standard
> markdown", minus most of the bugs. ;)
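
As a rough sketch of the "system()" approach mentioned above, the following Python helper pipes markdown text to an external converter and returns whatever it prints. The helper name `render_with_external_tool` is hypothetical, and the command to run is left to the caller; that a given binary reads markdown on standard input and writes HTML to standard output is an assumption to check against that tool's own documentation.

```python
import subprocess


def render_with_external_tool(markdown_text, command):
    """Send markdown_text to `command` on stdin and return its stdout.

    `command` is a list such as ["some-markdown-binary"]; the binary name
    is an assumption supplied by the caller, not something verified here.
    """
    completed = subprocess.run(
        command,
        input=markdown_text,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the tool exits non-zero
    )
    return completed.stdout
```

Passing the command as a list avoids shell quoting problems, and `check=True` turns a failing converter into an exception rather than silently empty output.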
> On Jun 28, 2013, at 12:32 PM, Waylan Limberg <waylan at gmail.com> wrote:
> > Wolfgang,
> > Which implementation of the markdown parser are you using? Perl, php,
> > As to an explanation, some implementations of the parser use MD5
> hashes as placeholders for the already parsed pieces of the document. My
> guess is that you found an edge case which trips up the code that swaps
> out the placeholders for the parsed HTML.
> > Waylan Limberg
> > waylan at gmail.com
> > On Jun 28, 2013 11:55 AM, "Wolfgang Faust" <wolfgangmcq at gmail.com>
> > I was building a markdown document today when my document suddenly went
> blank. When I looked at the HTML source, I found that all my codeblocks had
> been MD5ified. The following is a minimal document which reproduces the
> > # Header #
> > <!-- There can be text here, but there doesn't have to be.
> > >> -->
> > This is a codeblock.
> > **Bold text** <!-- Another comment -->
> > In particular, there must be:
> > • A header
> > • A comment containing the sequence NEWLINE TAB followed by at
> least two greater-than signs
> > • At least one codeblock
> > • Bold text
> > • Another comment at the end of the document.
> > Changing even the smallest detail in the markdown results in a correct
> HTML document, as expected.
> > When I run this through markdown v.1.0.1, I get:
> > <h1>Header</h1>
> > <!--
> > 702c6078df02d6d43aa6003f415a0408
> > </blockquote>
> > 46815d21b36c42e3ef8dcf757dd5758a
> > **Bold text** <!-- Another comment -->
> > What on earth is going on here, and who do I report this bug to?
Thank you, Fletcher and Waylan, for your help!