Benchmarks with TextMate's manual
    Michel Fortin 
    michel.fortin at michelf.com
       
    Tue Aug 28 12:47:56 EDT 2007
    
    
  
Here's a little follow-up on what I wrote earlier, with a few more
details and some good news for PHP Markdown.
First, there was an error in my previous benchmarks. Merging all
the documents together creates a 176 Kb file, not 352 Kb as
previously mentioned. I'm not sure how this happened, but it seems I
performed the tests on an oversized file which contained the manual
twice. Performing the tests again on the right file, I get this:
     PHP Markdown 1.0.1g: 12 seconds
     Markdown.pl 1.0.1:   17 seconds
     (iBook G4 1.2 GHz)
which is still much slower than parsing each file separately, but is  
nevertheless better than the previous results (obviously, since the  
file is smaller).
Now, here is what I found about parsing.
At the core of PHP Markdown's speed problem is the "unhash" method,
which calls PHP's str_replace function with an array of all the
hashed values to replace every piece of hashed content it can find.
This array of hashed values grows more or less linearly with the
content size, and looping through each of these values for each
paragraph makes the parser O(n^2).
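
As a rough illustration of the cost described above, here is a small Python sketch. It is not PHP Markdown's code: the function names, the token format and the toy document are all assumptions made for the example. It mimics what str_replace does when given arrays (one replacement pass per search value) and counts how many passes a whole document needs.

```python
# Illustrative sketch only; names and token format are hypothetical,
# not taken from PHP Markdown's source.

def unhash_naive(text, hashes):
    """Try every hashed token against the text, one pass per token,
    the way PHP's str_replace behaves when given arrays."""
    scans = 0
    for token, original in hashes.items():
        text = text.replace(token, original)
        scans += 1
    return text, scans


def render(paragraphs, hashes):
    """Unhash each paragraph separately and total the replacement passes."""
    out, total_scans = [], 0
    for paragraph in paragraphs:
        restored, scans = unhash_naive(paragraph, hashes)
        out.append(restored)
        total_scans += scans
    return out, total_scans


def make_document(n):
    """Build a toy document: n paragraphs, each holding one hashed token."""
    hashes = {f"\x1a{i}\x1a": f"<b>{i}</b>" for i in range(n)}
    paragraphs = [f"para {token}" for token in hashes]
    return paragraphs, hashes


for n in (10, 20, 40):
    paragraphs, hashes = make_document(n)
    _, scans = render(paragraphs, hashes)
    print(n, scans)  # passes grow as n * n
```

In this toy model both the number of paragraphs and the size of the hash table grow with the document, so doubling the document quadruples the number of replacement passes.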
Now, one thing of interest is the result for the latest release of  
PHP Markdown (1.0.1h), still for the same 176 Kb file as above:
     PHP Markdown 1.0.1h: 66 seconds
Ouch! Not much has changed between 1.0.1g and 1.0.1h, but something
clearly isn't right. Version 1.0.1h is calling "unhash" much more
often than its predecessor, resulting in much worse performance,
which is especially noticeable with big files.
With "unhash" fixed now (using a regular expression!) and with some
other speed improvements, I can announce that the next version of PHP
Markdown will parse the one-document TextMate manual in about 1.5
seconds. This is now 0.5 seconds faster than parsing each of the
documents separately.
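
One plausible shape for a regular-expression-based "unhash" is sketched below in Python. This is an assumption about the general technique, not the code that shipped in PHP Markdown: the token format (digits wrapped in a control character) and the names are invented for the example. The idea is to find tokens in a single pass over the text and look each one up in a table, so the work depends on the text length rather than on how many hashed values exist.

```python
import re

# Hypothetical token format: digits wrapped in the \x1a control character.
TOKEN = re.compile("\x1a[0-9]+\x1a")


def unhash_regex(text, hashes):
    """Replace each token found in the text with its stored content.
    Unknown tokens are left untouched."""
    return TOKEN.sub(lambda m: hashes.get(m.group(0), m.group(0)), text)


hashes = {
    "\x1a0\x1a": "<em>a</em>",
    "\x1a1\x1a": "<code>b</code>",
}
print(unhash_regex("x \x1a0\x1a y \x1a1\x1a z", hashes))
```

Each match costs one dictionary lookup, so a paragraph containing two tokens does two lookups no matter how large the table has grown.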
I think I've also reached O(n) with PHP Markdown, at least in the  
general case. This is supported by parsing the big 352 Kb document in  
about 3 seconds. Twice the size, twice the time.
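
The "twice the size, twice the time" check can be written as a small calculation. The Python sketch below is only an illustration; the function name is made up, and the timings are the approximate figures quoted above. It estimates the exponent k in time ~ size^k from two measurements, where k near 1 suggests linear behaviour and k near 2 suggests quadratic.

```python
import math


def scaling_exponent(size_a, time_a, size_b, time_b):
    """Estimate k in time ~ size**k from two (size, time) measurements."""
    return math.log(time_b / time_a) / math.log(size_b / size_a)


# Approximate figures from this message: 176 Kb in about 1.5 s,
# 352 Kb in about 3 s.
k = scaling_exponent(176, 1.5, 352, 3.0)
print(round(k, 2))
```

Two data points with rounded timings are weak evidence on their own, so this is a sanity check rather than a proof of O(n).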
Also, I've included the TextMate manual in my local installation of  
the MDTest testsuite so I don't end up releasing a version of PHP  
Markdown that doesn't scale well in the future.
Michel Fortin
michel.fortin at michelf.com
http://www.michelf.com/
    
    
More information about the Markdown-Discuss
mailing list