Javascript in URLs (was: Markdown doesn't always generate XHTML)
    Michel Fortin 
    michel.fortin at michelf.com
       
    Sat Mar 15 08:17:48 EDT 2008
    
    
  
On 2008-03-15 at 0:39, Waylan Limberg wrote:
> On Fri, Mar 14, 2008 at 11:22 PM, Michel Fortin
>
>> PHP Markdown also has a no-markup mode which would filter script tags
>> and any other HTML tags. But this doesn't prevent anyone from
>> inserting their own script on the page. Do you know you can inject a
>> script in a URL? Guess what this does:
>>
>>     [link](javascript:alert%28'Hello%20world!'%29)
>
> This is a good point, and something I hadn't thought about myself. I
> would think that markdown should *not* allow that regardless of any
> safe/no-markup/whatever-you-call-it mode. If someone legitimately
> wants javascript in their links/images/etc then they should be writing
> raw html. What do you think?

Well, if you want your "safe" mode to be really safe, then sure, you
should not allow `javascript:` URIs.

But in general I believe Markdown should work with any URI. Markdown
is a means of writing web documents of all kinds, not only content from
external untrusted sources, and there are many legitimate reasons one
would want to write a `javascript:` URI.

Why would you want a "non-safe" Markdown to disallow such URIs in its
link syntax if we're going to be able to add them using HTML tags
anyway?

> Of course, then how do we do that? Some possibilities I came up with
> without much thought:
>
> 1. Truncate a url at "javascript:"
> 2. Completely remove the entire url (perhaps replace with blank
> string or "#")
> 3. Leave the markup for the entire link as plain text (in other words -
> it's not considered a match)
> 4. Do some kind of escaping (not sure what at this point) and leave it
> in the url

Whatever you do, you first have to detect script URIs, all of them;
this is no trivial matter. Most of these will run a script in IE or
some other browser (based on the [XSS cheat sheet][1]):

     [link](vbscript:msgbox%28%22Hello%20world!%22%29)
     [link](livescript:alert%28'Hello%20world!'%29)
     [link](mocha:[code])
     [link](jAvAsCrIpT:alert%28'Hello%20world!'%29)
     [link](ja vas cr ipt:alert%28'Hello%20world!'%29)
     [link](ja vas cr ipt:alert%28'Hello%20world!'%29)
     [link](ja vas cr ipt:alert%28'Hello%20world!'%29)
     [link](ja%09 %0Avas cr
ipt:alert%28'Hello 
%20world!'%29)
     [link](ja%20vas%20cr%20ipt:alert%28'Hello%20world!'%29)
     [link](live%20script:alert%28'Hello%20world!'%29)

I can't claim this is an exhaustive list, nor that they're all going
to work, but it should give an idea of the problem at hand.

I think blacklisting known dangerous schemes is always going to leave
holes. A better approach is to have a whitelist of known "safe" URI
schemes and disallow any scheme not in that list. But that would be
far too restrictive for any "non-safe" Markdown.
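
To make the whitelist idea concrete, here is a minimal Python sketch (not code from PHP Markdown or any other implementation). The `SAFE_SCHEMES` set and the `is_safe_url` name are hypothetical, and which schemes belong in the list is a policy choice. It decodes entities and percent-escapes and strips whitespace and control characters before looking at the scheme, on the assumption that some browsers ignore them there.

```python
import html
import re
from urllib.parse import unquote

# Hypothetical whitelist; adjust to whatever your "safe" mode should permit.
SAFE_SCHEMES = {"http", "https", "ftp", "mailto"}

def is_safe_url(url):
    """Return True if url has no scheme or a scheme listed in SAFE_SCHEMES."""
    # Decode HTML entities and percent-escapes so an obfuscated scheme
    # is compared in its plain form.
    decoded = unquote(html.unescape(url))
    # Drop whitespace and control characters (assumed to be ignored by
    # some browsers inside the scheme name).
    cleaned = re.sub(r"[\x00-\x20\x7f]", "", decoded)
    # A scheme is whatever precedes the first ":" when that colon comes
    # before any "/", "?" or "#".
    match = re.match(r"([^/?#:]*):", cleaned)
    if match is None:
        return True  # relative URL, no scheme at all
    return match.group(1).lower() in SAFE_SCHEMES
```

Anything not explicitly listed is rejected, so `vbscript:`, `livescript:`, `mocha:` and schemes nobody has thought of yet are all refused without having to be enumerated.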

Security filters already exist to do that (like kses); I'd say it's
much simpler *and* safer to use such a specialized filter on
Markdown's output than to try to come up with our own, integrated
within Markdown.
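
As an illustration of filtering Markdown's output rather than its input, here is a rough Python sketch built on the standard library's `html.parser`. It is not kses and not a substitute for a real, maintained filter; `LinkScrubber`, `scrub`, `scheme_allowed`, `ALLOWED` and `URL_ATTRS` are hypothetical names, and only `href` and `src` attributes are checked.

```python
import re
from html import escape
from html.parser import HTMLParser
from urllib.parse import unquote

ALLOWED = {"http", "https", "mailto"}   # hypothetical whitelist
URL_ATTRS = {"href", "src"}             # attributes checked in this sketch

def scheme_allowed(url):
    # Strip whitespace/control characters, then test the scheme, if any.
    cleaned = re.sub(r"[\x00-\x20\x7f]", "", unquote(url))
    match = re.match(r"([^/?#:]*):", cleaned)
    return match is None or match.group(1).lower() in ALLOWED

class LinkScrubber(HTMLParser):
    """Re-emit HTML, replacing disallowed URLs with "#"."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []

    def _emit_tag(self, tag, attrs, closer):
        parts = [tag]
        for name, value in attrs:
            if value is None:           # attribute without a value
                parts.append(name)
                continue
            # The parser has already decoded entities in the value.
            if name in URL_ATTRS and not scheme_allowed(value):
                value = "#"
            parts.append('%s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<" + " ".join(parts) + closer)

    def handle_starttag(self, tag, attrs):
        self._emit_tag(tag, attrs, ">")

    def handle_startendtag(self, tag, attrs):
        self._emit_tag(tag, attrs, " />")

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

def scrub(html_text):
    parser = LinkScrubber()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

Because this runs on the generated HTML, it treats links written in Markdown syntax and links written as raw HTML tags the same way, and Markdown itself stays free to accept any URI.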

  [1]: http://ha.ckers.org/xss.html

Michel Fortin
michel.fortin at michelf.com
http://michelf.com/