Detab should be multi-byte aware?

Michel Fortin michel.fortin at michelf.com
Mon Oct 9 18:52:18 EDT 2006


On 9 Oct 2006, at 17:02, Allan Odgaard wrote:


> As for #2, Markdown doesn’t know the encoding of the source
> document, so that would mean it can’t really be aware of things
> such as UTF-8 mb sequences, OTOH if it changes my pre-formatted
> text, I would like to have it do the right thing.


Currently, Markdown.pl and my own PHP implementation of Markdown both
support any superset of ASCII; that includes UTF-8. UTF-8 multi-byte
sequences have the interesting property of being composed entirely of
bytes above 127, that is, outside the ASCII range. So while Markdown
isn't really "aware" of these multi-byte sequences, in the sense that
it doesn't treat each one as a single character, it isn't changing
them into anything else either.
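
To illustrate the property, here is a small Python sketch. It is not
taken from Markdown.pl or PHP Markdown; the function name and the
sample string are made up for the example. It checks that every byte
of a multi-byte UTF-8 sequence is above 127, and that a byte-level
substitution on an ASCII character (a tab here) leaves those
sequences untouched.

```python
def multibyte_bytes_are_high(text: str) -> bool:
    """Return True if every multi-byte UTF-8 sequence in text
    consists only of bytes above 127 (outside the ASCII range)."""
    for ch in text:
        encoded = ch.encode("utf-8")
        if len(encoded) > 1 and any(b <= 127 for b in encoded):
            return False
    return True


# Hypothetical sample: accented letter, curly quote, CJK text, and a tab.
sample = "caf\u00e9 \u2019 \u65e5\u672c\u8a9e\tend"

assert multibyte_bytes_are_high(sample)

# Replace the tab at the byte level, without decoding the text first.
raw = sample.encode("utf-8")
processed = raw.replace(b"\t", b"    ")

# The result still decodes cleanly and matches a character-level replace.
assert processed.decode("utf-8") == sample.replace("\t", "    ")
```

Note that this only shows the bytes are preserved. A tool working on
bytes still counts a multi-byte sequence as several columns rather
than one.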

From your description of the problem, I believe you're not using UTF-8.


Michel Fortin
michel.fortin at michelf.com
http://www.michelf.com/



