converting html with \xa9 to Markdown and using iconv?
jgm at berkeley.edu
Thu Mar 22 19:00:14 EDT 2007
You could try html2markdown, which uses iconv, tidy, and pandoc.
It should have no trouble with these characters. It's included in the
pandoc distribution: http://sophos.berkeley.edu/macfarlane/pandoc/
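
As a rough illustration of the iconv step only (this is not how
html2markdown itself is implemented), the sketch below re-encodes a
directory of HTML files to UTF-8 before they are handed to a converter.
It assumes the files are ISO-8859-1, where the byte \xa9 is the copyright
sign; that is a guess based on the example, so check the real encoding
first. The function names and directory arguments are hypothetical.

```python
from pathlib import Path


def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8.

    Roughly what `iconv -f ISO-8859-1 -t UTF-8` does for a single file.
    Assumption: the input really is ISO-8859-1.
    """
    return data.decode("iso-8859-1").encode("utf-8")


def convert_tree(src_dir: str, dst_dir: str) -> int:
    """Write a UTF-8 copy of every *.html file under src_dir into dst_dir.

    Returns the number of files written. The originals are left untouched.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    count = 0
    # sorted() materializes the file list before anything is written.
    for path in sorted(src.rglob("*.html")):
        out = dst / path.relative_to(src)
        out.parent.mkdir(parents=True, exist_ok=True)
        out.write_bytes(latin1_to_utf8(path.read_bytes()))
        count += 1
    return count
```

The UTF-8 copies can then be run through an HTML-to-Markdown converter
one file at a time. Note that the charset declared inside each HTML file
is not rewritten here.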
+++ Jeremy C. Reed [Mar 22 07 15:52 ]:
> The HTML documents contain various characters like
> © \xa9 (Copyright symbol)
> (and others).
> I tried using html2text.py but it didn't like these characters.
> Any ideas on how I can use iconv or another tool to convert documents like
> this so I can then convert to Markdown?
> I don't want to do this manually, as I have 500+ documents.
> Jeremy C. Reed