[GRASS-dev] man pages in UTF-8
glynn at gclements.plus.com
Wed Mar 7 17:01:52 EST 2012
Markus Neteler wrote:
> for Fedora and other distros UTF-8 encoding of manual pages is required.
> How about changing all HTML files to UTF-8 (I can do that)?
> Any side effects to be expected?
It would be better to change the HTML source files to use entities
rather than any particular encoding. Currently, they use a mix of
ISO-8859-1 and UTF-8 (those which use UTF-8 won't show correctly,
because the files are treated as being in ISO-8859-1). In 7.0, the
only <module>.html files containing non-ASCII characters are:
No modules appear to use non-ASCII characters in their
--html-description output (at least, not for the "C" locale).
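As a rough sketch of the entity approach (this is not existing GRASS code; the function name to_entities and the "try UTF-8, fall back to ISO-8859-1" guess are assumptions), a source file could be normalised to pure ASCII like this:

```python
def to_entities(data: bytes) -> bytes:
    """Return ASCII-only HTML, with non-ASCII characters as numeric entities."""
    try:
        # Non-ASCII ISO-8859-1 text is very unlikely to be valid UTF-8
        # by accident, so try UTF-8 first.
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        # ISO-8859-1 maps every byte to a character, so this cannot fail.
        text = data.decode("iso-8859-1")
    # Replace each non-ASCII character with a numeric reference (&#NNN;).
    return text.encode("ascii", "xmlcharrefreplace")
```

Once the sources contain only ASCII plus entities, the charset declared in the "meta" tag no longer affects how they are displayed.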
mkhtml.py just copies the bytes verbatim, but it adds a "meta" tag to
the output indicating that the data is in ISO-8859-1:
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
g.html2man will need an option to select the output encoding (UTF-8 is
a GNU groff extension), and will need to convert the output to that
encoding; for UTF-8, it needs to add a byte order mark to the file so
that preconv recognises it as UTF-8.
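The output step might look roughly like the following. This is only a sketch: encode_man_page is a hypothetical helper, not part of g.html2man, and the "replace" fallback for characters missing from a non-UTF-8 target is a placeholder choice.

```python
import codecs

def encode_man_page(text: str, encoding: str = "utf-8") -> bytes:
    """Encode man page text, prepending a BOM when the target is UTF-8."""
    if encoding.lower() in ("utf-8", "utf8"):
        # The byte order mark lets preconv identify the file as UTF-8.
        return codecs.BOM_UTF8 + text.encode("utf-8")
    # Assumption: characters the target encoding lacks become "?".
    return text.encode(encoding, "replace")
```

For example, encode_man_page("λ") yields the three BOM bytes followed by the two-byte UTF-8 form of "λ", while encode_man_page("é", "iso-8859-1") yields the single byte 0xE9.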
The changes will be simpler if the input is in ASCII or ISO-8859-1.
They will be more complex if HTML files are allowed to use characters
outside of the Latin-1 repertoire (currently, this only affects
i.atcorr, which uses "λ", which ends up as "λ" in the
Glynn Clements <glynn at gclements.plus.com>