[GRASS5] Re: Freetype failure
roger at spinn.net
Mon Mar 27 17:34:11 EST 2006
Thanks for getting back to me, Glynn:
>> Also, the approach in the code seems to be rather indirect. It works by
>> converting UTF-8 codes to UCS-2BE codes and then converting the UCS-2BE
>> codes to FT_ULong for the freetype library.
>
> To be precise, it converts strings from the selected encoding (which
> may be UTF-8 or something else) to UCS-2BE.
Sorry, this is something I overlooked.
>> Not all UTF-8 codes can be represented in UCS-2BE.
>
> But does FreeType support anything beyond the 16-bit range?
I believe FreeType does. At any rate, FreeType uses a 32-bit value to store
its character codes, so support is possible. Whether or not
FreeType-compatible fonts are available for the whole range is another
question.
>> I wrote a UTF-8 to FT_ULong converter to get a more direct solution and
>> eliminated convert_str from the code. This is a working solution and
>> probably in most respects a better solution than the current text3.c.
>
> Except for the most important issue, namely that the input string is
> not necessarily in UTF-8; the encoding is specified by the charset=
> option to d.font.freetype. As the FreeType support in the display
> drivers was originally written to support Japanese, I suspect that
> most of the existing users of this functionality probably won't be
> using UTF-8.
UTF-8 represents the entire range of UCS. Existing Japanese, Korean,
Chinese (etc.) character encodings are incorporated in UCS and are
represented by UTF-8. That does not mean that everyone's software is
delivering UTF-8 encoding, but the time when that happens is probably not
too far off.
> Whilst a hard-coded UTF-8 to UCS-2 or UCS-4 decoder might be a useful
> fall-back for systems which don't have iconv, the iconv code needs to
> stay to support other encodings.
That makes sense, but if everything is to be funneled into one encoding then
I don't think it should be through UCS-2. There is the possibly academic
fact that UCS-2 doesn't represent all of UCS. Also, UTF-8 is expected to be
the future standard encoding and many of us are already working with it.
UTF-8 has been the default encoding in all major Linux distributions for a
couple of years now -- longer for some distros. I haven't heard that UCS-2 is
that widely used.
It makes more sense to translate anything that isn't already encoded in
UTF-8 into UTF-8, then decode UTF-8 to FreeType. That way UTF-8 systems
would not have to go through an encode-decode cycle.
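To make the "decode UTF-8 to FreeType" step concrete, here is a minimal sketch of a direct decoder. It is an illustration only, not the code from my patch: the name utf8_decode is hypothetical, and a plain unsigned long stands in for FT_ULong so the fragment compiles without the FreeType headers. It rejects overlong forms, surrogates, and values above U+10FFFF.

```c
#include <stddef.h>

/* Hypothetical helper (not taken from text3.c): decode one UTF-8 sequence
 * starting at s, reading at most len bytes. On success the code point is
 * stored in *out and the number of bytes consumed is returned. On malformed
 * or truncated input the function returns 0 and leaves *out untouched. */
static size_t utf8_decode(const unsigned char *s, size_t len, unsigned long *out)
{
    unsigned long cp;   /* code point being assembled */
    unsigned long min;  /* smallest value allowed for this sequence length */
    size_t need;        /* number of continuation bytes expected */
    size_t i;

    if (len == 0)
        return 0;

    if (s[0] < 0x80) {                    /* 0xxxxxxx: plain ASCII */
        *out = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {   /* 110xxxxx: 2-byte sequence */
        cp = s[0] & 0x1F; need = 1; min = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) {   /* 1110xxxx: 3-byte sequence */
        cp = s[0] & 0x0F; need = 2; min = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) {   /* 11110xxx: 4-byte sequence */
        cp = s[0] & 0x07; need = 3; min = 0x10000;
    } else {
        return 0;                         /* stray continuation or invalid lead byte */
    }

    if (len < need + 1)
        return 0;                         /* sequence is cut off */

    for (i = 1; i <= need; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                     /* not a 10xxxxxx continuation byte */
        cp = (cp << 6) | (s[i] & 0x3F);
    }

    /* Reject overlong encodings, surrogates, and values past U+10FFFF. */
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;

    *out = cp;
    return need + 1;
}
```

The value stored in *out could then be handed to FreeType as the FT_ULong character code, for example in a call to FT_Get_Char_Index(). A four-byte sequence such as F0 9D 84 9E decodes to U+1D11E, which does not fit in a single 16-bit UCS-2 unit.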
Roger Miller