[GRASS5] Re: Freetype failure

Mon Mar 27 17:34:11 EST 2006

Thanks for getting back to me, Glynn: 

>> Also, the approach in the code seems to be rather indirect.  It works by 
>> converting UTF-8 codes to UCS-2BE codes and then converting the UCS-2BE 
>> codes to FT_ULong for the freetype library.
> 
> To be precise, it converts strings from the selected encoding (which
> may be UTF-8 or something else) to UCS2-BE.

Sorry, this is something I overlooked. 

>> Not all UTF-8 codes can be represented in UCS-2BE.
> 
> But does FreeType support anything beyond the 16-bit range?

I believe FreeType does.  At any rate, FreeType uses a 32-bit value to store 
its character codes, so support is possible.  Whether or not 
FreeType-compatible fonts are available for the whole range is another 
question. 
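
To make the 16-bit limit concrete, here is a small sketch in C.  The 
helper names are mine and purely illustrative (they are not from text3.c 
or FreeType), and I use unsigned long where the FreeType code would use 
FT_ULong.  It only shows that a code point above U+FFFF does not survive 
being squeezed into a single 16-bit unit:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the code point fits in one
 * 16-bit UCS-2 unit, 0 otherwise. */
int fits_in_ucs2(unsigned long cp)
{
    return cp <= 0xFFFFUL;
}

/* Hypothetical helper: shows what is left of a code point when only
 * the low 16 bits are kept.  Anything above U+FFFF comes back as a
 * different, unrelated character code. */
uint16_t truncate_to_ucs2(unsigned long cp)
{
    assert(cp <= 0x10FFFFUL);   /* upper end of the Unicode range */
    return (uint16_t)(cp & 0xFFFFUL);
}
```

For example, U+3042 (a hiragana letter) fits, while U+10400 does not and 
would come out as U+0400 if it were simply truncated. 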

>> I wrote a UTF-8 to FT_ULong converter to get a more direct solution and 
>> eliminated convert_str from the code.  This is a working solution and 
>> probably in most respects a better solution than the current text3.c. 
> 
> Except for the most important issue, namely that the input string is
> not necessarily in UTF-8; the encoding is specified by the charset=
> option to d.font.freetype. As the FreeType support in the display
> drivers was originally written to support Japanese, I suspect that
> most of the existing users of this functionality probably won't be
> using UTF-8.

UTF-8 represents the entire range of UCS.  Existing Japanese, Korean, 
Chinese (etc.) character encodings are incorporated in UCS and can be 
represented in UTF-8.  That does not mean that everyone's software is 
delivering UTF-8 encoding, but the time when that happens is probably not 
too far off. 

> Whilst a hard-coded UTF-8 to UCS-2 or UCS-4 decoder might be a useful
> fall-back for systems which don't have iconv, the iconv code needs to
> stay to support other encodings.

That makes sense, but if everything is to be funneled into one encoding then 
I don't think it should be UCS-2.  There is the possibly academic fact that 
UCS-2 doesn't represent all of UCS.  Also, UTF-8 is expected to be the 
future standard encoding and many of us are already working with it.  
UTF-8 has been the default encoding in all major Linux distributions for a 
couple of years now -- longer for some distros.  I haven't heard that UCS-2 
is that widely used. 

It makes more sense to translate anything that isn't already encoded in 
UTF-8 into UTF-8, then decode UTF-8 to FreeType.  That way UTF-8 systems 
would not have to go through an encode-decode cycle. 
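
As a rough illustration of the decode step, here is a minimal sketch of a 
UTF-8 decoder in C.  It is not the converter I wrote for text3.c; the 
function names utf8_decode_one and utf8_to_codes are made up for this 
example, and unsigned long stands in for FT_ULong so the sketch compiles 
without the FreeType headers.  It assumes the 4-byte form of UTF-8 and 
rejects malformed or overlong sequences:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: decode one UTF-8 sequence of at most len bytes
 * starting at s.  Stores the code point in *out and returns the number
 * of bytes consumed, or 0 if the input is malformed. */
size_t utf8_decode_one(const unsigned char *s, size_t len, unsigned long *out)
{
    unsigned long cp, min;
    size_t need, i;

    assert(s != NULL && out != NULL);
    if (len == 0)
        return 0;

    if (s[0] < 0x80) {                      /* plain ASCII */
        *out = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {     /* 2-byte sequence */
        cp = s[0] & 0x1F; need = 2; min = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) {     /* 3-byte sequence */
        cp = s[0] & 0x0F; need = 3; min = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) {     /* 4-byte sequence */
        cp = s[0] & 0x07; need = 4; min = 0x10000;
    } else {
        return 0;                           /* stray continuation byte etc. */
    }

    if (len < need)
        return 0;                           /* truncated sequence */
    for (i = 1; i < need; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                       /* not a continuation byte */
        cp = (cp << 6) | (s[i] & 0x3F);
    }

    /* Reject overlong forms, surrogates and values past U+10FFFF. */
    if (cp < min || cp > 0x10FFFFUL || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;

    *out = cp;
    return need;
}

/* Hypothetical helper: decode a NUL-terminated UTF-8 string into at most
 * max character codes.  Stops at the first malformed sequence and
 * returns the number of codes written. */
size_t utf8_to_codes(const char *str, unsigned long *codes, size_t max)
{
    const unsigned char *p = (const unsigned char *)str;
    size_t left = strlen(str), n = 0, used;

    while (left > 0 && n < max) {
        used = utf8_decode_one(p, left, &codes[n]);
        if (used == 0)
            break;
        p += used;
        left -= used;
        n++;
    }
    return n;
}
```

Each resulting value could then be handed to FreeType as a character code 
without passing through a 16-bit form.  Input in some other encoding 
would still need to be converted to UTF-8 first (by iconv, say) before 
reaching a decoder like this. 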

Roger Miller