Setlocale and wchar_t C functions...

Steve Lime Steve.Lime at DNR.STATE.MN.US
Tue Jun 26 01:22:57 EDT 2007


Dan: I committed a bit of code tonight. It's in two places:

  - in mapprimitive.c, msPolylineLabelPath()

I use mbstowcs(NULL, string, 0); to get the number of characters in the label string. That function respects multibyte characters. At this point all we need is the number of characters.
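
For reference, the counting step can be sketched roughly as below. This is only an illustration, not the code committed to mapprimitive.c; the helper name count_label_chars is made up. It relies on the POSIX behaviour of mbstowcs() with a NULL destination, and the result depends on the current LC_CTYPE locale.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>  /* mbstowcs */

/* Hypothetical helper (name is an assumption, not a MapServer function).
 * Returns the number of characters in a multibyte string as interpreted
 * by the current LC_CTYPE locale, or (size_t)-1 if the string contains a
 * byte sequence that is invalid in that locale. */
size_t count_label_chars(const char *label)
{
    assert(label != NULL);
    /* With a NULL destination, POSIX mbstowcs() converts nothing and only
     * reports how many wide characters the string would need. */
    return mbstowcs(NULL, label, 0);
}
```

In the default "C" locale this only gives the expected answer for plain ASCII text; for a UTF-8 label the process has to be in a UTF-8 locale first.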

  - in mapgd.c, msDrawTextLineGD()

I convert the string to a wide string and then step through its characters, converting each one back to a set of bytes that GD can work with. GD seems smart enough to deal with multibyte UTF-8 characters; we just have to give them to GD one character at a time.
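
A rough sketch of that round trip, under stated assumptions: the names for_each_label_char, char_cb and sum_bytes are hypothetical, and the callback merely stands in for whatever hands one character to GD. It is not the code in mapgd.c, and it depends on the current LC_CTYPE locale just like the counting step does.

```c
#include <assert.h>
#include <limits.h>  /* MB_LEN_MAX */
#include <stdlib.h>  /* mbstowcs, wctomb, malloc, free, wchar_t */

/* Hypothetical callback type: receives one character as a NUL-terminated
 * multibyte sequence that is len bytes long. */
typedef void (*char_cb)(const char *bytes, int len, void *ctx);

/* Hypothetical helper: widens the label, then converts each wide character
 * back to its multibyte form and passes it to cb, one character at a time.
 * Returns the number of characters visited, or -1 on error. */
int for_each_label_char(const char *label, char_cb cb, void *ctx)
{
    assert(label != NULL && cb != NULL);

    size_t n = mbstowcs(NULL, label, 0);       /* character count */
    if (n == (size_t)-1)
        return -1;                             /* invalid in this locale */

    wchar_t *wide = malloc((n + 1) * sizeof *wide);
    if (wide == NULL)
        return -1;
    mbstowcs(wide, label, n + 1);              /* actual conversion */

    (void)wctomb(NULL, L'\0');                 /* reset any shift state */
    for (size_t i = 0; i < n; i++) {
        char buf[MB_LEN_MAX + 1];
        int len = wctomb(buf, wide[i]);        /* back to bytes */
        if (len < 0) {
            free(wide);
            return -1;
        }
        buf[len] = '\0';
        cb(buf, len, ctx);
    }

    free(wide);
    return (int)n;
}

/* Example callback: adds up the bytes seen; ctx must point to an int. */
void sum_bytes(const char *bytes, int len, void *ctx)
{
    (void)bytes;
    *(int *)ctx += len;
}
```

A real drawing callback would render buf at the position and angle computed for that character along the label path.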

It's straightforward, but it needs a locale set to work. The link Umberto shared seems to say that C programs should do a setlocale(LC_CTYPE, ""); to derive the locale from environment variables.
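
As a minimal sketch of that startup call (the function names here are made up for illustration and nothing like them exists in MapServer as far as this thread shows): passing "" asks the C library to pick the locale from the environment (LC_ALL, LC_CTYPE, LANG), and MB_CUR_MAX then tells you whether the resulting locale is a multibyte one.

```c
#include <assert.h>
#include <locale.h>  /* setlocale, LC_CTYPE */
#include <stdlib.h>  /* MB_CUR_MAX */

/* Hypothetical helper: selects the character-type locale named by the
 * environment. If the environment names a locale the system does not
 * have, setlocale() returns NULL and the current locale is left as it
 * was, so we report that one instead. */
const char *init_ctype_from_env(void)
{
    const char *name = setlocale(LC_CTYPE, "");
    if (name == NULL)
        name = setlocale(LC_CTYPE, NULL);  /* query only, changes nothing */
    assert(name != NULL);
    return name;
}

/* Returns 1 if characters in the current locale may span several bytes
 * (UTF-8, for example), and 0 for single-byte locales such as "C". */
int locale_is_multibyte(void)
{
    return MB_CUR_MAX > 1;
}
```

Only LC_CTYPE is touched here, so number formatting (LC_NUMERIC) and the other categories stay in the "C" locale.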

Please take a look and let me know your thoughts. I don't see how iconv helps once you have the UTF-8 string, but I may be missing something obvious. Note that I did move the conversion to UTF-8 from the ENCODING in the labelObj (if supplied) to msDrawShape so that label placement is based on the UTF-8 string rather than the ASCII version.

Steve

>>> Daniel Morissette <dmorissette at MAPGEARS.COM> 06/22/07 3:01 PM >>>
Um... based on what I've read in the past I think setlocale() is a bad idea.

When I last looked at bug 1921 I thought we could deal with the issue 
using libiconv only, since, if I remember correctly, the iconv() 
function processes one character at a time so I thought we could use it 
to iterate over all mbyte characters in the string. I could try to find 
some time to have another look (early next week?) before we go the scary 
route of setting global locales and stuff like that.
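
One way to limit the global effect, sketched here purely as an assumption and not something anyone in this thread has proposed or tested, is to switch LC_CTYPE only around the call that needs it and put the previous value back afterwards. The helper name is hypothetical. Since setlocale() changes process-wide state, this is still not safe if other threads are running.

```c
#include <assert.h>
#include <locale.h>  /* setlocale, LC_CTYPE */
#include <stdlib.h>  /* mbstowcs, malloc, free */
#include <string.h>  /* strlen, strcpy */

/* Hypothetical helper: counts the characters of s as interpreted in the
 * named locale, then restores the previous LC_CTYPE setting.
 * Returns (size_t)-1 if the locale is unavailable, memory runs out, or
 * the string is invalid in that locale. */
size_t count_chars_in_locale(const char *s, const char *locale_name)
{
    assert(s != NULL && locale_name != NULL);

    size_t n = (size_t)-1;

    /* The string returned by setlocale() may be overwritten by the next
     * call, so keep a private copy of the current setting. */
    const char *current = setlocale(LC_CTYPE, NULL);
    if (current == NULL)
        return n;
    char *saved = malloc(strlen(current) + 1);
    if (saved == NULL)
        return n;
    strcpy(saved, current);

    if (setlocale(LC_CTYPE, locale_name) != NULL) {
        n = mbstowcs(NULL, s, 0);     /* count under the new locale */
        setlocale(LC_CTYPE, saved);   /* put the old locale back */
    }

    free(saved);
    return n;
}
```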

Daniel



Steve Lime wrote:
> Hi all: I'm working to resolve bug 1921, which deals with curved labels and multibyte character sets. The current label path
> code doesn't handle multibyte character sets. Fortunately standard C (on Linux and MacOS anyway) has some functions
> to deal with the problem. For example mbstowcs converts from a multibyte string to a wide character string. The behavior
> of those functions depends on the locale. Unless you are in a locale with multibyte characters those functions don't work
> with multibyte chars and assume single byte chars. Found that out trying to draw Chinese characters with the default US-EN
> locale. If I set the locale to one with multibyte characters then I could debug.
> 
> A couple of questions then:
> 
>   - what would folks think about adding a LOCALE parameter to the main mapObj? We could do a setlocale(LC_ALL, msyytext)
> immediately. If not set, then whatever the system has set takes over.
> 
>   - as I understand it setlocale has a global effect so perhaps there might be some downstream side effects, although I can't
> think what they might be. I suppose for long running processes there could be issues. I can see where a company might want
> to provide maps in different languages from the same server.
> 
>   - are the wide character functions (stdlib.h) available on all platforms?
> 
> Steve 


-- 
Daniel Morissette
http://www.mapgears.com/



More information about the mapserver-dev mailing list