Setlocale and wchar_t C functions...

Daniel Morissette dmorissette at MAPGEARS.COM
Wed Jun 27 10:46:29 EDT 2007


Steve,

I have started working on something that would not involve the locale 
stuff, i.e. writing my own msGetUTF8Char() function to parse/extract the 
UTF-8 chars one at a time from the UTF-8 string. Could you please send 
me your test dataset/mapfile off-list so that I can try and see if my 
idea works?
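
A minimal sketch of what such a function might look like, relying only on the UTF-8 lead-byte patterns and never on the locale. The signature below (an output buffer plus a returned byte count) and the countUTF8Chars() helper are assumptions made for illustration, not committed code:

```c
#include <string.h>

/* Hypothetical signature: copy the next UTF-8 character from `in` into
 * `out` (NUL-terminated, so `out` needs at least 5 bytes) and return the
 * number of bytes consumed, or 0 at the end of the string. */
int msGetUTF8Char(const char *in, char *out)
{
    const unsigned char *s = (const unsigned char *)in;
    int len, i;

    if (s[0] == '\0') {
        out[0] = '\0';
        return 0;
    }

    /* The lead byte announces the length of the sequence. */
    if (s[0] < 0x80)                len = 1;  /* 0xxxxxxx: ASCII */
    else if ((s[0] & 0xE0) == 0xC0) len = 2;  /* 110xxxxx */
    else if ((s[0] & 0xF0) == 0xE0) len = 3;  /* 1110xxxx */
    else if ((s[0] & 0xF8) == 0xF0) len = 4;  /* 11110xxx */
    else                            len = 1;  /* invalid lead byte: pass it through alone */

    /* Cut the sequence short if a continuation byte (10xxxxxx) is missing,
     * which also keeps us from reading past the terminating NUL. */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) {
            len = i;
            break;
        }
    }

    memcpy(out, in, (size_t)len);
    out[len] = '\0';
    return len;
}

/* Example use: count the characters (not bytes) in a UTF-8 label. */
int countUTF8Chars(const char *string)
{
    char ch[5];
    int n, count = 0;

    while ((n = msGetUTF8Char(string, ch)) > 0) {
        string += n;
        count++;
    }
    return count;
}
```

The same loop could hand each `ch` to the renderer one character at a time instead of just counting.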

Daniel


Steve Lime wrote:
> Dan: I committed a bit of code tonight. It's in two places:
> 
>   - in mapprimitive.c, msPolylineLabelPath()
> 
> I use mbstowcs(NULL, string, 0); to get the number of characters in the label string. That function respects multibyte characters. At this point all we need is the number of characters.
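
For reference, the counting step described above can be sketched roughly as follows. The wrapper name labelCharCount() is hypothetical; passing NULL as the destination to get only the length is behaviour described by POSIX, and the result depends on the current LC_CTYPE locale:

```c
#include <stdlib.h>

/* Hypothetical wrapper: return the number of characters in a multibyte
 * string as seen by the current LC_CTYPE locale, or -1 if the string is
 * not a valid multibyte sequence in that locale. */
long labelCharCount(const char *string)
{
    /* With a NULL destination, mbstowcs() only measures the string. */
    size_t n = mbstowcs(NULL, string, 0);

    return (n == (size_t)-1) ? -1L : (long)n;
}
```

Without a suitable locale, a UTF-8 label may be counted byte by byte or rejected outright, depending on the C library.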
> 
>   - in mapgd.c, msDrawTextLineGD()
> 
> I convert the string to a wide string and then step through those characters, converting each one back to a set of bytes that GD can work with. GD seems smart enough to deal with multibyte UTF-8 characters; we just have to give them to it one character at a time.
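
A rough sketch of that wide-string round trip, assuming a hypothetical callback in place of the actual GD drawing call (forEachLabelChar(), glyphFn and countGlyph() are illustrative names, not code from mapgd.c):

```c
#include <limits.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical callback standing in for the per-character draw call. */
typedef void (*glyphFn)(const char *bytes, void *userdata);

/* Sample callback: just counts how many characters were delivered. */
void countGlyph(const char *bytes, void *userdata)
{
    (void)bytes;
    (*(int *)userdata)++;
}

/* Convert `string` to wide characters, then hand each character back to
 * `fn` as its own NUL-terminated multibyte string. Returns the number of
 * characters processed, or -1 on a conversion or allocation failure. */
int forEachLabelChar(const char *string, glyphFn fn, void *userdata)
{
    char buf[MB_LEN_MAX + 1];
    wchar_t *wide;
    size_t i, n;

    n = mbstowcs(NULL, string, 0);          /* measure first */
    if (n == (size_t)-1)
        return -1;

    wide = malloc((n + 1) * sizeof(wchar_t));
    if (wide == NULL)
        return -1;
    mbstowcs(wide, string, n + 1);          /* convert, including the NUL */

    for (i = 0; i < n; i++) {
        int len = wctomb(buf, wide[i]);     /* back to bytes, one character */
        if (len < 0) {
            free(wide);
            return -1;
        }
        buf[len] = '\0';
        fn(buf, userdata);
    }

    free(wide);
    return (int)n;
}
```

Both conversions follow the current LC_CTYPE locale, which is why this approach needs a multibyte locale to be set first.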
> 
> It's straightforward, but it needs a locale to be set in order to work. The link Umberto shared seems to say that C programs should do a setlocale(LC_CTYPE, ""); to derive the locale from environment variables.
> 
> Please take a look and let me know your thoughts. I don't see how iconv helps once you have the UTF-8 string but I may be missing something obvious. Note that I did move conversion to UTF-8 from the ENCODING in the labelObj (if supplied) to msDrawShape so that label placement is based on the UTF-8 string rather than the ASCII version.
> 
> Steve
> 
>>>> Daniel Morissette <dmorissette at MAPGEARS.COM> 06/22/07 3:01 PM >>>
> Um... based on what I've read in the past I think setlocale() is a bad idea.
> 
> When I last looked at bug 1921 I thought we could deal with the issue 
> using libiconv only, since, if I remember correctly, the iconv() 
> function processes one character at a time so I thought we could use it 
> to iterate over all mbyte characters in the string. I could try to find 
> some time to have another look (early next week?) before we go the scary 
> route of setting global locales and stuff like that.
> 
> Daniel
> 
> 
> 
> Steve Lime wrote:
>> Hi all: I'm working to resolve bug 1921, which deals with curved labels and multibyte character sets. The current label path
>> code doesn't handle multibyte character sets. Fortunately standard C (on Linux and MacOS anyway) has some functions
>> to deal with the problem. For example mbstowcs converts from a multibyte string to a wide character string. The behavior
>> of those functions depends on the locale. Unless you are in a locale with multibyte characters those functions don't work
>> with multibyte chars and assume single byte chars. Found that out trying to draw Chinese characters with the default en_US
>> locale. If I set the locale to one with multibyte characters then I could debug.
>>
>> A couple of questions then:
>>
>>   - what would folks think about adding a LOCALE parameter to the main mapObj? We could do a setlocale(LC_ALL, msyytext)
>> immediately. If not set, then whatever the system has set takes over.
>>
>>   - as I understand it setlocale has a global effect so perhaps there might be some downstream side effects, although I can't
>> think what they might be. I suppose for long running processes there could be issues. I can see where a company might want
>> to provide maps in different languages from the same server.
>>
>>   - are the wide character functions (stdlib.h) available on all platforms?
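
On the global-effect question above, one way to limit the exposure would be to change only LC_CTYPE, and only around the conversion calls, restoring the previous value afterwards. This is a sketch under that assumption; pushCtypeLocale() and popCtypeLocale() are hypothetical helpers:

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: switch LC_CTYPE to `locale` and return a malloc'd
 * copy of the previous setting, or NULL if the switch failed. */
char *pushCtypeLocale(const char *locale)
{
    const char *current = setlocale(LC_CTYPE, NULL);  /* query only */
    char *saved;

    if (current == NULL)
        return NULL;

    /* Copy the name: the returned string may be overwritten by the
     * next setlocale() call. */
    saved = malloc(strlen(current) + 1);
    if (saved == NULL)
        return NULL;
    strcpy(saved, current);

    if (setlocale(LC_CTYPE, locale) == NULL) {
        free(saved);
        return NULL;
    }
    return saved;
}

/* Hypothetical helper: restore the setting returned by pushCtypeLocale(). */
void popCtypeLocale(char *saved)
{
    if (saved != NULL) {
        setlocale(LC_CTYPE, saved);
        free(saved);
    }
}
```

Touching LC_CTYPE alone leaves categories such as LC_NUMERIC untouched, so number parsing and formatting keep their usual decimal point. The change is still process-wide while it is in effect, so this does not make the approach safe for multithreaded servers.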
>>
>> Steve 
> 
> 


-- 
Daniel Morissette
http://www.mapgears.com/



More information about the mapserver-dev mailing list