[mapserver-users] Confirmation of status of UTF8 support, and where transcoding to Latin-1 may be happening.

Russell McOrmond russell at flora.ca
Tue Jan 6 10:12:30 PST 2009


   I hope people don't mind me posting as I learn things, hoping that it 
will spark some ideas from other people.

On Sun, 4 Jan 2009, Russell McOrmond wrote:

> Howard Butler wrote on November 24, 2008 @ 05:17 PM:

>> pictures looked right.  The problem might be as simple as the function 
>> msConvertWideStringToUTF8 being broken.  Here's where MapServer tries to 
>> convert it: 
>> http://trac.osgeo.org/mapserver/browser/trunk/mapserver/mapsde.c#L750
>
>
> The more I look at this, the more confused I get.

   I'm now back at the customer (Xmas break), and confirmed that I'm not 
even making use of this code.  The relevant strings are SE_STRING_TYPE and 
not SE_NSTRING_TYPE.  I have also confirmed (with msDebug() statements) that 
the characters coming out of SE_stream_get_string() are Latin-1 encoded, 
and not UTF-8 encoded.
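
   For anyone wanting to repeat that check, here is a minimal sketch of how 
the bytes of a string can be classified.  The helper name looks_like_utf8() 
is made up for this example and is not part of MapServer or SDE.  It only 
checks that lead and continuation bytes line up, which is enough to tell 
Latin-1 accented text apart from UTF-8 in practice.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical diagnostic helper (not a MapServer function).
 * Returns 1 if s consists only of structurally well-formed UTF-8
 * sequences, 0 otherwise.  Overlong forms and surrogates are not
 * rejected; this is a quick check, not a full validator. */
int looks_like_utf8(const unsigned char *s)
{
    assert(s != NULL);
    while (*s) {
        int extra;                                 /* continuation bytes expected */
        if (*s < 0x80)                extra = 0;   /* plain ASCII */
        else if ((*s & 0xE0) == 0xC0) extra = 1;   /* 110xxxxx */
        else if ((*s & 0xF0) == 0xE0) extra = 2;   /* 1110xxxx */
        else if ((*s & 0xF8) == 0xF0) extra = 3;   /* 11110xxx */
        else return 0;          /* stray continuation or invalid lead byte */
        s++;
        while (extra-- > 0) {
            /* A NUL terminator also fails this test, so we never run
             * past the end of a truncated sequence. */
            if ((*s & 0xC0) != 0x80) return 0;
            s++;
        }
    }
    return 1;
}
```

   A Latin-1 "é" is the single byte 0xE9, which looks like a UTF-8 lead byte 
with no valid continuation, so the function returns 0 for it.  The UTF-8 
form is the two bytes 0xC3 0xA9, for which it returns 1.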

   Our database person confirmed that the data is encoded as UTF-8 in the 
database.  This suggests to me that it is SDE itself, or the libsde.so 
client library, that is doing the transcoding to Latin-1.

   I'm curious if anyone knows if libsde.so has an equivalent to Oracle's 
NLS_LANG environment variable?

   It seems the decision to use STRING rather than NSTRING came down to the 
label functions.  I haven't looked at those functions yet to determine 
whether they need Latin-1, which would explain why things are working with 
STRING (which comes in as Latin-1) rather than NSTRING (which the code 
suggests should be transcoded to UTF-8).


   Another issue, this time with iconv.  I know this is not a mapserver 
issue, but it is possible that someone has seen something similar.

   We created some tables with strings in NSTRING.  I then got the error 
"msConvertWideStringToUTF8(): General error message. Encoding not 
supported by libiconv(UTF-16)"

   I'm a bit stuck, as any call to iconv_open() returns -1 no matter what I 
put for the from and to encodings.  The manual for iconv_open() says that 
when it returns -1 it sets errno, but errno is left unchanged:

         errno = 1;
         cd = iconv_open("ISO-8859-1", "UTF-8");
         msDebug("errno= %d cd=%d\n",errno,cd);

errno= 1 cd=-1

         errno = 5;
         cd = iconv_open("UTF-8", "UTF-16");
         msDebug("errno= %d cd=%d\n",errno,cd);

errno= 5 cd=-1


         errno = 123;
         cd = iconv_open("", "UTF-16");
         msDebug("errno= %d cd=%d\n",errno,cd);

errno= 123 cd=-1


I've tried loading GNU iconv first:

export LD_PRELOAD=/server/ndevl18/apache-2.2.9/lib/preloadable_libiconv.so

ldd then shows that library first, but no difference.

   When I use the command line 'iconv' utility, it can convert from UTF-8 
to ISO-8859-1 with no problem.

   Has anyone seen a problem like this?

OS: Red Hat Enterprise Linux ES release 4 (Nahant Update 4)
ICONV: http://ftp.gnu.org/pub/gnu/libiconv/libiconv-1.12.tar.gz
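
   A standalone check outside of MapServer can help narrow this down.  The 
following is a minimal sketch, assuming an iconv whose iconv() takes a 
char ** input pointer (as glibc does; some libiconv builds declare it 
const char **).  The helper name convert_string() is made up for this 
example.  Note that iconv_t is an opaque type, so the result is compared 
against (iconv_t)-1 rather than printed with %d, and that iconv_open() 
takes the target encoding first.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: converts the NUL-terminated string `in` from
 * encoding `from` to encoding `to`, writing a NUL-terminated result
 * into out.  Returns 0 on success, -1 on failure with errno preserved
 * from the failing iconv call. */
int convert_string(const char *to, const char *from,
                   const char *in, char *out, size_t outsize)
{
    assert(outsize > 0);

    errno = 0;
    iconv_t cd = iconv_open(to, from);     /* tocode first, then fromcode */
    if (cd == (iconv_t)-1) {
        int saved = errno;                 /* fprintf may clobber errno */
        fprintf(stderr, "iconv_open(\"%s\", \"%s\") failed: %s (errno=%d)\n",
                to, from, strerror(saved), saved);
        errno = saved;
        return -1;
    }

    char *inp = (char *)in;                /* iconv() does not modify the data */
    size_t inleft = strlen(in);
    char *outp = out;
    size_t outleft = outsize - 1;          /* keep room for the terminator */

    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    int saved = errno;
    iconv_close(cd);
    if (rc == (size_t)-1) {
        errno = saved;
        return -1;
    }
    *outp = '\0';
    return 0;
}
```

   If this works when built as a small separate program but the same calls 
fail inside the MapServer binary, that would point at how the binary was 
compiled or linked rather than at the encodings requested.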

> familiarity at this point) figure out what is going.  It is a patch to 
> msEncodeHTMLEntities to encode these characters.  As entities they will work 
> as the browser won't care what encoding it thinks the page should be in.
>
>  I added the patch here: http://trac.osgeo.org/mapserver/ticket/2842

   Turns out that while this solves my getFeatureInfo problem, it 
introduces more problems.  The various .map files have strings in them 
that are UTF-8 encoded.  In this situation we have words like 
"générale" coming out as "gÃ©nÃ©rale" when we do a 
request=getcapabilities
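
   To illustrate why a byte-oriented entity encoder helps with Latin-1 input 
but hurts with UTF-8 input, here is a small sketch.  It is not the patch 
attached to the ticket; the function name encode_bytes_as_entities() is 
made up, and it simply assumes every byte of 0x80 or above is one Latin-1 
character.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical illustration only, not the code from the ticket.
 * Copies `in` to `out`, replacing each byte >= 0x80 with a numeric
 * character reference for that byte value.  Output is truncated
 * rather than overflowed if out is too small. */
void encode_bytes_as_entities(const char *in, char *out, size_t outsize)
{
    assert(outsize > 0);
    size_t used = 0;
    out[0] = '\0';

    for (const unsigned char *p = (const unsigned char *)in; *p; p++) {
        char piece[8];                       /* "&#255;" plus NUL fits */
        if (*p < 0x80)
            snprintf(piece, sizeof piece, "%c", *p);
        else
            snprintf(piece, sizeof piece, "&#%u;", (unsigned)*p);

        size_t n = strlen(piece);
        if (used + n >= outsize)
            break;                           /* no room left, stop here */
        memcpy(out + used, piece, n + 1);    /* copies the NUL as well */
        used += n;
    }
}
```

   With Latin-1 input the byte 0xE9 becomes "&#233;", which a browser shows 
as "é".  With UTF-8 input the same letter is the two bytes 0xC3 0xA9, which 
become "&#195;&#169;", two separate characters as far as the browser is 
concerned.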

   Seems things are never as simple as you first think.

-- 
  Russell McOrmond, Internet Consultant: <http://www.flora.ca/>
  Please help us tell the Canadian Parliament to protect our property
  rights as owners of Information Technology. Sign the petition!
  http://digital-copyright.ca/petition/ict/     http://KillBillC61.ca

  "The government, lobbied by legacy copyright holders and hardware
   manufacturers, can pry control over my camcorder, computer,
   home theatre, or portable media player from my cold dead hands!"