[mapserver-users] Confirmation of status of UTF8 support, and where transcoding to Latin-1 may be happening.
Russell McOrmond
russell at flora.ca
Tue Jan 6 10:12:30 PST 2009
I hope people don't mind me posting as I learn things, hoping that it
will spark some ideas from other people.
On Sun, 4 Jan 2009, Russell McOrmond wrote:
> Howard Butler wrote on November 24, 2008 @ 05:17 PM:
>> pictures looked right. The problem might be as simple as the function
>> msConvertWideStringToUTF8 being broken. Here's where MapServer tries to
>> convert it:
>> http://trac.osgeo.org/mapserver/browser/trunk/mapserver/mapsde.c#L750
>
>
> The more I look at this, the more confused I get.
I'm now back at the customer's site (after the Xmas break), and have
confirmed that I'm not even making use of this code. The relevant strings
are SE_STRING_TYPE and not SE_NSTRING_TYPE. I have also confirmed (with
msDebug() statements) that
the characters coming out of SE_stream_get_string() are Latin-1 encoded,
and not UTF-8 encoded.
Our database person confirmed that the data is encoded as UTF-8 in the
database. This suggests to me that it is SDE itself or the libsde.so
client library that is doing the transcoding to Latin-1.
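
For anyone wanting to repeat the check on the bytes returned by SE_stream_get_string(), one option is to test whether the buffer is well-formed UTF-8. The following is only a minimal sketch: looks_like_utf8() is a hypothetical helper, not part of MapServer or libsde, and it does not reject every invalid sequence (overlong three-byte forms and surrogates pass). A Latin-1 string containing accented characters will almost always fail the check, because a byte such as 0xE9 is not followed by UTF-8 continuation bytes.

```c
#include <stddef.h>

/* Hypothetical helper (not a MapServer or SDE function).
 * Returns 1 if the NUL-terminated buffer looks like well-formed UTF-8,
 * 0 otherwise. Simplified: overlong 3-byte forms and surrogates are
 * not rejected. */
static int looks_like_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    while (*p) {
        int extra;                       /* continuation bytes expected */

        if (*p < 0x80)                    extra = 0;  /* plain ASCII */
        else if (*p >= 0xC2 && *p <= 0xDF) extra = 1;  /* 2-byte lead */
        else if (*p >= 0xE0 && *p <= 0xEF) extra = 2;  /* 3-byte lead */
        else if (*p >= 0xF0 && *p <= 0xF4) extra = 3;  /* 4-byte lead */
        else return 0;                    /* stray continuation or bad lead */

        p++;
        while (extra-- > 0) {
            /* Every continuation byte must be 10xxxxxx; a NUL here
             * also fails the test, so we never run past the end. */
            if ((*p & 0xC0) != 0x80)
                return 0;
            p++;
        }
    }
    return 1;
}
```

Passing "générale" as UTF-8 bytes (g C3 A9 n C3 A9 r a l e) returns 1, while the Latin-1 form (g E9 n E9 r a l e) returns 0.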
I'm curious if anyone knows if libsde.so has an equivalent to Oracle's
NLS_LANG environment variable?
It seems the decision to use STRING rather than NSTRING came down to the
label functions. I haven't looked at those functions yet to determine
whether they need Latin-1, which would explain why things work with
STRING (which comes in as Latin-1) rather than NSTRING (which the code
suggests should be transcoded to UTF-8).
Another issue, this time with iconv. I know this is not a mapserver
issue, but it is possible that someone has seen something similar.
We created some tables with strings in NSTRING. I then got the error
"msConvertWideStringToUTF8(): General error message. Encoding not
supported by libiconv(UTF-16)"
I'm a bit stuck, as any call to iconv_open() returns -1 no matter what I
pass for the "to" and "from" encodings. The manual for iconv_open() says
that when it returns -1 it sets errno, but errno keeps whatever value I
assigned to it before the call:
    errno = 1;
    cd = iconv_open("ISO-8859-1", "UTF-8");
    msDebug("errno= %d cd=%d\n",errno,cd);
errno= 1 cd=-1

    errno = 5;
    cd = iconv_open("UTF-8", "UTF-16");
    msDebug("errno= %d cd=%d\n",errno,cd);
errno= 5 cd=-1

    errno = 123;
    cd = iconv_open("", "UTF-16");
    msDebug("errno= %d cd=%d\n",errno,cd);
errno= 123 cd=-1
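
A somewhat stricter version of this test is sketched below. It relies only on the standard iconv_open() / iconv_close() interface: the first argument is the target encoding, the second is the source encoding, and failure is reported as (iconv_t)-1. Because iconv_t is an opaque type, printing it with %d is not portable, so the sketch compares against (iconv_t)-1 instead and reports strerror(errno). try_iconv_open() is a hypothetical helper name, and this is not offered as an explanation of the failure above, only as a way to narrow it down.

```c
#include <errno.h>
#include <iconv.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: try to open a conversion descriptor.
 * Returns 0 on success, or the errno value reported by iconv_open(). */
static int try_iconv_open(const char *tocode, const char *fromcode)
{
    iconv_t cd;

    errno = 0;                          /* start from a known state */
    cd = iconv_open(tocode, fromcode);  /* note the order: to, then from */

    if (cd == (iconv_t)-1) {
        int err = errno;                /* save before calling anything else */
        fprintf(stderr, "iconv_open(\"%s\", \"%s\") failed: %s (errno=%d)\n",
                tocode, fromcode, strerror(err), err);
        return err;
    }

    iconv_close(cd);
    return 0;
}
```

Calling try_iconv_open("UTF-8", "ISO-8859-1") from a small standalone program, built against the same headers and library as MapServer, would show whether the problem follows the library or the MapServer build.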
I've tried preloading GNU libiconv:
export LD_PRELOAD=/server/ndevl18/apache-2.2.9/lib/preloadable_libiconv.so
ldd then lists that library first, but it makes no difference.
When I use the command line 'iconv' utility, it can convert from UTF-8
to ISO-8859-1 with no problem.
Has anyone seen a problem like this?
OS: Red Hat Enterprise Linux ES release 4 (Nahant Update 4)
ICONV: http://ftp.gnu.org/pub/gnu/libiconv/libiconv-1.12.tar.gz
> familiarity at this point) figure out what is going. It is a patch to
> msEncodeHTMLEntities to encode these characters. As entities they will work
> as the browser won't care what encoding it thinks the page should be in.
>
> I added the patch here: http://trac.osgeo.org/mapserver/ticket/2842
Turns out that while this solves my getFeatureInfo problem, it
introduces more problems. The various .map files have strings in them
that are UTF-8 encoded. In this situation we have words like
"générale" coming out as "gÃ©nÃ©rale" when we do a
request=getcapabilities
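
This kind of corruption typically appears when bytes that are already UTF-8 are treated as Latin-1 and encoded a second time. The sketch below illustrates the effect with a plain Latin-1 to UTF-8 transcoder; latin1_to_utf8() is a hypothetical helper written for this illustration and is not the code path used by msEncodeHTMLEntities or the patch in ticket 2842.

```c
#include <stddef.h>

/* Hypothetical helper: transcode a NUL-terminated Latin-1 string to UTF-8.
 * 'out' must have room for 2 * strlen(in) + 1 bytes.
 * Returns the number of bytes written, not counting the final NUL. */
static size_t latin1_to_utf8(const char *in, char *out)
{
    const unsigned char *p = (const unsigned char *)in;
    size_t n = 0;

    for (; *p; p++) {
        if (*p < 0x80) {
            out[n++] = (char)*p;                    /* ASCII is unchanged */
        } else {
            out[n++] = (char)(0xC0 | (*p >> 6));    /* lead byte: C2 or C3 */
            out[n++] = (char)(0x80 | (*p & 0x3F));  /* continuation byte */
        }
    }
    out[n] = '\0';
    return n;
}
```

Given the Latin-1 byte E9 ("é"), the function writes C3 A9, which is correct UTF-8. Given input that is already UTF-8 (C3 A9), it writes C3 83 C2 A9, which a UTF-8 aware client displays as "Ã©".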
Seems things are never as simple as you first think.
--
Russell McOrmond, Internet Consultant: <http://www.flora.ca/>
Please help us tell the Canadian Parliament to protect our property
rights as owners of Information Technology. Sign the petition!
http://digital-copyright.ca/petition/ict/ http://KillBillC61.ca
"The government, lobbied by legacy copyright holders and hardware
manufacturers, can pry control over my camcorder, computer,
home theatre, or portable media player from my cold dead hands!"