[mapserver-users] Encoding issues

Tamas Szekeres szekerest at gmail.com
Wed Mar 4 17:24:42 EST 2009


Hi,

I don't know much about the Hindi character sets.
I guess you could extend that byte-array-to-string copy function to handle
wider characters; for double-byte characters, something like:

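// Combine each byte pair (low byte first) into one char; the bound
// i + 1 < bytes.Length avoids reading past the end of an odd-length array.
for (int i = 0; i + 1 < bytes.Length; i += 2)
    s.Append(Convert.ToChar(bytes[i] + 256 * bytes[i + 1]));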
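
For the opposite direction (reading shape.values), the byte dump quoted
below looks consistent with the UTF-8 bytes having been decoded as
Windows-1252 during marshaling: for example, UTF-8 byte 149 shows up as
34,32 (U+2022), which is the character code page 1252 assigns to 0x95.
A minimal sketch under that assumption follows. The class and method names
are made up for illustration, and the RegisterProvider call is only needed
on .NET Core / .NET 5+ (on the classic .NET Framework, code page 1252 is
available without it):

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of mapscript.
static class Utf8Recovery
{
    static readonly Encoding Ansi;

    static Utf8Recovery()
    {
        // Needed on .NET Core / .NET 5+ only; drop this line on .NET Framework.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        // Assumption: the marshaler used Windows-1252 as the ANSI code page.
        Ansi = Encoding.GetEncoding(1252);
    }

    // Simulates the suspected damage: UTF-8 bytes decoded as Windows-1252.
    public static string Garble(string text)
    {
        return Ansi.GetString(Encoding.UTF8.GetBytes(text));
    }

    // Undoes it: map each char back to its original byte, then decode as UTF-8.
    public static string Recover(string garbled)
    {
        byte[] raw = Ansi.GetBytes(garbled);
        return Encoding.UTF8.GetString(raw);
    }

    static void Main()
    {
        string original = "कीचनर";
        string garbled = Garble(original);
        // Compare this dump with the 'bytes' array quoted in the thread.
        Console.WriteLine(string.Join(",", Encoding.Unicode.GetBytes(garbled)));
        Console.WriteLine(Recover(garbled) == original);
    }
}
```

With a real feature, the call would be Utf8Recovery.Recover(shape.values[1]).
If the machine's ANSI code page is not 1252, the number passed to
GetEncoding would have to change accordingly.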

Best regards,

Tamas



2009/3/4 Murty Maganti <MMaganti at oriongis.com>

>  Hi Tamas
>
>
>
> This is still not working for some of the Asian languages.
>
>
>
> I suspect the issue could be in this line of your sample code below
>
> s.Append(Convert.ToChar(bytes[i]));
>
>
>
> Here, a single byte is converted to a character. But my
> understanding is that UTF-8 can use from 1 to 4 bytes to represent one
> character code point. It worked fine for Arabic, maybe because all Arabic
> characters can be represented using a single byte.
>
>
>
> When I tried the same code below with Hindi, an Indian language, some of
> the characters are shown as junk (but not all characters). I guess the
> characters that occupy more than one byte turned out to be junk.
>
>
>
> I am also trying the opposite of the sample code below, i.e. reading a field
> value from MapServer (shapeObj.values), which is in Hindi, and showing it on a
> web page. Again it turns out to be junk. I tried to look at the byte values
> of the string in VS by using
>
>
>
> Byte[] bites = Encoding.Unicode.GetBytes(shapeObj.values[0])
>
>
>
> I notice that they are actually UTF-8 byte values but interpreted as
> UTF-16, which may be the reason I see junk characters on the web page. But I
> don't know how to extract those UTF-8 byte values from UTF-16.
>
>
>
> I am just giving sample code here to explain
>
>
>
>                 byte[] utf16 = Encoding.Unicode.GetBytes("कीचनर"); //The
> text is in Hindi, an Indian language
>
>                 byte[] utf8 = Encoding.UTF8.GetBytes("कीचनर");
>
>
>
>                 shapeObj shape = layer.getFeature(result.shapeindex,
> result.tileindex);
>
>                 string value = shape.values[1]; //This contains the same
> text (in Hindi) as above in the shape file.
>
>
>
>                 byte[] bytes = Encoding.Unicode.GetBytes(value); //These
> are the UTF-16 byte values of the characters. .NET internally stores
> all strings in UTF-16
>
>
>
> Now if I examine the values of ‘utf8’ and ‘bytes’ arrays
>
>
>
> utf8 – 224,164,149,224,165,128,224,164,154,224,164,168,224,164,176
>
> bytes – *224*,0,*164*,0,34,32,*224*,0,*165*,0,172,32,*224*,0,*164*,0,97,1,
> *224*,0,*164*,0,*168*,0,*224*,0,*164*,0,*176*,0
>
> utf16 – 21,9,64,9,26,9,40,9,48,9
>
>
>
> The first byte value is the same as in UTF-8. The second byte value is 0, as
> UTF-16 takes at least 2 bytes for a character. This gives me the impression
> that the byte values are in UTF-8 and are not converted to UTF-16 by .NET.
>
>
>
> I would appreciate it if you could let me know of any solution for this.
>
>
>
> Thanks
>
> Murty
>
>  *From:* Tamas Szekeres [mailto:szekerest at gmail.com]
> *Sent:* Friday, February 06, 2009 6:59 PM
>
> *To:* Murty Maganti
> *Cc:* mapserver-users at lists.osgeo.org
> *Subject:* Re: [mapserver-users] Encoding issues
>
>
>
> You might have to make an explicit conversion manually, something like:
>
>             string value = "لققافعععىىةةونه"; //I actually get this (in
> arabic) through user input
>             byte[] bytes = Encoding.Convert(Encoding.Unicode,
> Encoding.GetEncoding(1256), Encoding.Unicode.GetBytes(value));
>             StringBuilder s = new StringBuilder();
>             for (int i = 0; i < bytes.Length; i++)
>                 s.Append(Convert.ToChar(bytes[i]));
>             shpObj.text = s.ToString();
>
> Best regards,
>
> Tamas
>
>
>  2009/2/6 Murty Maganti <MMaganti at oriongis.com>
>
> Hi
>
>
>
> I am doing a simple thing. I have a map file and am trying to show some static
> text in Arabic on the map. You can try this with any map file, as it has
> nothing to do with the layers of the map.
>
>
>
> At run time (like on a button click), please add this
>
>
>
>                 layerObj lyr = new layerObj(mapObj);
>
>                 lyr.name = "TextAcetate";
>
>                 lyr.status = mapscript.MS_ON;
>
>                 lyr.type = MS_LAYER_TYPE.MS_LAYER_ANNOTATION;
>
>                 lyr.labelcache = mapscript.MS_TRUE;
>
>
>
>                 double locationX = 50;
>
>                 double locationY = 50;
>
>
>
> lyr.transform = (int)mapscript.MS_FALSE;
>
>
>
> classObj layerClass = new classObj(lyr);
>
>
>
> //All label properties
>
> layerClass.label.size = 15;
>
> layerClass.label.type = MS_FONT_TYPE.MS_TRUETYPE;
>
> layerClass.label.encoding = "CP1256";
>
>
>
>
>
>                 shapeObj shpObj = new shapeObj((int)MS_SHAPE_TYPE
> .MS_SHAPE_POINT);
>
>                 lineObj lnObj = new lineObj();
>
>
>
>                 pointObj pt = new pointObj(locationX, locationY, 0, 0);
>
>                 lnObj.add(pt);
>
>
>
>                 shpObj.add(lnObj);
>
>
>
>                 shpObj.text = "لققافعععىىةةونه"; //I actually get this (in
> arabic) through user input
>
>
>
>                 lyr.addFeature(shpObj);
>
>
>
> mapObj.draw(); //Onto a picture box or save as file
>
>
>
> (In the map file, my output format is set to GD/PNG)
>
>
>
> Please let me know if you need more information.
>
>
>
> Thanks
>
> Murty
>
>
>
>
>
> *From:* mapserver-users-bounces at lists.osgeo.org [mailto:
> mapserver-users-bounces at lists.osgeo.org] *On Behalf Of *Tamas Szekeres
> *Sent:* Friday, February 06, 2009 4:12 PM
>
>
> *To:* Murty Maganti
> *Cc:* mapserver-users at lists.osgeo.org
> *Subject:* Re: [mapserver-users] Encoding issues
>
>
>
> Please send me your example so that I could examine what's going on.
>
> Best regards,
>
> Tamas
>
>  2009/2/6 Murty Maganti <MMaganti at oriongis.com>
>
> Hi
>
>
>
> I tried with the suggested encoding but still no success.
>
> From the output below, I guess ICONV support is included.
>
>
>
> E:\Utils\MapServer\Map Server 5.2 RC\ms4w\Apache\cgi-bin>mapserv -v
>
> MapServer version 5.2.0 OUTPUT=GIF OUTPUT=PNG OUTPUT=JPEG OUTPUT=WBMP
> OUTPUT=PDF OUTPUT=SWF OUTPUT=SVG SUPPORTS=PROJ SUPPORTS=AGG
> SUPPORTS=FREETYPE *SUPPORTS=ICONV* SUPPORTS=FRIBIDI SUPPORTS=WMS_SERVER
> SUPPORTS=WMS_CLIENT SUPPORTS=WFS_SERVER SUPPORTS=WFS_CLIENT
> SUPPORTS=WCS_SERVER SUPPORTS=SOS_SERVER SUPPORTS=FASTCGI SUPPORTS=THREADS
> SUPPORTS=GEOS SUPPORTS=RGBA_PNG INPUT=JPEG INPUT=POSTGIS INPUT=OGR
> INPUT=GDAL INPUT=SHAPEFILE
>
>
>
> Where can I get some details on how to build the C# mapscript (managed
> assembly only) from Visual Studio, keeping all unmanaged DLLs from the ms4w
> binaries? I just want to give MarshalAsAttribute a try.
>
>
>
> Thanks
>
> Murty
>
> *From:* Tamas Szekeres [mailto:szekerest at gmail.com]
> *Sent:* Friday, February 06, 2009 3:02 PM
> *To:* Murty Maganti
> *Cc:* mapserver-users at lists.osgeo.org
> *Subject:* Re: [mapserver-users] Encoding issues
>
>
>
> Hi,
>
> You might want to try with encoding="ISO-8859-6", assuming you have libiconv
> compiled in.
> The C# mapscript doesn't specify an explicit conversion during the marshaling.
> In this case I assume a Unicode to CharSet.Ansi conversion
> automatically takes place by default.
>
> Best regards,
>
> Tamas
>
>
>  2009/2/6 Murty Maganti <MMaganti at oriongis.com>
>
> Hello
>
>
>
> I am having some issues using Arabic text as labels. I am using the C#
> mapscript. I am setting the following at run time
>
>
>
> labelObj label = classObj.label;
>
> label.encoding = "CP1256";
>
> label.text = "some text in Arabic"; (At run time in VS, I can see the text
> is actually in Arabic)
>
>
>
> But labels are displayed as '?????'.
>
>
>
> Is there any conversion I need to do before setting the text value? How
> are strings represented in the underlying mapscript dll (ASCII or
> Unicode)? As I was reading in the MSDN, the default marshalling uses LPStr,
> which is a single-byte (ANSI) string. Does it mean that I first need to
> convert from Unicode to ASCII in C# before setting the value?
>
>
>
> Appreciate any help.
>
>
>
> Thanks
>
> Murty
>
>
>
>
> _______________________________________________
> mapserver-users mailing list
> mapserver-users at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/mapserver-users