[mapserver-users] Arabic text help - some words not rendering correctly

Stephen Woodbridge woodbri at swoodbridge.com
Wed Jan 5 16:34:30 EST 2011


On 1/5/2011 12:48 PM, Ian Walberg wrote:
> Hello list and Happy New Year,
>
> We are seeing incorrectly rendered Arabic text, typically a 'square'
> displayed instead of one of the characters or sometimes one or more
> characters missing.
>
> The data is coming from shape files and most of the names appear to be
> displayed correctly.
>
> This is being seen on both ms4w and the target Linux installation.
>
> Any idea where to look would be greatly appreciated. Is the 'square'
> character what Freetype displays if it cannot find a character in the
> font?

Hi Ian,

The square that is displayed is the default glyph used when a requested
character cannot be found in the font being used.

I looked at the sample images that you emailed me directly and here are
my thoughts on the problem:

1. encoding is ok, because 99.9% of the characters are displayed correctly

Possible causes for what you are seeing:

2. you have random characters that are not valid utf8 characters - but
this is probably refuted by the fact that in other programs the text is
displayed correctly.

One way to easily verify this is to do a dbfdump of the dbf file in the
shapefile like:

   dbfdump myshapefile.dbf > myshapefile.txt

Then view myshapefile.txt in firefox where you can change the character
encoding from the menu:

View -> Character Encoding -> ...

You can change it until you find the one that looks correct. This is
also a good way to determine the ENCODING value for the mapfile if you
are unsure. Also, if your data is mangled because it has mixed
encodings within an attribute column or the data was encoded badly, it
will most likely look like garbage.

3. the font you are using with mapserver is not the same font used by
other programs. Hence the other programs have the glyph in their font
and mapserver does not.

4. typically in utf8 text, only the "content" characters are stored and
it is up to the rendering program to determine whether to render
left-to-right or right-to-left and whether or not some adjacent
characters need to be joined in some manner. This joining process is
typically different in different applications. So while mapserver might
be failing in this regard (more on that below), maybe your other
application is doing a better job.
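
If you would rather script the check for cause 2 than eyeball it in a
browser, something like the following Python sketch may help. It is only
an illustration: the function names and the list of candidate encodings
(utf-8, cp1256, iso-8859-6) are my assumptions, not anything mapserver
provides, and you would feed it the raw bytes of the dbfdump output.

```python
def find_invalid_utf8(data: bytes):
    """Return (offset, byte) pairs for bytes that cannot be decoded as UTF-8."""
    bad = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break  # the rest of the data is valid UTF-8
        except UnicodeDecodeError as err:
            # err.start is relative to the slice, so add pos back on
            bad.append((pos + err.start, data[pos + err.start]))
            pos += err.start + 1  # skip the offending byte and keep scanning
    return bad


def try_encodings(data: bytes, candidates=("utf-8", "cp1256", "iso-8859-6")):
    """Decode data with each candidate encoding; None marks a failed decode."""
    results = {}
    for name in candidates:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results
```

An empty list from find_invalid_utf8 suggests the bytes are at least
well-formed utf8. A successful decode in try_encodings does not prove the
encoding is right, so you still need to look at the decoded text.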

So my pick for the likely issue here is 3 or, more likely, 4. Mapserver
uses the fribidi library to convert the utf8 "content" string into a
utf8 "display" string, then asks freetype to render the "display" string
using your font. The difference between the "content" string and the
"display" string is that fribidi deals with the positioning of the
characters for both RTL and LTR rendering, and it adds joining
characters that must be present in the font that you use to display the
string.
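
To make the "content" versus "display" distinction concrete, here is a
small Python sketch. It assumes that the shaped "display" string uses
code points from the Unicode Arabic Presentation Forms blocks
(U+FB50-U+FDFF and U+FE70-U+FEFF); I have not confirmed that this is what
your build produces, so treat it as an illustration. The function name is
made up for the example. It lists any such code points in a string
together with the base letter(s) that Unicode compatibility normalization
maps them back to.

```python
import unicodedata


def presentation_forms(display: str):
    """List Arabic presentation-form code points found in a string."""
    found = []
    for ch in display:
        cp = ord(ch)
        # Arabic Presentation Forms-A and Forms-B block ranges
        if 0xFB50 <= cp <= 0xFDFF or 0xFE70 <= cp <= 0xFEFF:
            # NFKC folds a presentation form back to its base letter(s)
            base = unicodedata.normalize("NFKC", ch)
            found.append((f"U+{cp:04X}", unicodedata.name(ch, "UNNAMED"), base))
    return found
```

The point is that an initial-form BEH (U+FE91) is a different code point
from the plain BEH (U+0628) stored in the data, so a font can cover one
and not the other.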

It is likely that whatever your comparison application is, it is not
using fribidi.

So the fact that mapserver is rendering most of your data correctly
means that you have things built correctly and your data and mapfile are
set up correctly. I think you will need to chase down this problem on the
fribidi list. If the fix is a patch to the library, then the MS4W team
can pick that up if they know about it, and anyone on *NIX can pull the
new package and build it.

The other fix to this problem, again the fribidi list can probably help, 
is to identify which glyph is missing and try to find someone that can 
add a glyph for the missing character to your font.
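
Once you know which code points your font covers, narrowing down the
missing glyph is simple set arithmetic. The Python sketch below assumes
you already have that coverage as a set of integers (for example, dumped
from the font's character map with a font inspection tool); both the
function name and the `covered` parameter are hypothetical and only serve
the illustration.

```python
import unicodedata


def missing_glyphs(text: str, covered: set):
    """Return (code point, name) for characters in text not in covered."""
    # Whitespace is skipped because it does not need a visible glyph.
    missing = {ch for ch in text if ord(ch) not in covered and not ch.isspace()}
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "UNNAMED"))
        for ch in sorted(missing)
    ]
```

Run it on a label that shows the square box, ideally on the "display"
form of the string, and the result is the list of characters to ask the
font maintainer about.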

As far as I can tell this is not a mapserver problem per se, except that
it is annoyingly only showing its ugly square box on the maps we render ;)

I hope this helps you solve this problem.

-Steve W

