[mapserver-users] Arabic text help - some words not rendering correctly
Stephen Woodbridge
woodbri at swoodbridge.com
Wed Jan 5 13:34:30 PST 2011
On 1/5/2011 12:48 PM, Ian Walberg wrote:
> Hello list and Happy New Year,
>
> We are seeing incorrectly rendered Arabic text, typically a 'square'
> displayed instead of one of the characters or sometimes one or more
> characters missing.
>
> The data is coming from shape files and most the names appear to be
> displayed correctly.
>
> This is being seen on both ms4w and the target Linux installation.
>
> Any idea where to look would be greatly appreciated. Is the 'square'
> character what Freetype displays if it cannot find a character in the
> font.
Hi Ian,
The square that is displayed is the default character glyph used when a
requested character can not be found in the character set being used.
I looked at the sample images that you emailed me directly, and here are
my thoughts on the problem:
1. encoding is ok, because 99.9% of the characters are displayed correctly
Possible causes for what you are seeing:
2. you have random characters that are not valid utf8 characters - but
this is probably refuted by the fact that in other programs it is
displayed correctly.
One way to easily verify this is to do a dbfdump of the dbf file in the
shapefile like:
dbfdump myshapfile.dbf > myshapfile.txt
Then view myshapfile.txt in firefox where you can change the character
encoding from the menu:
View -> Character Encoding -> ...
You can change it until you find the one that looks correct. This is
also a good way to determine the ENCODING value for the mapfile if you
are unsure. Also if your data is mangled because it has mixed character
data within an attribute column or the data was encoded badly it will
most likely look like garbage.
3. the font you are using with mapserver is not the same font used by
other programs. Hence the other programs have the glyph in their font
and mapserver does not.
4. typically in utf8 text, only the "content" characters are stored and
it is up to the rendering program to determine whether it is left-to-right
or right-to-left rendering and whether or not some adjacent characters need
to be joined in some manner. This joining process is typically different in
different applications. So while mapserver might be failing in this
regard (more on that below), maybe your other application is doing a
better job.
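If you would rather script the check from item 2 than click through the
browser menu, here is a minimal Python sketch. It assumes you already have
the dbfdump output as raw bytes; the list of candidate encodings and the
helper names (try_encodings, arabic_ratio) are my own choices for
illustration, not anything mapserver provides.

```python
# Candidate encodings are an assumption; adjust the list for your data.
CANDIDATES = ["utf-8", "cp1256", "iso-8859-6"]


def try_encodings(raw, candidates=CANDIDATES):
    """Return {encoding: decoded text, or None if the bytes are invalid}."""
    results = {}
    for name in candidates:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


def arabic_ratio(text):
    """Fraction of non-ASCII characters that fall in the basic Arabic block."""
    non_ascii = [c for c in text if ord(c) > 0x7F]
    if not non_ascii:
        return 0.0
    in_block = sum(1 for c in non_ascii if "\u0600" <= c <= "\u06FF")
    return in_block / len(non_ascii)


if __name__ == "__main__":
    # Sample Arabic word, written with escapes so the script itself is ASCII.
    sample = "\u0645\u062f\u064a\u0646\u0629".encode("utf-8")
    for enc, text in try_encodings(sample).items():
        if text is None:
            print(enc, "-> invalid byte sequence")
        else:
            print(enc, "-> decodes, Arabic ratio", round(arabic_ratio(text), 2))
```

To run it on real data, read the dump with open("myshapfile.txt", "rb").read()
and pass the bytes to try_encodings. An encoding that decodes without error
and gives a ratio near 1.0 is a reasonable guess for ENCODING, but it is only
a heuristic, so still eyeball the text.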
So my pick for the likely issue here is 3 or, more likely, 4. Mapserver
uses the fribidi library to convert the utf8 "content" string into a utf8
"display" string, then asks freetype to render the "display" string
using your font. The difference between the "content" string and the
"display" string is that fribidi is dealing with the positioning of
the characters for both RTL and LTR rendering, and it is adding joining
characters that must be present in the font that you use to display the
string.
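To make the "content" versus "display" distinction concrete, here is a small
Python sketch that uses only the standard unicodedata module. It assumes the
shaping step substitutes code points from the Unicode Arabic Presentation
Forms blocks, which is one common approach; I have not confirmed that this
is exactly what your fribidi build emits. The function name
presentation_forms is made up for this example.

```python
import unicodedata

# Arabic Presentation Forms-A and Forms-B blocks in Unicode.
PRESENTATION_RANGES = [(0xFB50, 0xFDFF), (0xFE70, 0xFEFF)]


def presentation_forms(letter):
    """Map a joining form name ('isolated', 'final', ...) to the code point
    that the Unicode database lists as that form of a single letter."""
    target = "%04X" % ord(letter)
    forms = {}
    for start, end in PRESENTATION_RANGES:
        for cp in range(start, end + 1):
            # Decomposition looks like "<final> 0628" for a contextual form.
            parts = unicodedata.decomposition(chr(cp)).split()
            if len(parts) == 2 and parts[0].startswith("<") and parts[1] == target:
                forms.setdefault(parts[0].strip("<>"), cp)
    return forms


if __name__ == "__main__":
    # U+0628 ARABIC LETTER BEH is the stored "content" character.
    for form, cp in sorted(presentation_forms("\u0628").items()):
        print("%-8s U+%04X %s" % (form, cp, unicodedata.name(chr(cp))))
```

The point is that one stored letter can turn into any of several different
code points at display time, and each of those needs its own glyph in the
font. A font can cover the base letter and still lack one of its joined forms.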
It is likely that whatever your comparison application is, it is not
using fribidi.
So the fact that mapserver is rendering most of your data correctly
means that you have things built correctly and your data and mapfile are
set up correctly. I think you will need to chase down this problem on the
fribidi list. If the issue leads to a patch to the library, then the MS4W
team can pick that up if they know about it, and anyone on *NIX can pull
the new package and build it.
The other fix to this problem, again the fribidi list can probably help,
is to identify which glyph is missing and try to find someone that can
add a glyph for the missing character to your font.
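As a rough way to identify the missing glyph, you could compare the code
points in a problem label against the code points your font covers. The
sketch below is plain Python; the function name missing_glyphs and the
sample coverage set are hypothetical. If you have the third-party fontTools
package installed, TTFont(path).getBestCmap() should give you a mapping
whose keys are the covered code points, and that can be passed in as the
covered argument.

```python
import unicodedata


def missing_glyphs(text, covered):
    """Return sorted (code point, name) pairs for the characters in text
    that are absent from covered (any container of integer code points)."""
    missing = {ord(ch) for ch in text
               if not ch.isspace() and ord(ch) not in covered}
    return [(cp, unicodedata.name(chr(cp), "<unnamed>"))
            for cp in sorted(missing)]


if __name__ == "__main__":
    # Hypothetical font coverage: basic Arabic letters only (U+0621..U+064A).
    covered = set(range(0x0621, 0x064B))
    # A label containing basic letters plus one presentation-form code point.
    label = "\u0645\u062f\u064a\u0646\u0629 \ufe91"
    for cp, name in missing_glyphs(label, covered):
        print("U+%04X %s" % (cp, name))
```

Keep in mind that you would want to run this on the "display" string, not
just the stored text, since the code points the renderer asks for may differ
from what is in the dbf file.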
As far as I can tell this is not a mapserver problem per se, except that
it is annoyingly only showing its ugly square box on the maps we render ;)
I hope this helps you solve this problem.
-Steve W