[MAPSERVER-USERS] ASCII -> UTF-8 convert problems for importing (GIS) data (DBF standard encoding?)

Stefan Schwarzer stefan.schwarzer at grid.unep.ch
Fri Apr 25 01:51:35 PDT 2008


>>>> Hmm... I have a shapefile which has some unorthodox characters (Ç, ì, ...).
>>>> Now, when importing the file (via shp2pgsql) into postgres, it complains
>>>> about it not being UTF-8 (my database has that format).
>>>>
>>>> So, how can I convert either the dbf file, or at a later stage the
>>>> created text file, from (I guess) ASCII into UTF-8?
>>
> -W describes the input format.  If you use it, the output format will be
> UTF-8.  From the shp2pgsql(1) man page:
>
> You need to find out what the input data is encoded in.  A very likely
> candidate is ISO-8859-1 (aka Latin-1).
>
> Take a look at the actual hex values of some of the non-English characters.
> (I use hexl-mode in emacs to do this, but there are plenty of other ways.)
> In ISO-8859-1 you would expect:
> 0xC7 LATIN CAPITAL LETTER C WITH CEDILLA
> 0xEC LATIN SMALL LETTER I WITH GRAVE
>
> Do they match?  But this is still a bit of a guessing game, because you
> could find many matches and still not be right, e.g. ISO-8859-15 is very
> similar.  A better way would be to look at the documentation for your input
> data, or ask the provider of the data.
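
The hex check described above can also be done with a few lines of Python. This is only a rough sketch: the candidate list is an assumption (not exhaustive), and the helper names `decode_with` and `report` are made up for illustration. It tries the two byte values mentioned above against several encodings and then shows what the same characters look like once re-encoded as UTF-8.

```python
# Candidate encodings to try. This list is an assumption, not exhaustive.
CANDIDATES = ["ascii", "utf-8", "iso-8859-1", "iso-8859-15", "cp1252"]


def decode_with(raw, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


def report(raw):
    """Map each candidate encoding to its decoding of raw (None on failure)."""
    return {enc: decode_with(raw, enc) for enc in CANDIDATES}


# The two byte values from the message above.
sample = bytes([0xC7, 0xEC])
results = report(sample)
for enc, text in results.items():
    print(f"{enc:12} -> {text!r}")

# If ISO-8859-1 is the right guess, this is the UTF-8 form of the same text.
utf8_bytes = sample.decode("iso-8859-1").encode("utf-8")
print(utf8_bytes.hex())
```

Several single-byte encodings decode these two bytes to the same characters, which is exactly why a match is not proof. The failures for ascii and utf-8 do at least rule those two out.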

Thanks for the help. As you said, there is a lot of guessing
involved... But: is there any way to specify the encoding for DBF
files? When dealing with shapefiles, the dbf is either created more or
less manually in Excel, or via ESRI programs (or other tools). I haven't
seen any possibility to specify the encoding. So I guess that all of
them have the same encoding, no?
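
One place to look, as a hedged sketch rather than a definitive answer: the dBASE/FoxPro-style DBF header reserves the byte at offset 29 for a "language driver ID" (code page mark). Whether a given tool actually sets it is not guaranteed, and it may simply be left at 0x00. The small mapping table below is partial and illustrative, and the helper names (`read_ldid`, `guess_codec`, `make_demo_header`) are hypothetical. The demo header is synthetic, not taken from a real file.

```python
# Offset of the language driver ID byte in a dBASE/FoxPro-style DBF header.
LDID_OFFSET = 29

# Partial, illustrative mapping from language driver ID to a Python codec name.
# Assumption: real files may use other IDs, or leave the byte at 0x00.
LDID_TO_CODEC = {
    0x01: "cp437",   # US MS-DOS
    0x02: "cp850",   # International MS-DOS
    0x03: "cp1252",  # Windows ANSI
}


def read_ldid(header):
    """Return the language driver ID byte from the first 32 header bytes."""
    if len(header) < 32:
        raise ValueError("need at least the first 32 bytes of the DBF file")
    return header[LDID_OFFSET]


def guess_codec(header):
    """Return a codec name for the header's ID, or None if unset or unknown."""
    return LDID_TO_CODEC.get(read_ldid(header))


def make_demo_header(ldid):
    """Build a synthetic 32-byte header for demonstration only."""
    header = bytearray(32)
    header[0] = 0x03            # version byte: plain dBASE III-style table
    header[LDID_OFFSET] = ldid
    return bytes(header)


demo = make_demo_header(0x03)
print(hex(read_ldid(demo)), guess_codec(demo))

unset = make_demo_header(0x00)
print(hex(read_ldid(unset)), guess_codec(unset))
```

For a real file, the header would come from something like `open(path, "rb").read(32)`. A result of None means the header does not settle the question, and it is back to the documentation or the data provider.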

