[MAPSERVER-USERS] ASCII -> UTF-8 convert problems for importing (GIS) data (DBF standard encoding?)
Stefan Schwarzer
stefan.schwarzer at grid.unep.ch
Fri Apr 25 01:51:35 PDT 2008
>>>> hmm.... I have a shapefile, which has some unorthodox characters
>>>> (Ç, ì, ...). Now, when importing the file (via shp2pgsql) into
>>>> postgres, it complains about it not being UTF-8 (my database has
>>>> that format).
>>>>
>>>> So, how can I convert either the dbf file or, at a later stage,
>>>> the created text file from (I guess) ASCII into UTF-8?
>>
> -W describes the input format. If you use it, the output format will
> be UTF-8. From the shp2pgsql(1) man page:
>
> You need to find out what the input data is encoded in. A very likely
> candidate is ISO-8859-1 (aka Latin-1).
>
> Take a look at the actual hex values of some of the non-English
> characters. (I use hexl-mode in emacs to do this, but there are
> plenty of other ways.)
> 0xC7 LATIN CAPITAL LETTER C WITH CEDILLA
> 0xEC LATIN SMALL LETTER I WITH GRAVE
>
> Do they match? But this is still a bit of a guessing game, because
> you could find many matches and still not be right, e.g. ISO-8859-15
> is very similar. A better way would be to look at the documentation
> for your input data, or ask the provider of the data.
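
For what it's worth, the byte check described above can also be done in
a few lines of Python. This is only a minimal sketch: the candidate list
and the function names (try_decodings, transcode) are my own
assumptions for illustration and are not part of shp2pgsql or any GIS
tool. It decodes the two sample bytes with each candidate encoding and,
once one has been chosen, re-encodes the bytes as UTF-8.

```python
# Candidate encodings are a guess; extend the list to suit the data.
CANDIDATES = ("ascii", "utf-8", "iso-8859-1", "iso-8859-15")


def try_decodings(raw, candidates=CANDIDATES):
    """Map each candidate encoding to the decoded text, or None if it fails."""
    results = {}
    for name in candidates:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


def transcode(raw, source_encoding):
    """Re-encode bytes from the assumed source encoding to UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


if __name__ == "__main__":
    sample = bytes([0xC7, 0xEC])  # the two byte values discussed above
    for name, text in try_decodings(sample).items():
        print(name, repr(text))
    print(transcode(sample, "iso-8859-1"))
```

Both ISO-8859-1 and ISO-8859-15 decode these two bytes to the same
characters, which illustrates why a match alone does not prove the
encoding. Whatever encoding is finally chosen is the one to describe
with -W.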
Thanks for the help. As you said, there is a lot of guessing
involved... But: is there any way to specify the encoding for DBF
files? When dealing with shapefiles, the dbf is created either more or
less manually in Excel, or via ESRI programs (or other tools). I
haven't seen any possibility to specify the encoding. So I guess that
all of them have the same encoding, no?