[postgis-users] Mixed UTF-8 and LATIN1 in one Database

Paul Ramsey pramsey at opengeo.org
Thu Jul 7 18:29:20 PDT 2011


The database has one encoding (UTF-8 by default).

Choosing the shapeloader encoding allows the loader to transcode the
input data (LATIN1) into the database encoding (UTF-8).

If it helps you, as a GIS guy, think of encodings as map projections.
Just as the coordinates of a file do not make sense without knowing
the map projection, so the bytes in a string do not make sense without
knowing the encoding. (Is 145 an 'e aigu' or an 'e grave'?)
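
To make the analogy concrete, here is a minimal Python sketch (Python is
an assumption, it has nothing to do with the loader itself). It uses byte
0xE9 rather than 145, only because its meaning in a few common encodings
is easy to verify: the same byte reads as a different character, or as
no valid character at all, depending on the encoding you assume.

```python
# One byte, three readings: the value alone does not say which character it is.
raw = bytes([0xE9])

as_latin1 = raw.decode("latin-1")  # e-acute in LATIN1 (ISO-8859-1)
as_cp437 = raw.decode("cp437")     # Greek capital theta in DOS code page 437

try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    # A lone 0xE9 is the start of a multi-byte UTF-8 sequence with no
    # continuation bytes, so it is not valid UTF-8 by itself.
    valid_utf8 = False

print(repr(as_latin1), repr(as_cp437), valid_utf8)
```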

The end result of having a database declared in UTF8 and selecting
your input encoding as LATIN1 (so the loader can transcode into UTF8)
is a database full of UTF8 strings. All good news.
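
As a rough illustration of what "transcode" means here (a Python sketch,
not the loader's actual code; the function name latin1_to_utf8 is made
up for this example): decode the bytes using the declared input encoding,
then re-encode them as UTF-8. Every one of the 256 LATIN1 byte values
maps to a Unicode character, so nothing is lost in this direction.

```python
def latin1_to_utf8(raw: bytes) -> bytes:
    """Reinterpret LATIN1 bytes as text, then store that text as UTF-8."""
    return raw.decode("latin-1").encode("utf-8")


# "Montreal" with an e-acute, as it might sit in a LATIN1-encoded DBF file.
latin1_bytes = "Montr\u00e9al".encode("latin-1")  # b'Montr\xe9al'
utf8_bytes = latin1_to_utf8(latin1_bytes)         # b'Montr\xc3\xa9al'

# Round trip over every possible byte value shows the conversion is lossless.
all_bytes = bytes(range(256))
round_trip = latin1_to_utf8(all_bytes).decode("utf-8").encode("latin-1")

print(utf8_bytes, round_trip == all_bytes)
```

The e-acute grows from one byte to two, but it is still the same
character once it is in the database.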

P.

On Thu, Jul 7, 2011 at 5:52 PM, Samuel Smith <samuel at groundlevel.ca> wrote:
> Hey Listers,
>
>
>
> I’ve run into this ~issue~ times now but haven’t bothered to investigate its
> implications until now.
>
>
>
> When importing SHP into PostGIS (using pgShapeLoader) the import has asked
> me to use LATIN1 encoding instead of the default UTF8. It’s easy to switch
> in the Options dialog and the subsequent import(s) work.
>
>
>
> So …
>
> * Has this re-encoded my LATIN1 data into UTF8? or
> * Do I now have mixed encodings within one DB?
> * (Yes I believe in magic and Santana)
>
> And …
>
> * Do I need to worry about losing characters in a re-encoding? or
> * Specially handling mixed encodings, if this is even possible? or
> * Is it all going to work out in the end?
>
>
>
> Cheers, Sam
