[gdal-dev] Does the OGR PostGIS driver support UTF-8 by default?

Even Rouault even.rouault at mines-paris.org
Tue May 5 14:44:54 EDT 2009


Matthieu,

You didn't include the full stack trace that you got, so this is just a guess.
But it looks more like a problem with your use of non-ASCII characters in
Python than an issue in GDAL itself or its support of UTF-8 in the
PostgreSQL driver. My knowledge of Python is rather weak, but I'd advise you
to avoid using non-ASCII characters directly in strings in your code, and
instead to encode them as hexadecimal escape sequences. For example,
'\xc3\xa9' is the UTF-8 encoding of the eacute character.
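
As a quick check of that claim, here is a minimal sketch that encodes the
eacute character (code point U+00E9) as UTF-8 and compares the result with the
two escaped bytes. The variable names are only illustrative, and the snippet
is written so that it should run unchanged on Python 2.6+ and Python 3.3+.

```python
# -*- coding: utf-8 -*-
# U+00E9 is the Unicode code point of the eacute character.
eacute = u'\u00e9'

# Encoding it as UTF-8 yields the two bytes 0xC3 0xA9.
utf8_bytes = eacute.encode('utf-8')

# Decoding the escaped byte sequence as UTF-8 gives the character back.
round_trip = b'\xc3\xa9'.decode('utf-8')

print(utf8_bytes == b'\xc3\xa9')
print(round_trip == eacute)
```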

Alternatively, you could refer to http://www.python.org/dev/peps/pep-0263/ to 
define the encoding of your Python source file.

For example, the following script will display two eacute characters, provided 
that your text editor is indeed using UTF-8 encoding.

#!/usr/bin/python
# -*- coding: utf-8 -*-
print '\xc3\xa9'
print 'é'
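
If the string handed to ExecuteSQL is a Unicode string, as described in the
original message, one possible workaround is to encode it explicitly as UTF-8
before the call. The sketch below is only an assumption about the cause (an
implicit conversion with the 'ascii' codec); the helper to_utf8 and the
shortened SQL statement are hypothetical, and no GDAL call is made.

```python
# -*- coding: utf-8 -*-
# Hypothetical, shortened SQL statement; \u00e9 is the eacute in "Ble tendre".
sql = u"INSERT INTO wheat09.fields VALUES ('Bl\u00e9 tendre')"


def to_utf8(text):
    """Return a Unicode string as a UTF-8 byte string (hypothetical helper)."""
    return text.encode('utf-8')


# The 'ascii' codec cannot represent eacute, so this conversion fails.
try:
    sql.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Explicit UTF-8 encoding succeeds and contains the bytes 0xC3 0xA9.
sql_utf8 = to_utf8(sql)

print(ascii_ok)
print(b'\xc3\xa9' in sql_utf8)
```

With the Python 2 bindings, sql_utf8 is the value one would then try passing
to ExecuteSQL on an opened datasource. This has not been verified against
GDAL 1.5.1.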

Best regards,

Even

On Tuesday 05 May 2009 13:07:47, Matthieu Rigal wrote:
> Hi all,
>
> I have a problem with using the ExecuteSQL command of ogr within Python...
> My version of GDAL/OGR is 1.5.1
>
> I first open the connection normally, without problem, with my UTF-8
> Database. I sent some insert commands with ascii values without problem,
> they are added and taken into account.
>
> BUT when I send the query beginning with :
> "INSERT INTO wheat09.fields VALUES
> ('687a86d7-8989-4572-a75b-d6b4e9a469b8', 'FORGERET 1', 2009, 0.00,
> 10.07, 'profond', 'Blé tendre', [...]"
> I have a crash in the ExecuteSQL function of ogr.py, line 343, that is
> expecting 'ascii', on the character 122 "é".
>
> The string given to the ExecuteSQL function is of type Unicode String.
>
> I don't want ExecuteSQL to expect ascii, but UTF-8. And from what I read
> from the GDAL homepage, it should be UTF-8....
>
>
> I could read in the drv_pg.html page that :
> "By default it is assumed that text being sent to Postgres is in the UTF-8
> encoding. This is fine for plain ASCII, but can result in errors for
> extended characters (ASCII 155+, LATIN1, etc). While OGR provides no direct
> control over this, you can set the PGCLIENTENCODING environment variable to
> indicate the format being provided. For instance, if your text is LATIN1
> you could set the environment variable to LATIN1 before using OGR and input
> would be assumed to be LATIN1 instead of UTF-8."
>
>
> Any suggestions or help is highly appreciated !
>
> Regards,
> Matthieu
