[gdal-dev] Does the OGR PostGIS driver support UTF-8 by default?

Even Rouault even.rouault at mines-paris.org
Wed May 6 14:59:44 EDT 2009


Matthieu,

the following script works like a charm for me:

#!/usr/bin/python
# -*- coding: utf-8 -*-
import ogr
ds = ogr.Open('PG:')
ds.ExecuteSQL("CREATE TABLE foo2 (col VARCHAR)")
ds.ExecuteSQL("INSERT INTO foo2 (col) VALUES ('é')")
ds.ExecuteSQL("DROP TABLE foo2")

You didn't specify which GDAL version you use, so if you use something older 
than GDAL 1.6, you could try upgrading.
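
On the unicode point in your message quoted below, here is a small sketch of
what I think is going on. It assumes that the bindings convert a unicode
object with the default 'ascii' codec (that is only my guess from the error
you reported, I have not checked ogr.py). The table name foo2 is just the one
from the script above.

```python
# -*- coding: utf-8 -*-
# A unicode SQL statement, as it would come out of an XML parser.
sql = u"INSERT INTO foo2 (col) VALUES ('\u00e9')"

# Assumption: the bindings convert unicode with the 'ascii' codec,
# which fails on the first non-ASCII character.
try:
    sql.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding explicitly gives the UTF-8 byte string that the Python 2
# bindings of that time can pass through unchanged to ExecuteSQL.
sql_bytes = sql.encode('utf-8')
print(ascii_ok)
```

The explicit encode('utf-8') is the same fix you applied in your script.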

About catching the errors:
- If the error comes from Python itself before reaching OGR, then you can
catch it with the normal Python mechanisms.
- If it really comes from inside OGR, errors are reported by default on the
standard error stream:
   * You can get the last error issued with gdal.GetLastErrorMsg().
   * You can also make GDAL quiet by using
gdal.PushErrorHandler('CPLQuietErrorHandler') / gdal.PopErrorHandler().
   * Otherwise, to be more Pythonic, you can add 'ogr.UseExceptions()' at
the beginning of the script, and that will (generally) raise a Python
exception each time an error occurs on the OGR side.

GDAL/OGR can also be more verbose if you compile it with debug support and 
define the CPL_DEBUG environment variable when running your application.
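
For instance, a sketch of setting CPL_DEBUG for a child process started from
Python. The one-line child program is only a stand-in for your own script,
and 'ON' is the value that turns on all debug messages.

```python
import os
import subprocess
import sys

# Copy the current environment and add CPL_DEBUG for the child only.
env = dict(os.environ, CPL_DEBUG='ON')

# Stand-in for your own OGR script: it just reports what it sees.
child = "import os; print(os.environ.get('CPL_DEBUG'))"
seen = subprocess.check_output([sys.executable, '-c', child], env=env)
seen = seen.decode().strip()
print(seen)
```

From a shell, the equivalent is to set CPL_DEBUG=ON before launching the
script.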

On Wednesday 06 May 2009 13:02:45, Matthieu Rigal wrote:
> Hi Even,
>
> You were actually right, it comes from my script.
> I already defined it the way you suggested, but I was reading values from an
> XML file defined as UTF-8 and the script was sending them as unicode strings
> to the ExecuteSQL command.
>
> Even if it is UTF-8, it seems that ExecuteSQL only handles plain (byte)
> strings. I just had to add:
> sSql = sSql.encode('utf8')
> to solve the problem.
>
> I don't know if it is a feature or a bug...
>
> Apart from that, when sending the commands through this function, I was
> disappointed that there is no way to catch its errors with try/except,
> and that even redirecting stdout or stderr in Python through sys.stdout
> and sys.stderr won't let me write the results of the SQL commands to a
> file...
>
> Best regards,
> Matthieu
>
> On Tuesday 05 May 2009 20:44:54 Even Rouault wrote:
> > Matthieu,
> >
> > You didn't include the full stack trace that you got, so it's just a
> > guess. But it looks more like a problem with your use of non-ASCII
> > characters in Python than an issue in GDAL itself and its support of
> > UTF-8 in the PostgreSQL driver. My knowledge of Python is rather weak,
> > but I'd advise you to avoid using non-ASCII characters directly in
> > strings in your code, and rather to encode them as hexadecimal sequences.
> > For example, '\xc3\xa9' is the UTF-8 encoding of the eacute character.
> >
> > Alternatively, you could refer to
> > http://www.python.org/dev/peps/pep-0263/ to define the encoding of your
> > Python source file.
> >
> > For example, the following script will display two eacute characters,
> > provided that your text editor is indeed using UTF-8 encoding.
> >
> > #!/usr/bin/python
> > # -*- coding: utf-8 -*-
> > print '\xc3\xa9'
> > print 'é'
> >
> > Best regards,
> >
> > Even
> >
> > On Tuesday 05 May 2009 13:07:47, Matthieu Rigal wrote:
> > > Hi all,
> > >
> > > I have a problem with using the ExecuteSQL command of ogr within
> > > Python... My version of GDAL/OGR is 1.5.1
> > >
> > > I first open the connection normally, without problem, to my UTF-8
> > > database. I sent some insert commands with ASCII values without
> > > problem; they are added and taken into account.
> > >
> > > BUT when I send the query beginning with :
> > > "INSERT INTO wheat09.fields VALUES
> > > ('687a86d7-8989-4572-a75b-d6b4e9a469b8', 'FORGERET 1', 2009, 0.00,
> > > 10.07, 'profond', 'Blé tendre', [...]"
> > > I get a crash in the ExecuteSQL function of ogr.py, line 343, which is
> > > expecting 'ascii', at character 122 ("é").
> > >
> > > The string given to the ExecuteSQL function is of type Unicode String.
> > >
> > > I don't want ExecuteSQL to expect ascii, but UTF-8. And from what I
> > > read from the GDAL homepage, it should be UTF-8....
> > >
> > >
> > > I could read in the drv_pg.html page that :
> > > "By default it is assumed that text being sent to Postgres is in the
> > > UTF-8 encoding. This is fine for plain ASCII, but can result in errors
> > > for extended characters (ASCII 155+, LATIN1, etc). While OGR provides
> > > no direct control over this, you can set the PGCLIENTENCODING
> > > environment variable to indicate the format being provided. For
> > > instance, if your text is LATIN1 you could set the environment variable
> > > to LATIN1 before using OGR and input would be assumed to be LATIN1
> > > instead of UTF-8."
> > >
> > >
> > > Any suggestions or help is highly appreciated !
> > >
> > > Regards,
> > > Matthieu



