[gdal-dev] Re: WFS and -where with non-ASCII characters
even.rouault at mines-paris.org
Tue Jan 3 09:31:22 EST 2012
Quoting Mateusz Łoskot <mateusz at loskot.net>:
> On 3 January 2012 13:07, Ari Jolma <ari.jolma at gmail.com> wrote:
> > On 01/03/2012 02:45 PM, Mateusz Łoskot wrote:
> >> On 3 January 2012 11:07, Jukka Rahkonen <jukka.rahkonen at mmmtike.fi> wrote:
> >>> I took the successful query sent by Ari from the TinyOWS log and copied
> >>> it literally into Windows and this way it works:
> >>> -where name='HÃ¤meenkylÃ¤'
> >> Windows Command Prompt can work with UTF-8 characters if you change
> >> codepage to UTF-8:
> >> 0) Open a new prompt (cmd.exe)
> >> 1) Change the font to Lucida Console
> >> 2) chcp 65001
> >> And OGR can consume filter without problems:
> >> -where "name=\"Hämeenkylä\""
> >> Note that the \"...\" quoting is needed to avoid confusing the OGR SQL
> >> compiler; otherwise the value Hämeenkylä
> >> will be parsed as OGR SQL type SNT_COLUMN instead of SNT_CONSTANT for
> >> the field value.
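A minimal sketch of the encoding issue discussed above, for illustration only. It assumes the console was using a Windows-1252-like single-byte code page before `chcp 65001` (the thread does not say which one), and shows how the UTF-8 bytes of the place name look when read under that code page. This may be why the string copied from the TinyOWS log appears with "Ã¤" in place of "ä".

```python
name = "Hämeenkylä"

# Bytes a UTF-8 client would send for the name.
utf8_bytes = name.encode("utf-8")

# Assumption: the bytes are displayed under cp1252. Each 2-byte "ä"
# (0xC3 0xA4) then shows up as the two characters "Ã" and "¤".
misread = utf8_bytes.decode("cp1252")

# The misread string still carries the same bytes, so re-encoding it
# with cp1252 and decoding as UTF-8 restores the original name.
restored = misread.encode("cp1252").decode("utf-8")
```

With the console switched to code page 65001, the typed characters are already UTF-8, so no such detour is needed.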
> > Is that really so?
> I have checked the two variants under a debugger and that's what I see,
> assuming I am looking at the right place.
> > At least in PostgreSQL " and ' have different uses. " is
> > used for column names, which are not all lowercase and without special
> > characters and ' is used for string constants (as in this case).
> Perhaps the parser gets confused by extended ASCII or non-ASCII characters,
> and then the meaning of " and ' is affected.
The OGR SQL dialect allows " and ' to be used interchangeably for string literals.
However, the SQL standard (or at least the implementations I'm familiar with, like
sqlite and postgresql) only uses ' for string literals and " for column/table names.
I'd discourage anyone from using " for string literals with OGR SQL, because
ultimately it would be good to be stricter in order to be able to distinguish
column names that are quoted because they contain a special character from
string literals. Currently the 2 following tests would be interpreted the same way:
1) a_column = "a column with &( weird characters"
2) a_column = 'a literal with &( weird characters'
That is to say that in case 1) we would consider the right part of the
comparison as a literal value and not a column name.
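For comparison, here is a small sketch of the stricter behaviour using Python's built-in sqlite3 module rather than OGR SQL. The table `t` and the column named `odd &( name` are hypothetical and exist only for this illustration. Because a column with that name exists, SQLite resolves the double-quoted token to the column, while the single-quoted token is a string literal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one plain column and one whose name needs quoting.
conn.execute('CREATE TABLE t (a_column TEXT, "odd &( name" TEXT)')
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        ("odd &( name", "x"),  # a_column equals the literal text
        ("same", "same"),      # a_column equals the other column's value
    ],
)

# Double quotes: the right-hand side is the column named "odd &( name".
as_identifier = conn.execute(
    'SELECT a_column FROM t WHERE a_column = "odd &( name"'
).fetchall()

# Single quotes: the right-hand side is a string literal.
as_literal = conn.execute(
    "SELECT a_column FROM t WHERE a_column = 'odd &( name'"
).fetchall()

conn.close()
```

The two queries select different rows here, whereas current OGR SQL would treat both right-hand sides as the same literal.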
Some time ago I created a patch to start implementing that stricter mode (
http://trac.osgeo.org/gdal/ticket/4280 ), but it can break existing uses of OGR,
so it is perhaps material for GDAL 2.0.
> Best regards,
> Mateusz Loskot, http://mateusz.loskot.net