[gdal-dev] Re: WFS and -where with non-ASCII characters

Even Rouault even.rouault at mines-paris.org
Tue Jan 3 09:43:53 EST 2012


Quoting Rahkonen Jukka <Jukka.Rahkonen at mmmtike.fi>:

>
> Mateusz Łoskot wrote:
>
> > Jukka Rahkonen wrote:
> > > I took the successful query sent by Ari from the TinyOWS log and
> > > copied it literally into Windows and this way it works:
> > >
> > > -where name='Hämeenkylä'
> >
> > Windows Command Prompt can work with UTF-8 characters if you change
> > codepage to UTF-8:
> >
> > 0) Open new prompt (cmd.exe)
> > 1) Change font to Lucida Console
> > 2) chcp 65001
> >
> > And OGR can consume filter without problems:
> >
> > -where "name=\"Hämeenkylä\""
> >
> > Note, the \"\" is needed so as not to confuse the OGR SQL compiler;
> > otherwise the value Hämeenkylä
> > will be parsed as OGR SQL type SNT_COLUMN instead of SNT_CONSTANT for
> > a field value.
> >
> > However, I think the problem may be with TinyOWS. It throws an error:
> >
> > <ows:ExceptionText>QUERY_STRING contains forbidden
> > characters</ows:ExceptionText>
> >
> > which is generated by TinyOWS:
> >
> > http://www.tinyows.org/trac/browser/trunk/src/struct/cgi_request.c?rev=525#L208
> >
> > where TinyOWS simply tests characters passed in request against fixed
> > range: A-Za-zà-ÿ
> > Comparing extended ASCII codes, the value 'ä' is outside of
> > this range anyway.
> >
> > I get neither a WFS exception nor an OGR error when querying with
> > some (not all) Polish diacritics:
> >
> > ogrinfo WFS:http://hip.latuviitta.org/cgi-bin/tinyows
> > lv:pks_tilastoalue_piste -where "name=\"ąęśćł\""
> >
> > Certainly, it gives empty resultset.
> >
> > I think it would be a good idea to try against different WFS server.
>
> I followed your example but changing the font and chcp 65001 did not
> actually change anything as far as I can see. OGR may consume
> -where "name=\"Hämeenkylä\"" OK, but as you said, TinyOWS denies it.
> However,  -where name='Hämeenkylä' gives correct result. But
> it gave correct result even before changing the font and codepage.
>
> TinyOWS log shows your -where "name=\"ąęśćł\"" like "aescl" but I am not
> sure if the characters have changed or if my console just shows them
> as ASCII characters.
>
> Mapserver behaves also as it did before. My codepage is now 65001 and
> -where "name=\"Hämeenkylä\"" gives http 500 error while
> -where name='Hämeenkylä' gives correct result.

Yes, your observation confirms my own limited testing. Mateusz's trick with chcp
does fix the display of UTF-8 characters in the console, but when I enter an
accented character, the command line utilities still receive it as Latin1.
Note: I'm on Windows XP.

I've verified it with a trivial program compiled with MSVC:

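#include <stdio.h>
#include <string.h>

/* Print the length, in bytes, of the first command-line argument. */
int main(int argc, char* argv[])
{
   if (argc < 2)
      return 1;
   printf("%d\n", (int)strlen(argv[1]));
   return 0;
}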

If I try "test éven", it prints 4, whereas it would print 5 if the argument
were really UTF-8 (é takes two bytes in UTF-8 but only one in Latin1).

>
> -Jukka Rahkonen-
>
>
>
>



