Java Mapscript - querybyattribut

Umberto Nicoletti umberto.nicoletti at GMAIL.COM
Fri Apr 21 10:59:02 EDT 2006


This issue will be tracked by bugzilla issue 1753:

http://mapserver.gis.umn.edu/bugs/show_bug.cgi?id=1753

I have already posted a proposed patch, which needs extensive testing.
The patch goes some way in the direction pointed out by Benedikt
(thanks!), but is more flexible, as it can use encodings other than
iso-8859 (our friends in Japan will be happy, but they need to test
it before they smile and say thanks). Which encoding is used depends
on your environment (LANG and its LC_* friends for UNIX folks).

A not-so-short note on the dbf files we are using to test: they were
probably created as iso-8859-15 and cannot be used in utf-8 mode.
Please note that this is *not* a problem with mapserver, but rather
with the dbf file: in utf-8 mode glibc attempts to handle ALL
characters (including strings in the dbf files) as unicode, and that
is why matching fails. I attach a UTF8 version of subset.dbf that I
created for my own testing, so that you can verify whether I am
barking up the right tree.
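
The mismatch described above can be seen at the byte level. What
follows is a minimal Java sketch, not part of the patch or of
mapserver; the class name EncodingMismatch and its helper methods are
made up for illustration. It uses ISO-8859-1 in place of iso-8859-15,
which is safe here because both assign the same byte values to ü and
ß. The same attribute value yields different byte sequences in the two
encodings, so a byte-wise match between a UTF-8 pattern and an
iso-8859 string from the dbf cannot succeed.

```java
import java.nio.charset.StandardCharsets;

public class EncodingMismatch {

    // Encode as ISO-8859-1: one byte per character for this text.
    public static byte[] latin1Bytes(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Encode as UTF-8: non-ASCII characters such as ü and ß take two bytes each.
    public static byte[] utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Render bytes as space-separated lowercase hex for inspection.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Unicode escapes keep the example independent of the source file encoding.
        String name = "S\u00fcdliche Weinstra\u00dfe";

        byte[] latin1 = latin1Bytes(name);
        byte[] utf8 = utf8Bytes(name);

        System.out.println("ISO-8859-1: " + latin1.length + " bytes: " + hex(latin1));
        System.out.println("UTF-8:      " + utf8.length + " bytes: " + hex(utf8));
        // In ISO-8859-1 the ü is the single byte fc; in UTF-8 it is the pair c3 bc.
    }
}
```

For this string the ISO-8859-1 form is 19 bytes long and the UTF-8
form 21 bytes, so the two can never compare equal byte for byte.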

To go into UTF8 mode:
export LANG=de_DE.UTF-8
export LC_ALL=de_DE.UTF-8
export LC_CTYPE=de_DE.UTF-8
... for all LC vars


To go into ISO8859 mode:
export LANG=de_DE.ISO-8859-1
export LC_ALL=de_DE.ISO-8859-1
export LC_CTYPE=de_DE.ISO-8859-1
... for all LC vars
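
To check which mode a JVM started from that shell actually sees, a
small sketch like the following may help. Treat it as an indicator
only: mapserver's C code reads the locale itself, and on recent JDKs
(18 and later) the default charset is UTF-8 regardless of LANG. The
class name ShowJvmEncoding and the describe() helper are hypothetical.

```java
import java.nio.charset.Charset;

public class ShowJvmEncoding {

    // Summarize the charset the JVM chose at startup.
    public static String describe() {
        return "default charset = " + Charset.defaultCharset().name()
                + ", file.encoding = " + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        // Environment variables as inherited from the shell (null if unset).
        System.out.println("LANG   = " + System.getenv("LANG"));
        System.out.println("LC_ALL = " + System.getenv("LC_ALL"));
        System.out.println(describe());
    }
}
```

Run it once after each set of exports and compare the output.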

Best regards and happy coding,
Umberto


On 4/20/06, Umberto Nicoletti <umberto.nicoletti at gmail.com> wrote:
> Please open an issue in bugzilla. I'll add some debugging to mapserver
> in cvs and then we'll start from there.
>
> See you tomorrow.
>
> Best regards,
> Umberto
>
> On 4/20/06, Benedikt Rothe <umn-ms at hydrotec.de> wrote:
> >
> > Hi all
> >
> > For experimental purposes I added a "UTF-8 -> ISO-8859 conversion"
> > to two functions of mapscript_wrap.c.
> > The functions are the JNI implementations of mapObj.getLayerByName and
> > layerObj.queryByAttributes.
> >
> > I tested with the QueryByAttribute program Umberto sent and with a
> > shapefile (no database).
> > I tested finding the layer named "ÜÄÖßäüö" and, in this layer, the
> > regular expression "Süden".
> >
> > On my Windows machine the results seem to be correct both for the
> > command-line-parameter case and for the hardcoded case.
> >
> > The implementation is experimental only. If strings are longer than 500
> > chars, unforeseeable things may happen. In practice one should think
> > twice whether to use this kind of implementation or the built-in Java
> > encoders at the JNI level.
> >
> > But the really difficult question is how anything like this could be
> > reasonably incorporated with swig. I have no idea!
> >
> > Umberto? What do you think?
> >
> > Norbert: To be honest, I don't think changing the Strings at the
> > higher Java level is promising.
> >
> > For a few days, the code with an example can be downloaded from
> > http://www2.hydrotec.de/webdemos/be/umlaute.zip
> > (A compiled mapscript.dll is included. The other DLLs can be taken from
> > a normal mapserver 4.8.2.)
> >
> > Benedikt
> >
> > PS: I think of this as a bug in mapserver. Should a bug be opened in
> > bugzilla?
> >
> > listuser HH <listuser at herzsys.de> wrote on 19.04.2006 10:05:13:
> >
> > > Hi Benedikt,
> > >
> > > thanks for your interest. I have the encoding problem when I call
> > > "getLayerByName()". I could work around the problem at this point, but I
> > > think I will run into it again later. Testing is easy: I just
> > > use a simple layer with the name "Regierungspräsidien". When I use this
> > > string to get the layer by name, it doesn't work. To be sure, I
> > > compared this string with the name of the layer that I get from
> > > map.getLayer(0), and the two are equal.
> > >
> > > My thought was to convert the string before using it in mapscript
> > > functions. Java offers two ways to convert strings (perhaps there are
> > > more):
> > >
> > > 1. make a new String from the old one in a specific encoding, e.g.
> > >        new String(oldString.getBytes(), "ISO8859-1")
> > > 2. convert chars to a different encoding, e.g. [snip]
> > >        CharToByteISO8859_1 conv4ISO = new CharToByteISO8859_1();
> > >        char[] charArray = layerName.toCharArray();
> > >        conv4ISO.convert(charArray, 0, charArray.length,
> > >                         byteArray, 0, byteArray.length);
> > >        new String(byteArray);
> > >
> > > Because I'm not sure which encodings are used at which steps, I
> > > tried some combinations, but without luck. I think my code from eclipse
> > > is CP1252. In the eclipse editor properties I changed this to UTF8 and
> > > ISO, which also didn't work. I'm not sure what happens when
> > > mapscript uses JNI. Perhaps the string gets converted to UTF8. If this is
> > > right, I see no chance of changing the string in java, because it
> > > gets converted even if it is already UTF8. I have to say that I'm not
> > > familiar with these encoding issues. If someone has advice, I will
> > > go on testing.
> > >
> > > I think it would be a good thing to have a "UTF-8 -> ISO-8859 conversion"
> > > like you suggested. At the moment I can't do this myself because I have no
> > > way to compile the C code.
> > >
> > > Best regards,
> > >
> > > Norbert
> > >
> > > Benedikt Rothe wrote:
> > >
> > > >Umberto, Nicol, Norbert, Oliver
> > > >
> > > >Umberto wrote
> > > >
> > > >
> > > >>try to run the attached Java source.
> > > >>
> > > >>
> > > >...
> > > >
> > > >
> > > >>"Südliche Weinstraße" as the second it will work!
> > > >>
> > > >>
> > > >
> > > >In my copy of your mail the queryByAttribute.java program is not
> > > >attached. Could somebody post the program including test data (or a
> > > >download URL)?
> > > >
> > > >I'd like to study a running example, because I don't understand
> > > >how umlaut conversion from Java to the Mapserver kernel can work
> > > >properly at all, and I'd like to understand it :-)
> > > >-----------
> > > >It seems Norbert found a kind of answer to his question
> > > >
> > > >>is there a way to do the conversion in java directly?
> > > >
> > > >He suggested
> > > >
> > > >>Try to convert the String before you set the expression in your code
> > > >>(-> String( byte bytes[], String )
> > > >
> > > >Could you be more precise? I do not understand what must be converted to
> > > >what.
> > > >How must this be applied to convert a Java String to a proper
> > > >"Mapserver String" (?)
> > > >-----------
> > > >Is somebody willing to try adding a "UTF-8 -> ISO-8859 conversion" to
> > > >mapscript_wrap.c for test purposes? (Even if it works, this would
> > > >not be a real solution because it bypasses swig.)
> > > >
> > > >Benedikt
> > > >
> > > >
> > > >UMN MapServer Users List <MAPSERVER-USERS at LISTS.UMN.EDU> wrote on
> > > >14.04.2006 15:23:51:
> > > >
> > > >
> > > >
> > > >>Oliver,
> > > >>I GOT IT!
> > > >>
> > > >>Try to run the attached Java source. If you pass it two arguments, the
> > > >>first being the path to the map file and the second the string to
> > > >>search for, and you pass "Südliche Weinstraße" as the second, it will
> > > >>work!
> > > >>
> > > >>So why does it fail when "Südliche Weinstraße" is inside the Java
> > > >>code? That is a problem that only happens when javac compiles the
> > > >>source: javac translates all characters to unicode and in doing so
> > > >>it gets the German characters wrong.
> > > >>To solve this, give javac the following option: -source 1.4
> > > >>
> > > >>For more see this link:
> > > >>http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5046139
> > > >>
> > > >>On 4/13/06, Umberto Nicoletti <umberto.nicoletti at gmail.com> wrote:
> > > >>
> > > >>>This is probably not related only to java mapscript, so please read on.
> > > >>
> > > >>So I was wrong...but I'll leave the proof to the reader ;-)
> > > >>
> > > >>Best regards,
> > > >>Umberto
> > > >>
> > > >>>On 3/30/06, Oliver Wesp <wesp at gdv.com> wrote:
> > > >>>
> > > >>>>Dear List,
> > > >>>>
> > > >>>>I'm struggling with queryByAttributes on an attribute field with German
> > > >>>>umlauts using java mapscript.
> > > >>>>The odd thing is that the same thing works fine with php mapscript and
> > > >>>>when I use expressions in my mapfile. I'm using a shapefile as
> > > >>>>datasource.
> > > >>>
> > > >>>Could one of the other mapserver developers shed some light on
> > > >>>this issue?
> > > >>>
> > > >>>I have a clue to give: php mapscript is using a different regex
> > > >>>library and this explains why the match does not happen for Java
> > > >>>mapscript, while it does happen in php mapscript. If I am right, the
> > > >>>mapserver cgi should also be affected, and possibly all other
> > > >>>mapscripts too.
> > > >>>
> > > >>>It would be very interesting if someone could report on similar
> > > >>>experiences with the cgi-bin version of mapserver.
> > > >>>
> > > >>>Thanks,
> > > >>>Umberto
> > > >>>
> > > >>>>Here is what I do:
> > > >>>>
> > > >>>>layer.queryByAttributes(map, "KREIS_NAME", "/Südliche Weinstraße/",
> > > >>>>    mapscriptConstants.MS_MULTIPLE);
> > > >>>>layer.open();
> > > >>>>System.out.println( "Result Count: " +layer.getNumResults() );
> > > >>>>layer.close();
> > > >>>>
> > > >>>>The result is always null, while replacing the qstring with something
> > > >>>>that doesn't contain special characters (e.g. 'Mainz-Bingen') works
> > > >>>>fine.
> > > >>>>
> > > >>>>As noted above, the following layer definition in a mapfile works fine:
> > > >>>>
> > > >>>>LAYER
> > > >>>>     NAME kreis
> > > >>>>     STATUS DEFAULT
> > > >>>>     TYPE polygon
> > > >>>>     DATA "/tmp/subset"
> > > >>>>     TEMPLATE "kreis.html"
> > > >>>>     CLASSITEM KREIS_NAME
> > > >>>>     CLASS
> > > >>>>       NAME Boundary
> > > >>>>       COLOR 128 128 0
> > > >>>>       OUTLINECOLOR 0 0 0
> > > >>>>       EXPRESSION /Südliche Weinstraße/
> > > >>>>     END
> > > >>>>END
> > > >>>>
> > > >>>>but this does not:
> > > >>>>
> > > >>>>layer.setClassitem("KREIS_NAME");
> > > >>>>classObj cl = new classObj(layer);
> > > >>>>cl.setName("Classname");
> > > >>>>cl.setExpression("/Südliche Weinstraße/");
> > > >>>>
> > > >>>>I use Mapserver 4.8.1 on W2k, Tomcat 5.0.28.
> > > >>>>
> > > >>>>I can provide some sample data, just in case someone would like to
> > > >>>>reproduce this.
> > > >>>>
> > > >>>>Any help is appreciated.
> > > >>>>
> > > >>>>best regards
> > > >>>>Oliver
> > > >>>>--
> > > >>>>Dipl.-Geogr. Oliver Wesp
> > > >>>>Gesellschaft fuer geografische Datenverarbeitung
> > > >>>>Binger Strasse 49-51
> > > >>>>D-55218 Ingelheim
> > > >>>>fon: +49 6132 714818
> > > >>>>fax: +49 6132 714828
> > > >>>>http: www.gdv.com
> > > >>>>
> > > >>>>
> > > >>>>
> > > >
> > > >
> > > >
> > >
> > >
> >
> >
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mio-subset.dbf
Type: application/octet-stream
Size: 353 bytes
Desc: not available
Url : http://lists.osgeo.org/pipermail/mapserver-users/attachments/20060421/b3131afb/mio-subset.obj

