[gdal-dev] Re: JAVA API - Performance
Even Rouault
even.rouault at mines-paris.org
Sat Nov 14 17:32:28 EST 2009
Quoting Ivan <ivan.lucena at pmldnet.com>:
I've committed a new API that adds ReadRaster() and WriteRaster() methods that use
the regular Java arrays (byte[], short[], int[], float[], double[]). See
http://gdal.org/java
On my PC,
http://trac.osgeo.org/gdal/browser/trunk/gdal/swig/java/apps/GDALTestIO.java
runs in about 20.3s for the ReadRaster()/WriteRaster() case, and in about 24.7s
for the ReadRaster_Direct()/WriteRaster_Direct() case. Not a big advantage
(and it tends toward no advantage at all when run with the -server flag, as both
run in about 21.3 s!), but regular Java arrays are a bit easier to use than
ByteBuffer (especially since, with Sun JVM 1.6, the array() method on a direct
ByteBuffer is not supported).
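
To illustrate the array() remark, here is a minimal sketch that uses only the JDK (no GDAL calls). It shows that a direct buffer has no accessible backing array, and one way to copy its content into a regular byte[] with a single bulk get(). The class and method names are made up for the example.

```java
import java.nio.ByteBuffer;

// Illustrative only: DirectBufferCopy and toArray are hypothetical names,
// not part of the GDAL Java bindings.
final class DirectBufferCopy {

    // Copies the whole content of a buffer into a new byte[] without
    // relying on array(), which a direct buffer does not support.
    static byte[] toArray(ByteBuffer buffer) {
        ByteBuffer view = buffer.duplicate(); // independent position and limit
        view.clear();                         // position = 0, limit = capacity
        byte[] out = new byte[view.remaining()];
        view.get(out);                        // one bulk copy
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer direct = ByteBuffer.allocateDirect(4);
        direct.put(0, (byte) 7);

        System.out.println("hasArray: " + direct.hasArray());
        try {
            direct.array();
        } catch (UnsupportedOperationException e) {
            System.out.println("array() is not supported on this buffer");
        }
        System.out.println("first byte after copy: " + toArray(direct)[0]);
    }
}
```

The copy is exactly the extra step that the array-based ReadRaster()/WriteRaster() methods spare the caller.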
> Even,
>
> You are right. The point is how to take full advantage of the GDAL Java API by
> choosing the right approach to deal with the raster buffer on the client side.
>
> Best regards,
>
> Ivan
>
> Even Rouault wrote:
> > Quoting Ivan <ivan.lucena at pmldnet.com>:
> >
> > Ivan,
> >
> > I'm not sure what you are really measuring if you compare C++ code with its
> > translation to Java code. I think it just reflects the known slowdown of Java
> > when doing intensive computations in comparison to native code. The 0.2 second
> > difference between the regular array version and the ByteBuffer one is the
> > interesting result, not the 1.2/1.0 second difference between C++ and Java.
> >
> >> Ciao Simone,
> >>
> >> I just downloaded imageio-ext to check how it does that, but it looks like I
> >> don't need to do that now; I can take your report instead. Thank you very
> >> much. I will take a look at array pinning for a start.
> >>
> >> I translated the GDAL Proximity [1] code to Java and I timed both of them
> >> with the same input, a 1024x1024 byte image with just one pixel as a feature
> >> at the center of the image.
> >>
> >> It took 0.3 seconds in C++ and 1.5 seconds in Java!
> >>
> >> I then translated the buffers to regular arrays and it went down a little
> >> bit, 1.3 seconds.
> >>
> >> It is still a big disadvantage. I believe that the buffer-to-buffer
> >> translation is the guilty time waster in that case.
> >>
> >> [1] http://trac.osgeo.org/gdal/browser/trunk/gdal/alg/gdalproximity.cpp
> >>
> >> My best regards,
> >>
> >> Ivan
> >>
> >>> -------Original Message-------
> >>> From: Simone Giannecchini <simone.giannecchini at geo-solutions.it>
> >>> Subject: Re: [gdal-dev] Re: JAVA API - Performance
> >>> Sent: Nov 10 '09 12:36
> >>>
> >>> Ciao Even,
> >>> just wanted to add my 2 cents.
> >>>
> >>> As you know for the imageio-ext project we have been using the
> >>> GDAL-JNI bindings (actually a modified version of them) for a while in
> >>> order to allow Java users to leverage GDAL through the ImageIO
> >>> framework, which is standard in Java.
> >>> This way we also enabled GeoTools and GeoServer to use GDAL as a
> >>> datasource.
> >>> In the past I have done quite some performance tests to add some
> >>> new/different methods to them and I can summarise our findings as
> >>> follows:
> >>>
> >>> - DirectByteBuffer vs regular arrays -
> >>> A DBB is expensive to allocate but prevents the VM from performing copies
> >>> when moving data between Java and native code, since it
> >>> lives in the native space, not on the Java heap. On the other hand,
> >>> regular arrays are fast to allocate but they are "usually" copied when
> >>> moved between Java and native code, since the JVM cannot let
> >>> the native code mess with the Java heap space: the garbage
> >>> collector would not be very happy about that. I said "usually" since
> >>> there is a technique called array pinning that we can suggest the JVM
> >>> use to avoid the copy of a regular array; however this mechanism is
> >>> not guaranteed to be implemented and/or to work on each call (same
> >>> reason as above, the GC is not happy about this technique).
> >>>
> >>> If you can pool the DBBs and/or use a few large DBBs, where the cost of
> >>> the copy would exceed the cost of the buffer's creation, then DBBs are much
> >>> better than regular arrays. For instance, I noticed that when
> >>> reading striped TIFF files regular arrays were faster, but as the
> >>> tile size increases (and therefore the cost of a copy exceeds the
> >>> cost of a DBB creation) the DBB performs much better.
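
The pooling idea above can be sketched as follows. This is only an illustration under assumed names (DirectBufferPool, acquire, release), not code from GDAL or imageio-ext. The point is simply that the allocation cost of a direct buffer is paid once, and later reads reuse the same native memory.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

// Hypothetical helper, not part of GDAL or imageio-ext: keeps released
// direct buffers around so the allocation cost is paid once per buffer.
final class DirectBufferPool {
    private final ArrayDeque<ByteBuffer> free = new ArrayDeque<ByteBuffer>();
    private final int capacity;

    DirectBufferPool(int capacity) {
        this.capacity = capacity;
    }

    // Returns a pooled buffer if one is available, else allocates a new one.
    synchronized ByteBuffer acquire() {
        ByteBuffer buffer = free.poll();
        if (buffer == null) {
            buffer = ByteBuffer.allocateDirect(capacity);
        }
        buffer.clear(); // reset position and limit; content is not zeroed
        return buffer;
    }

    // Hands a buffer back for reuse; buffers of another size are dropped.
    synchronized void release(ByteBuffer buffer) {
        if (buffer != null && buffer.isDirect() && buffer.capacity() == capacity) {
            free.push(buffer);
        }
    }
}
```

A pool like this suits the large-tile case described above. It also puts a rough bound on how much native memory the application takes, as long as callers release their buffers.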
> >>>
> >>> - DirectByteBuffer and the impact on some JVMs -
> >>> Now in the past we decided to stick with DBB and give
> >>> GeoServer/GeoTools users the capability to retile data on the fly.
> >>> However lately, during the WMS performance shootout, we noticed solid
> >>> JVM crashes on some Linux machines, not nice (it means restarting
> >>> GeoServer!!!).
> >>> We investigated in more depth and the problem was that somehow the
> >>> JVM was failing to allocate some internal images during the rendering
> >>> process and then dying with a NullPointerException (apparently the Sun
> >>> Java2D engineers did not check for out-of-memory errors in the
> >>> Java native space). Well, what happens is that if you use too much of
> >>> the Java native space for your own objects, it is likely that the JVM
> >>> itself will start to malfunction, since it cannot allocate its own
> >>> objects (you can find articles on the web on the memory model of a Java
> >>> process, I don't think I am good enough to explain it).
> >>>
> >>> In the end we decided to leave DBB and go back to regular arrays with
> >>> array pinning. This gave us robustness and we did not see much
> >>> performance degradation (which means that array pinning in the end
> >>> works). This has been implemented by modifying the SWIG bindings for
> >>> GDAL in order to use a byte array instead of a DBB and then use
> >>> ByteArray utils to convert between different native types (short, int,
> >>> etc.).
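
As a rough idea of what such a byte-array conversion step can look like with the JDK alone, here is a sketch. The helper names (RasterBytes, toShorts, toFloats) are hypothetical and this is not the imageio-ext code. It wraps the raw byte[] in a heap ByteBuffer and bulk-reads it through a typed view.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper names; this is not the imageio-ext implementation.
final class RasterBytes {

    // Interprets raw bytes as 16-bit samples in the given byte order.
    // A trailing odd byte, if any, is ignored.
    static short[] toShorts(byte[] raw, ByteOrder order) {
        short[] out = new short[raw.length / 2];
        ByteBuffer.wrap(raw).order(order).asShortBuffer().get(out);
        return out;
    }

    // Interprets raw bytes as 32-bit floats in the given byte order.
    // Trailing bytes that do not fill a whole float are ignored.
    static float[] toFloats(byte[] raw, ByteOrder order) {
        float[] out = new float[raw.length / 4];
        ByteBuffer.wrap(raw).order(order).asFloatBuffer().get(out);
        return out;
    }
}
```

For bytes filled by native code on the same machine, ByteOrder.nativeOrder() would be the usual choice for the order argument.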
> >>>
> >>> - Conclusion -
> >>> We might want to spend some time in the mid term to contribute some of
> >>> this work back (or probably provide funding), but anyway, it would be
> >>> great to have the capability to switch between DBB and regular arrays
> >>> since both have flaws.
> >>> However atm if I were asked I would say to go with regular arrays as
> >>> we do in the imageio-ext project.
> >>>
> >>> Ciao,
> >>> Simone.
> >>> -------------------------------------------------------
> >>> Ing. Simone Giannecchini
> >>> GeoSolutions S.A.S.
> >>> Founder - Software Engineer
> >>> Via Carignoni 51
> >>> 55041 Camaiore (LU)
> >>> Italy
> >>>
> >>> phone: +39 0584983027
> >>> fax: +39 0584983027
> >>> mob: +39 333 8128928
> >>>
> >>>
> >>> http://www.geo-solutions.it
> >>> http://geo-solutions.blogspot.com/
> >>> http://simboss.blogspot.com/
> >>> http://www.linkedin.com/in/simonegiannecchini
> >>>
> >>> -------------------------------------------------------
> >>>
> >>>
> >>>
> >>> On Tue, Nov 10, 2009 at 12:00 PM, Even Rouault
> >>> <even.rouault at mines-paris.org> wrote:
> >>> > Quoting Ivan <ivan.lucena at pmldnet.com>:
> >>> >
> >>> > Ivan,
> >>> >
> >>> > thanks for your testing (CC'ing the list as it is of general interest).
> >>> > Actually, I also read on some sites that using a ByteBuffer object versus
> >>> > regular Java arrays is not always a win. Add to that the fact that we must
> >>> > use a direct buffer, which has an extra allocation cost according to the
> >>> > Javadoc. So ByteBuffer might be interesting if you just want to pass big
> >>> > arrays between native code, for example if you read an array from a dataset
> >>> > and then write it to another one without accessing it from the Java side.
> >>> > When you mention that accessing through the byte[] array was faster, did
> >>> > you get it with the array() method instead?
> >>> > I'm wondering what the performance overhead of this call is.
> >>> >
> >>> > As ByteBuffer is not at all a requirement for the interface with the
> >>> > native code, it would be technically possible to add an alternative API
> >>> > that would use the regular Java array types.
> >>> >
> >>> > Would you mind opening an enhancement ticket about that? Thanks
> >>> >
> >>> > Even
> >>> >
> >>> >> Even,
> >>> >>
> >>> >> I did some tests with the GDAL Java API and some simple raster operations
> >>> >> like the GDAL Proximity algorithm, and I noticed that the performance when
> >>> >> accessing pixels with <type>Buffer.get(i) and <type>Buffer.put(i,value) is
> >>> >> not as good as if you copy them to (or from) a "regular" array, like
> >>> >> float[], double[], int[] and byte[].
> >>> >>
> >>> >> The reason for that is obvious: get() and put() are function calls and
> >>> >> contain a lot of code for range checks.
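
The two access patterns compared above can be sketched like this, again with the JDK only and with made-up names (BufferAccess, sumViaBuffer, sumViaArray, ramp). The first method calls get(i) once per pixel. The second does one bulk copy into a float[] and then loops over the array. No timing is claimed here, since results depend on the JVM and its flags.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Hypothetical names for illustration; no GDAL calls are involved.
final class BufferAccess {

    // Element-wise access: one get(i) call per pixel.
    static double sumViaBuffer(FloatBuffer pixels) {
        double sum = 0;
        for (int i = 0; i < pixels.limit(); i++) {
            sum += pixels.get(i);
        }
        return sum;
    }

    // One bulk copy into a float[], then a plain array loop.
    static double sumViaArray(FloatBuffer pixels) {
        float[] copy = new float[pixels.limit()];
        FloatBuffer view = pixels.duplicate(); // leave the caller's position alone
        view.position(0);
        view.get(copy);
        double sum = 0;
        for (float v : copy) {
            sum += v;
        }
        return sum;
    }

    // Builds a direct buffer of n floats holding 0, 1, ..., n-1.
    static FloatBuffer ramp(int n) {
        FloatBuffer pixels = ByteBuffer.allocateDirect(n * 4)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
        for (int i = 0; i < n; i++) {
            pixels.put(i, (float) i);
        }
        return pixels;
    }
}
```

Both methods return the same value for the same buffer, so they can be dropped into a timing loop to repeat the comparison on a given JVM.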
> >>> >>
> >>> >> If I understand it correctly, ByteBuffer is the ideal or maybe the only
> >>> >> way to get access to buffers from C libraries through a Java wrapper. But
> >>> >> do you think it would be possible to encapsulate the buffer conversion in
> >>> >> the wrapper code so that users would be able to read and write directly to
> >>> >> regular Java arrays?
> >>> >>
> >>> >> Just a suggestion,
> >>> >>
> >>> >> Ivan
> >>> >>
> >>> >
> >>> >
> >>> > _______________________________________________
> >>> > gdal-dev mailing list
> >>> > gdal-dev at lists.osgeo.org
> >>> > http://lists.osgeo.org/mailman/listinfo/gdal-dev
> >>> >
> >>>
> >
> >
> >
>
>