[gdal-dev] Corrupted .shp error when reading shapefile from multiple threads

Martin Chapman chapmanm at pixia.com
Wed Aug 1 21:12:39 PDT 2012


Kedar,

Think about it...how does GetNextFeature() know what feature is next?
It's because somewhere inside the datasource handle there is probably a
variable that is keeping track of what record is next.  If multiple
threads are all using the same datasource handle then this variable is
probably getting updated / read by different threads at the same time and
then all hell will break loose.
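
To make the point concrete, here is a minimal sketch of that hidden "what is next" state. The `Cursor` class is a hypothetical stand-in written for this illustration, not OGR's actual implementation. The two callers take turns on one thread so the result is repeatable; with real threads the interleaving would be arbitrary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SharedCursorDemo {
    /** Hypothetical stand-in for a layer handle: it remembers which record comes next. */
    static class Cursor {
        private final String[] records;
        private int next = 0; // the hidden state every caller of getNextFeature() mutates

        Cursor(String... records) {
            this.records = records;
        }

        String getNextFeature() {
            return next < records.length ? records[next++] : null;
        }
    }

    /** Two callers alternate on the same cursor; returns what caller A and caller B each saw. */
    static List<List<String>> interleave(Cursor shared) {
        List<String> a = new ArrayList<>();
        List<String> b = new ArrayList<>();
        while (true) {
            String r = shared.getNextFeature();
            if (r == null) break;
            a.add(r);
            r = shared.getNextFeature();
            if (r == null) break;
            b.add(r);
        }
        return Arrays.asList(a, b);
    }

    public static void main(String[] args) {
        List<List<String>> seen = interleave(new Cursor("r0", "r1", "r2", "r3"));
        System.out.println("caller A saw " + seen.get(0)); // prints "caller A saw [r0, r2]"
        System.out.println("caller B saw " + seen.get(1)); // prints "caller B saw [r1, r3]"
    }
}
```

Neither caller sees the full record set, and that is the well-behaved case. In native code the same kind of sharing can also leave half-updated buffers behind, which would be consistent with the "corrupted" message and the crash.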

The answer to your problem is to open a separate datasource handle for
every thread you have so there is no sharing across threads.  Right now
your getZipCode() function is sharing a global datasource handle and the
code is not synchronized.  If you synchronize this code then it will be no
better than a single-threaded application, so if I were you I would either:

1. Open a datasource handle on each thread and pass it to the zipcode
function from each thread ... or

2. Create a class that encapsulates the getZipCode() function and has a
datasource member variable that opens the datasource on construction and
then create a separate instance of this class for each thread on startup
so each thread has its own copy of the class.   Or ....

3. See if there is a function on the handle that will get the next feature
given an index (like GetFeature(int index)) that you pass into the
function so there is no internal record keeping needed by the handle.
This is a risky way to do it though because even though it will work there
may be other global variables the handle is keeping so unless Frank can
vouch for that method I would go with curtain number 2. 
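
Options 1 and 2 both come down to "one handle per thread". A minimal sketch of that pattern using the JDK's `ThreadLocal` follows. `Handle` and `PerThread` are hypothetical names invented for this illustration. With the real bindings the opener would presumably wrap the `ogr.Open(...)` call from your snippet, and each thread would fetch its layer from its own datasource. That part is an assumption and is not exercised here.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

public class PerThreadHandleDemo {
    /** Hypothetical stand-in for an opened datasource (assumption: a real one would wrap ogr.Open). */
    static class Handle {
    }

    /** Lazily opens one handle per thread and always returns the calling thread's own copy. */
    static class PerThread<T> {
        private final ThreadLocal<T> local;

        PerThread(Supplier<T> opener) {
            this.local = ThreadLocal.withInitial(opener);
        }

        T get() {
            return local.get();
        }
    }

    /** Starts the given number of threads and returns the distinct handles they ended up using. */
    static Set<Handle> run(int threads) {
        PerThread<Handle> handles = new PerThread<>(Handle::new);
        Set<Handle> used = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                Handle h = handles.get();
                // A second call on the same thread must return the same handle.
                if (h != handles.get()) {
                    throw new IllegalStateException("handle changed within one thread");
                }
                used.add(h);
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return used;
    }

    public static void main(String[] args) {
        System.out.println("distinct handles: " + run(4).size()); // prints "distinct handles: 4"
    }
}
```

The handle is opened once per thread and then reused, so the cost of opening is not paid on every lookup. With a thread pool the number of open handles stays bounded by the pool size.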

Best regards,
Marty



-----Original Message-----
From: gdal-dev-bounces at lists.osgeo.org
[mailto:gdal-dev-bounces at lists.osgeo.org] On Behalf Of kedardeshpande87
Sent: Wednesday, August 01, 2012 6:05 PM
To: gdal-dev at lists.osgeo.org
Subject: [gdal-dev] Corrupted .shp error when reading shapefile from
multiple threads

Hi,

I am writing an application in Java with gdal's java bindings that reads a
zipcode layer shapefile.
I set a spatial filter of a point on the layer and get the zipcode by
reading from the layer.
...
DataSource ds = ogr.Open("path/to/shapefile");
Layer layer = ds.GetLayer(0);
...
public void getZipCode(double lat, double lon) {
  Geometry g = new Geometry(1); // Passing the geometry type. 1 for point.
  g.AddPoint(lon, lat);
  layer.SetSpatialFilter(g);
  Feature f = null;
  while((f = layer.GetNextFeature()) != null) {
     String zipcode = f.GetFieldAsString(0);   // 0th field is zipcode
  }
}

Above is a snippet of the code I am writing. When I call this
getZipCode() function multiple times (for example in a loop), it works fine.
But when I call it from multiple parallel threads, I get an error
reporting a corrupted .shp file. This error shows up only when I make
simultaneous reads on the shapefile from many threads.  The error is as
follows:

ERROR 1: Corrupted .shp file : shape 6, nPoints=677, nParts=1,
nEntitySize=5256.

Sometimes I get an error like this :

# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x000000379fa72faf, pid=26152, tid=1120618816
#
# JRE version: 6.0_32-b05
# Java VM: Java HotSpot(TM) 64-Bit Server VM (20.7-b02 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [libc.so.6+0x72faf]  _dl_tls_get_addr_soft@@GLIBC_PRIVATE+0x72faf
#
# An error report file with more information is saved as:
# /rhel5pdi/home/kdeshpan/workspace/GDAL_Java_Cpp/hs_err_pid26152.log
#
# If you would like to submit a bug report, please visit:
#   http://java.sun.com/webapps/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

I really want this to work in a multithreaded environment.

I am not modifying the shapefile, I am only reading the zip code
information from it, so I don't understand why the shapefile is reported
as corrupted.
Moreover, if I use the same shapefile after getting this error and make
synchronous requests, it works. So I am guessing the shapefile is not
actually corrupted and the error is caused by something else.
Is this the expected behavior from the OGR library? Does it not support
multithreaded environments?
Or is there some problem with the Java bindings?
Can someone please suggest how I can resolve or work around this
issue?

Thanks!
Kedar


