[gdal-dev] Corrupted .shp error when reading shapefile from multiple threads

Martin Chapman chapmanm at pixia.com
Thu Aug 2 06:56:08 PDT 2012


Even,

Thank you for your input on that.  I figured that was a risky idea and it
is good to know the definitive answer.

Best regards,
Martin

-----Original Message-----
From: Even Rouault [mailto:even.rouault at mines-paris.org] 
Sent: Thursday, August 02, 2012 2:13 AM
To: Martin Chapman
Cc: 'kedardeshpande87'; gdal-dev at lists.osgeo.org
Subject: Re: [gdal-dev] Corrupted .shp error when reading shapefile from
multiple threads

Quoting Martin Chapman <chapmanm at pixia.com>:

> Kedar,
>
> Think about it...how does GetNextFeature() know what feature is next?
> It's because somewhere inside the datasource handle there is probably 
> a variable that is keeping track of what record is next.  If multiple 
> threads are all using the same datasource handle then this variable is 
> probably getting updated / read by different threads at the same time 
> and then all hell will break loose.
>
> The answer to your problem is to open a separate datasource handle for
> every thread you have so there is no sharing across threads.  Right now
> your getZipCode() function is sharing a global datasource handle and
> the code is not synchronized.  If you synchronize this code then it
> will be no better than a single-threaded application, so if I were you
> I would either:
>
> 1. Open a datasource handle on each thread and pass it to the zipcode 
> function from each thread ... or
>
> 2. Create a class that encapsulates the getZipCode() function and has
> a datasource member variable that opens the datasource on construction
> and then create a separate instance of this class for each thread on
> startup so each thread has its own copy of the class ... or
>
> 3. See if there is a function on the handle that will get the next
> feature given an index (like GetFeature(int index)) that you pass into
> the function so there is no internal record keeping needed by the
> handle.  This is a risky way to do it though because even though it
> will work there may be other global variables the handle is keeping so
> unless Frank can vouch for that method I would go with curtain number 2.

Martin, I concur with your points.

About 3) (using GetFeature(index)): this won't work either, because
GetFeature() needs to seek in the .shp and .dbf files. If two threads call
GetFeature() on the same layer object, which has a single file pointer,
the seeks may conflict with each other and you might end up reading the
wrong content. The only safe solution is to use a separate datasource
object for each thread.
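
To illustrate the one-handle-per-thread pattern, here is a minimal sketch
in plain C++ that makes no GDAL calls. RecordReader, write_records() and
count_bad_reads() are hypothetical names invented for this example;
RecordReader only stands in for a datasource, in the sense that it owns a
single file stream and therefore a single seek position. Each worker
thread constructs its own reader, so no seek position is ever shared.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a datasource: it owns one file stream,
// and therefore one seek position.
class RecordReader {
public:
    explicit RecordReader(const std::string& path)
        : in_(path, std::ios::binary) {}

    // Seek to record `index` and read it, much as an indexed
    // feature lookup has to seek before reading.
    bool get(std::uint32_t index, std::uint32_t& value) {
        in_.clear();
        in_.seekg(static_cast<std::streamoff>(index) * sizeof(std::uint32_t));
        in_.read(reinterpret_cast<char*>(&value), sizeof(value));
        return static_cast<bool>(in_);
    }

private:
    std::ifstream in_;
};

// Write `count` fixed-size records; record i stores the value i.
void write_records(const std::string& path, std::uint32_t count) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    for (std::uint32_t i = 0; i < count; ++i)
        out.write(reinterpret_cast<const char*>(&i), sizeof(i));
}

// Start `n_threads` workers, each with a private RecordReader, and
// return how many reads failed or returned the wrong record.
std::size_t count_bad_reads(const std::string& path, std::uint32_t count,
                            unsigned n_threads) {
    std::vector<std::size_t> bad(n_threads, 0);
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < n_threads; ++t) {
        workers.emplace_back([&, t] {
            RecordReader reader(path);  // private handle, never shared
            for (std::uint32_t i = t; i < count; i += n_threads) {
                std::uint32_t v = 0;
                if (!reader.get(i, v) || v != i) ++bad[t];
            }
        });
    }
    for (auto& w : workers) w.join();

    std::size_t total = 0;
    for (std::size_t b : bad) total += b;
    return total;
}
```

Calling write_records("records.bin", 1000) followed by
count_bad_reads("records.bin", 1000, 4) should return 0, since every
thread seeks and reads through its own stream. Sharing one RecordReader
across the workers instead would reintroduce the conflicting seeks
described above.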

