[gdal-dev] Corrupted .shp error when reading shapefile from multiple threads
Martin Chapman
chapmanm at pixia.com
Thu Aug 2 06:56:08 PDT 2012
Even,
Thank you for your input on that. I figured that was a risky idea and it
is good to know the definitive answer.
Best regards,
Martin
-----Original Message-----
From: Even Rouault [mailto:even.rouault at mines-paris.org]
Sent: Thursday, August 02, 2012 2:13 AM
To: Martin Chapman
Cc: 'kedardeshpande87'; gdal-dev at lists.osgeo.org
Subject: Re: [gdal-dev] Corrupted .shp error when reading shapefile from
multiple threads
Quoting Martin Chapman <chapmanm at pixia.com>:
> Kedar,
>
> Think about it...how does GetNextFeature() know what feature is next?
> It's because somewhere inside the datasource handle there is probably
> a variable that is keeping track of what record is next. If multiple
> threads are all using the same datasource handle then this variable is
> probably getting updated / read by different threads at the same time
> and then all hell will break loose.
>
> The answer to your problem is to open a separate datasource handle for
> every thread you have so there is no sharing across threads. Right now
> your getZipCode() function is sharing a global datasource handle and
> the code is not synchronized. If you synchronize this code then it
> will be no better than a single-threaded application, so if I were you
> I would either:
>
> 1. Open a datasource handle on each thread and pass it to the zipcode
> function from each thread ... or
>
> 2. Create a class that encapsulates the getZipCode() function and has
> a datasource member variable that opens the datasource on construction
> and then create a separate instance of this class for each thread on
> startup so each thread has its own copy of the class. Or ....
>
> 3. See if there is a function on the handle that will get the next
> feature given an index (like GetFeature(int index)) that you pass into
> the function so there is no internal record keeping needed by the
> handle.
> This is a risky way to do it though because even though it will work
> there may be other global variables the handle is keeping so unless
> Frank can vouch for that method I would go with curtain number 2.
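Option 2 above could be sketched roughly as follows. This is only an illustration in Python that runs without GDAL: FakeDataSource is a hypothetical in-memory stand-in for a datasource handle, ZipCodeLookup and run_lookups are made-up names, and the records are placeholder data. In real code the open_source callable would open the shapefile itself (for example via ogr.Open() in the GDAL Python bindings).

```python
import threading


class FakeDataSource:
    """Hypothetical stand-in for a datasource handle with an internal read cursor."""

    def __init__(self, records):
        self._records = records
        self._cursor = 0  # per-handle state; sharing it across threads is the problem

    def reset_reading(self):
        self._cursor = 0

    def get_next_feature(self):
        if self._cursor >= len(self._records):
            return None
        record = self._records[self._cursor]
        self._cursor += 1
        return record


class ZipCodeLookup:
    """Owns one datasource handle; create one instance per thread."""

    def __init__(self, open_source):
        self._ds = open_source()  # opened on construction, never shared

    def get_zip_code(self, city):
        self._ds.reset_reading()
        while True:
            record = self._ds.get_next_feature()
            if record is None:
                return None  # city not found
            if record[0] == city:
                return record[1]


# Placeholder data, not real zip codes.
RECORDS = [("city-a", "zip-a"), ("city-b", "zip-b"), ("city-c", "zip-c")]


def run_lookups(cities):
    results = {}

    def worker(city):
        # Each thread builds its own lookup object, hence its own handle.
        lookup = ZipCodeLookup(lambda: FakeDataSource(RECORDS))
        results[city] = lookup.get_zip_code(city)

    threads = [threading.Thread(target=worker, args=(c,)) for c in cities]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because no handle is shared, no locking is needed around the read cursor, and the threads do not serialize on each other.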
Martin, I concur with your points.
About 3) (using GetFeature(index)): this won't work either, because
GetFeature() needs to seek in the .shp and .dbf files. So if 2 threads use
GetFeature() on the same layer object, which has a single file pointer,
the seeks may conflict with each other and you might end up reading the
wrong content. The only safe solution is to use a datasource object for
each thread.
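The seek conflict can be illustrated without GDAL by a plain file of fixed-size records. The sketch below is an analogy under stated assumptions, not shapefile driver code: RECORD_SIZE, read_record and the other names are hypothetical. Reading by index is a seek followed by a read, so on a shared file object another thread's seek could land between the two steps; giving each thread its own handle avoids that.

```python
import os
import tempfile
import threading

RECORD_SIZE = 8  # fixed-size records: record i starts at byte i * RECORD_SIZE


def write_records(path, count):
    with open(path, "wb") as f:
        for i in range(count):
            f.write(b"rec%05d" % i)  # exactly RECORD_SIZE bytes each


def read_record(f, index):
    # Two steps on one file position: with a shared handle, another
    # thread's seek could move the position before this read happens.
    f.seek(index * RECORD_SIZE)
    return f.read(RECORD_SIZE)


def read_all_with_private_handles(path, count, n_threads=4):
    results = [None] * count

    def worker(start):
        with open(path, "rb") as f:  # each thread opens its own handle
            for i in range(start, count, n_threads):
                results[i] = read_record(f, i)

    threads = [threading.Thread(target=worker, args=(s,)) for s in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def demo(count=100):
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_records(path, count)
        return read_all_with_private_handles(path, count)
    finally:
        os.remove(path)
```

Each private handle has its own file position, so the seek in one thread cannot affect the read in another.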