[gdal-dev] Open(), OpenShared(), errors, FlushCache(), and no Close() ?

Michal Migurski mike at stamen.com
Fri Mar 18 18:59:00 EDT 2011


Thanks Even, very helpful!

Gunicorn is not multi-threaded, but it is multi-process, so there will be concurrent connections to a data set even though I'm not doing anything threaded myself. I'll try what you suggest and drop the object reference to see what happens.

-mike.

On Mar 18, 2011, at 3:14 PM, Even Rouault wrote:

> Michal,
> 
> For a reason that is unclear to me (it might be just historical and not desired
> behaviour?), the VRT driver will try to rewrite the VRT if it has been modified.
> 
> There is, however, a workaround to keep the error from popping up on close: you
> can empty the description of the dataset with source_ds.SetDescription('').
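
A minimal sketch of how the workaround above might be combined with the SetProjection() call from the original question. The helper name use_gcp_projection is hypothetical; SetProjection(), GetGCPProjection(), and SetDescription() are the calls already mentioned in this thread. Whether emptying the description actually silences the error may depend on the GDAL version in use.

```python
def use_gcp_projection(source_ds):
    """Copy the GCP projection onto an open dataset, then blank its description.

    `source_ds` is assumed to be a dataset object returned by gdal.Open()
    or gdal.OpenShared(). The helper name is hypothetical.
    """
    # Make the ground control point projection available as the dataset
    # projection, as in the original question.
    source_ds.SetProjection(source_ds.GetGCPProjection())

    # Workaround suggested in this thread: empty the description so that
    # releasing the modified dataset does not report the write error.
    source_ds.SetDescription('')

    return source_ds
```

The helper returns the same object it was given, so it can be wrapped around the open call, for example use_gcp_projection(gdal.Open(src_path)).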
> 
> Open() or OpenShared() will not change anything about that.
> 
> In Python, you close a dataset by dropping the reference to the object, for
> example by assigning None to it.
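
A short sketch of the reference-dropping pattern described above. The function name raster_size and its open_func parameter are hypothetical and exist only so the pattern can be tried without GDAL installed; gdal.Open(), gdal.GA_ReadOnly, RasterXSize, and RasterYSize are assumed to come from the standard GDAL Python bindings.

```python
def raster_size(path, open_func=None):
    """Open a dataset, read its pixel dimensions, and release it again."""
    if open_func is None:
        # Imported lazily so the sketch can be exercised without GDAL.
        from osgeo import gdal
        open_func = lambda p: gdal.Open(p, gdal.GA_ReadOnly)

    ds = open_func(path)
    if ds is None:
        # gdal.Open() returns None on failure rather than raising.
        raise IOError("could not open %r" % path)

    size = (ds.RasterXSize, ds.RasterYSize)

    # There is no Close() call: dropping the last reference is what lets
    # the bindings destroy the underlying dataset.
    ds = None
    return size
```

This only releases the dataset if no other variable still refers to it (or to one of its bands), so any other references would need to be dropped as well.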
> 
> I'm not sure why you get errors with your new web server, but if you use a
> multi-threaded one, did you make sure you built GDAL with thread support
> (./configure --with-threads)? (This has been the default since GDAL 1.8.0.)
> 
> Best regards,
> 
> Even
> 
>> Hi,
>> 
>> I'm seeing some weird behaviors related to virtual raster datasets opened
>> simultaneously from multiple processes. I hope I can explain so that this
>> makes sense. Here's an excerpt of my python code:
>> 
>> 	http://dpaste.com/hold/515217/
>> 
>> Line 8 is where I make a change to the dataset:
>> 
>> 	source_ds.SetProjection(source_ds.GetGCPProjection())
>> 
>> I do that so that the projection for the ground control points is available
>> for a later call to gdal.ReprojectImage(); it wasn't working until I
>> started to use SetProjection() in this way. All of this is being called
>> from the context of a multi-process web server, running as unprivileged
>> user "www-data" under Ubuntu (this is important later). My web server
>> error log fills up with these:
>> 
>> 	ERROR 1: Failed to write .vrt file in FlushCache().
>> 
>> My assumption here is that because the unprivileged user can't write to the
>> dataset file, gdal throws off an error to complain that it can't flush the
>> dataset cache back to the original file. So far, this is just an
>> annoyance, but one that I would expect to go away when I switched from
>> gdal.Open() to gdal.OpenShared() with the read-only flag, like this:
>> 
>> 	gdal.OpenShared(src_path, gdal.GA_ReadOnly)
>> 
>> Still getting the errors.
>> 
>> Meanwhile, I made a switch in web servers, from an Apache-based CGI
>> environment to the multi-worker WSGI server Gunicorn. When I initially ran
>> my code under Gunicorn using my normal, privileged user account, I
>> immediately started to see failures from gdal.Open and gdal.OpenShared,
>> specifically the assertion errors on line 4 of the dpaste above. I tried
>> to place exclusive file locks (using fcntl.flock) around each access to a
>> given VRT dataset, but this didn't seem to help at all. There were
>> frequent, unpredictable errors with opening data sets in a multi-process
>> environment *until* I switched from the privileged user to the
>> unprivileged user. Once I did that, everything began to work normally, but
>> I got all the old "ERROR 1" reports again.
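
For reference, a sketch of the kind of locking described above, assuming a Unix system. The helper name exclusive_lock is hypothetical, and the original code is not shown in this message, so this is not necessarily how it was done. Note that flock() locks are advisory: they only coordinate processes that also take the lock, and do not by themselves prevent a file from being rewritten.

```python
import contextlib
import fcntl


@contextlib.contextmanager
def exclusive_lock(path):
    """Hold an exclusive advisory lock on `path` for the duration of a block."""
    # Opening read-only is enough for flock(); the file is not modified here.
    with open(path, 'rb') as handle:
        # Blocks until no other cooperating process holds the lock.
        fcntl.flock(handle, fcntl.LOCK_EX)
        try:
            yield handle
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)
```

For the lock to cover the write-back on close, the dataset would have to be both opened and released inside the with block.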
>> 
>> It seems to me that gdal.OpenShared() with the read-only flag isn't doing
>> what it promises, and that it's trying to write back to the files,
>> potentially modifying them even as competing processes are accessing them.
>> Is it possible that the overlapping processes in my privileged user
>> scenario are seeing temporarily-empty VRT files? I'm also confused by the
>> lack of a gdal.Close() function or something similar, and by the fact that
>> I can't seem to make a change to a dataset in memory without gdal
>> attempting to push that change back to disk via FlushCache().
>> 
>> What's the right thing to do here? Make temporary copies of small VRT data
>> sets prior to each use so they can be safely written to and disposed of?
>> Build a wrapper class that encapsulates copying and disposal? Figure out
>> some way to make gdal release datasets when asked, or open them in real
>> read-only mode?
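
One way the temporary-copy idea above could be sketched, using only the Python standard library. The name temporary_copy is hypothetical. A caveat worth checking first: a VRT that refers to its source rasters by paths relative to the VRT file would no longer resolve them once copied into another directory.

```python
import contextlib
import os
import shutil
import tempfile


@contextlib.contextmanager
def temporary_copy(src_path):
    """Yield the path of a private, writable copy of `src_path`.

    The copy lives in its own temporary directory, which is removed when
    the block exits, so changes written to it never reach the original.
    """
    tmp_dir = tempfile.mkdtemp(prefix='vrt-')
    tmp_path = os.path.join(tmp_dir, os.path.basename(src_path))
    try:
        shutil.copy(src_path, tmp_path)
        yield tmp_path
    finally:
        # Remove the copy even if the block raised an exception.
        shutil.rmtree(tmp_dir, ignore_errors=True)
```

Each worker process would then open the path yielded by the with block instead of the shared file, and release the dataset before the block ends so the copy can be deleted cleanly.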
>> 
>> Any advice greatly appreciated!
>> 
>> -mike.
>> 
>> ----------------------------------------------------------------
>> michal migurski- mike at stamen.com
>>                 415.558.1610





