[Tilecache] [SUMMARY] Windows, MetaTiling, and Disk cache locking
crschmidt at metacarta.com
Fri May 29 06:42:26 EDT 2009
On Thu, May 28, 2009 at 11:45:07PM -0500, Shawn Gervais wrote:
> I was having trouble with metatiling and excessive requests to my
> backend WMS on Windows, and with Chris' help, I tracked it down to the
> locking behavior implemented by the Disk cache. I'll try to summarize
> the problem and solution, in case anyone else runs into the same issue.
> My setup: Windows 2003, Apache 2.2.11, mod_fcgid, TileCache 2.10,
> mod_python, Python 2.5, mapserver trunk WMS. Because I'm using
> metatiling I also have PIL installed. My TileCache instance is accessed
> through a WMS Layer in an OpenLayers 2.7 application.
> The problem: I noticed, by looking at the Apache request log, that
> multiple identical requests were being issued by TileCache to MapServer,
> and that this only happened when metatiling was used.
> As a result, my WMS was getting about 3 times the load that it should
> have been.
> The cause: On Windows, the Disk cache was failing to acquire exclusive
> locks properly. It seems a race condition existed, in which multiple
> requests coming in quickly from OL which fell within the same metatile
> boundary, would all acquire the same lock. Then, each would hit the
> backend WMS and request a full metatile render.
So, I think this is what happened:
* We were having a problem with os.makedirs, where two 'creates'
that shared a parent directory would have one failing, because
os.makedirs isn't atomic.
* We changed the code (r258) to fix this problem, by creating our *own*
makedirs call that was a wrapper... but in that call, we *explicitly
hide* 'directory exists' messages.
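
To make the failure concrete, here is a minimal sketch of how a makedirs wrapper that hides 'directory exists' defeats a directory-based lock. The names quiet_makedirs and broken_attempt_lock are hypothetical and are not the actual Disk.py code; the paths are made up, and the syntax is modern Python rather than 2.5.

```python
import errno
import os
import tempfile


def quiet_makedirs(path):
    """Hypothetical wrapper: create path, hiding 'directory exists' errors."""
    try:
        os.makedirs(path)
    except OSError as e:
        if e.errno != errno.EEXIST:
            raise


def broken_attempt_lock(lock_path):
    """Return True if the lock directory was 'acquired'.

    Because quiet_makedirs never reports an existing directory,
    this returns True even when someone else already holds the lock.
    """
    try:
        quiet_makedirs(lock_path)
        return True
    except OSError:
        return False


base = tempfile.mkdtemp()
lock_path = os.path.join(base, "layer", "00", "tile.lck")  # made-up layout

first = broken_attempt_lock(lock_path)   # creates the directory
second = broken_attempt_lock(lock_path)  # directory exists, error is hidden
```

Both calls report success, so every request that lands in the same metatile believes it owns the lock and goes to the backend.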
So, the problem is that when I added the 'catch directory exists' cases,
I failed to accommodate a "Shit! Don't do that!" case, in the case of
I *believe* that the right fix for this is to:
* Stop using directory names that are inside the hierarchy for the
locks. We know that os.makedirs can have races, so let's not use
makedirs: instead use the (really atomic) mkdir.
* Have the attemptLock function use this instead.
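
A rough sketch of what that could look like, assuming a flat lock directory whose parent already exists. The function names attempt_lock and release_lock, the locks_root location, and the lock name are all illustrative assumptions, not the eventual TileCache fix.

```python
import errno
import os
import tempfile


def attempt_lock(lock_dir):
    """Try to take the lock by creating a single directory.

    os.mkdir creates exactly one directory and fails if it already
    exists, so only one caller can succeed for a given lock_dir.
    """
    try:
        os.mkdir(lock_dir)
        return True
    except OSError as e:
        if e.errno == errno.EEXIST:
            return False  # someone else holds the lock
        raise  # permissions, missing parent, etc. are real errors


def release_lock(lock_dir):
    """Release the lock by removing the (empty) lock directory."""
    os.rmdir(lock_dir)


# Assumption: locks live in one flat directory created ahead of time,
# outside the tile hierarchy, so no recursive creation is needed.
locks_root = tempfile.mkdtemp()
lock_dir = os.path.join(locks_root, "basic-3-1-2.lck")  # made-up name

got_first = attempt_lock(lock_dir)    # lock is free
got_second = attempt_lock(lock_dir)   # lock is held
release_lock(lock_dir)
got_third = attempt_lock(lock_dir)    # lock is free again
```

Keeping the lock out of the tile hierarchy means the lock path never needs its parents created on the fly, which is the step where makedirs can race.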
An easier alternative is to add a "don't catch the directory exists"
case to makedirs, which I've done now as a monkey patch. This lets us
still bump into the os.makedirs race, so this probably needs more work,
but it'll do in a pinch.
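
For illustration only, the idea can be sketched as a wrapper with a switch for the lock case. The ignore_exists parameter name is an assumption made for this example, not necessarily what the patch uses.

```python
import errno
import os
import tempfile


def makedirs(path, ignore_exists=True):
    """Wrapper around os.makedirs.

    ignore_exists is a hypothetical flag: when True, 'directory exists'
    is hidden (fine for tile directories); when False, it propagates
    (needed for locks).
    """
    try:
        os.makedirs(path)
    except OSError as e:
        if e.errno == errno.EEXIST and ignore_exists:
            return
        raise


def attempt_lock(lock_path):
    """Return True only if this call created the lock directory."""
    try:
        makedirs(lock_path, ignore_exists=False)
        return True
    except OSError as e:
        if e.errno == errno.EEXIST:
            return False
        raise


base = tempfile.mkdtemp()
lock_path = os.path.join(base, "layer", "00", "tile.lck")  # made-up layout

first = attempt_lock(lock_path)
second = attempt_lock(lock_path)

# Ordinary directory creation still tolerates an existing path.
makedirs(os.path.join(base, "layer", "00"))
```

The lock path still goes through os.makedirs here, so it keeps whatever race makedirs has when parents are created concurrently.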
This means that all locking has been broken since r258. I'll try to get
to a fix for this sooner rather than later. In the meantime, your fix
will work -- it changes the problem from *always* existing to having a
race condition, but the race window is pretty slim -- probably slimmer
than you'll hit much, practically speaking.
Thanks for digging into this. Definitely a big mistake on my part.
> My workaround: I added an "os.path.exists" before os.makedirs in
> Disk.py, to mimic the expected behavior of os.makedirs alone -- namely,
> that os.makedirs on an existing path should throw OSError. With this
> change, the locking appears to work correctly and only the first request
> for a subtile of a metatile will actually hit the backend WMS.