<div dir="ltr"><br><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Aug 21, 2014 at 11:36 AM, Jeff Lacoste <span dir="ltr"><<a href="mailto:jefflacostegdal@gmail.com" target="_blank">jefflacostegdal@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr">Hi,<div><br></div><div>Improving the thread safety of GDAL is a big improvement. I know this proposal is not claiming to 'fix' GDAL thread safety, but to address at least the cache safety. That said, maybe to help</div>
<div>clarify the proposal further, we could state what the change would address and (maybe more important) would not address, just to be more specific about the gain of caching per dataset instead of a global cache. </div></div></blockquote>
<div><br></div><div> Updated the RFC to hopefully better reflect some of this; please send more questions as you have them. </div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div dir="ltr"><div></div>
<div>Would this mean, for example, that batch translating datasets would gain from this and could be done in parallel, since we can avoid thrashing a global cache?</div><div>So instead of translating x datasets sequentially as now, (with the proposed changes) this could be done in parallel?</div>
</div></blockquote><div><br></div><div>Currently it is possible to batch translate datasets, but it is not efficient. The reason is that there is a point where each thread attempts to add or remove GDALRasterBlocks from the global cache. Once the global cache size is reached, all the threads are often blocked for extended periods as each one attempts to clear the cache. Worse, this also prevents simple reading from the cache during this period, because other threads cannot efficiently use GDALRasterBand::TryGetLockedBlockRef to pull existing blocks from the cache.</div>
<div><br></div><div>The first and simplest fix I made was to make it possible to operate with a per-dataset cache (per-band was not selected due to possible deadlock issues). This gives each dataset its own lock for its own cache, and performance increases dramatically when operating on different datasets in parallel. </div>
<div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div dir="ltr">
<div><br></div><div>If yes, what would the cache flushing strategy be once the cache max is reached? For example, 5 running threads converting 5 datasets: we reach the max cache while the 5 threads are executing; does this mean that threads will be blocked from executing, since no cache is available until it has been released by other threads?</div>
<div></div></div></blockquote><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr">
<div>Jeff Lacoste</div></div></blockquote><div><br></div><div>In the case I described above, each dataset has its own max cache, so reaching the maximum can only happen within a single dataset. If the maximum was reached on that cache, it would flush without preventing other datasets and threads from operating on their own caches. However, this means that the person writing the code MUST be aware of the total cache sizes of all their datasets (I know this isn't ideal for many people). </div>
<div><br></div><div>So I decided to take it a step further, and wanted to make it possible for a dataset using a driver such as memory to support something like: 5 running threads translating 1 dataset, 5 times. I also wanted it to be possible to keep one global cache and have the 5 threads translating 5 datasets not lock as often against it. To achieve this, a separation of concerns had to occur, and this led to the development of multiple mutexes. </div>
<div><br></div><div>Three different data structures are at risk during threading within the cache:<br><br>#1 - The linked list of the cache and the size of that list (this allows the cache to flush the least recently used GDALRasterBlock)</div>
<div>#2 - The cache block array of a GDALRasterBand (this allows the GDALRasterBand to find its GDALRasterBlocks)</div><div>#3 - The data stored within the GDALRasterBlock (this is the actual pixel data held by the block)</div>
<div><br></div><div>If we limit the scope of threading support to allow only 1 thread to access any 1 dataset at a time, #3 requires no protection. However, in any cache shared by more than 1 dataset (a global cache), not only must the linked list be protected by a lock on the global cache, but you must also protect the cache block array in the raster bands. Since currently, when items are removed from the cache, it simply removes blocks until it is below the cache limit, no raster band can read from its cache block array until all flushing has completed. </div>
<div><br></div><div>Therefore, during my design I decided that each portion should be protected by its own mutex. To avoid the linked-list (LRU) mutex being locked for extended periods, I mark blocks for deletion when they are to be removed from the cache, and they are actually removed at the earliest safe point, which, unless the block is about to be used by another thread, is right away. Otherwise removal occurs once the other thread is done using that block.</div>
<div><br></div><div>This also means the mutex protecting the global cache in my code is locked less often. So even without using a per-dataset cache, the example of 5 threads translating 5 datasets should still be faster than the current configuration.</div>
<div><br></div><div>TL;DR: there are benefits in my design beyond just a non-global cache for datasets.</div><div><br></div><div>Blake Thompson</div></div><br></div></div>