[GRASS-dev] vector libs: file based spatial index
Markus GRASS
markus.metz.giswork at googlemail.com
Wed Jun 24 17:05:58 EDT 2009
Paul Kelly wrote:
> On Tue, 23 Jun 2009, Markus GRASS wrote:
>
>> My implementation is completely file based, also when creating or
>> updating a vector. This comes obviously with a speed penalty because
>> reading in memory is faster than reading from file. With all sorts of
>> tricks and relying on the system to cache files, the file based spatial
>
> Hi Markus, The last point above sounds interesting. What sort of tricks?
When inserting or deleting or searching, the search patch is remembered
(kept in memory) and walked back by using a stack in a non-recursive
functions. This reduces file IO operations about by half and should be a
faster algorithm than the recursive alternative. As an example,
inserting a new item into an RTree that already has about 50,000 items
requires a total of 3 file reads and 3 - 5 file writes, each file IO
reads or writes 528 bytes on a 32 bit system and 586 bytes on a 64 bit
system. And when using intermediate files, the whole chunk of 528/568
bytes is read or written at once.
> Are you using memory-mapped files?
What are memory-mapped files? Excuse my ignorance, I'm just a
self-trained coder (learning by doing).
> My thinking would be to always read it from the file, and rely on the
> operating system as best as possible to not slow down performance too
> much (compared to holding the index in memory) for small sizes of
> indexes. I think that would be a seamless foolproof (from the user's
> point of view) approach that would scale very well with both
> increasing size of datasets and performance of computer hardware.
It would also require less complex code, it's already quite a bit more
complex by now.
Markus M
More information about the grass-dev
mailing list