[GRASS-dev] vector libs: file based spatial index

Hamish hamish_b at yahoo.com
Wed Jun 24 23:36:24 EDT 2009


Moritz wrote:
> The largest file I have used is about 125000 areas with a
> topo file weighing 42M, so taking your worst estimation,
> this would mean around 200MB of spatial index, which is
> still largely acceptable for me.

Lidar and swath bathymetry data will easily have millions of points,
and as time goes on this will only expand. I seem to recall that one of
Radim's big disappointments was that the need to handle this
technology/data density only became apparent just as GRASS's new vector
engine was nearing completion. With some earlier notice it could have
been designed to scale better. Still, there is much tuning that can
be done with the present model to reduce the memory overheads, etc.

FWIW the sites type (now vector points) in GRASS 4/5 scales well: it
holds as much as you can fit in the text file. (Not sure if the fseeks
are 64-bit-proof there; probably not.)

The biggest lidar file in use that I know about is Doug's 379GB dataset
(14.5 billion points). The vector engine couldn't handle that*, so
r.in.xyz was used. Certainly count on 5 million features with topology
and a DB table for dataset sizes /today/.

* I don't know what the limitation is if the data are imported without
topology + DB table.
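
To illustrate why binning to a raster copes with point counts that a
topological vector import cannot, here is a minimal sketch of the
general idea. It is not r.in.xyz's actual code: the function name, the
"x y z" text layout, and the region parameters are all assumptions made
for the example. Points are streamed one at a time and only per-cell
running sums are kept, so memory depends on the grid size rather than
on the number of points.

```python
import math


def bin_points(lines, west, south, east, north, res):
    """Stream "x y z" text lines into a grid of per-cell mean z values.

    Hypothetical helper for illustration only. Memory use is set by the
    grid dimensions, not by how many points are read.
    """
    cols = math.ceil((east - west) / res)
    rows = math.ceil((north - south) / res)
    total = [[0.0] * cols for _ in range(rows)]
    count = [[0] * cols for _ in range(rows)]

    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        x, y, z = (float(p) for p in parts[:3])
        if not (west <= x < east and south < y <= north):
            continue  # point falls outside the region
        # Row 0 is the northern edge; clamp to guard against rounding.
        col = min(int((x - west) / res), cols - 1)
        row = min(int((north - y) / res), rows - 1)
        total[row][col] += z
        count[row][col] += 1

    # Cells that received no points are reported as None.
    return [
        [total[r][c] / count[r][c] if count[r][c] else None
         for c in range(cols)]
        for r in range(rows)
    ]
```

Because `lines` can be any iterable, an open file object can be passed
directly and the input is never held in memory as a whole.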


In future, memory, CPU, and HD sizes will only increase, but one thing I've
come to respect is that GRASS's raster modules scale so well today because
they were designed to function in the days of extremely tight memory and
CPU constraints.


You might look at libLAS (for lidar data -- an OSGeo semi-affiliated
project):   http://liblas.org/   It is my understanding that Howard is
currently adding spatial index support in the development version.
You might check out his approach.

I have been, and still am, ignorant of what advantage a spatial index
gives you for point data. I'm also interested to learn why "topology"
would be useful for points-only data.
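
One commonly cited advantage, sketched here under stated assumptions, is
that a bounding-box query only has to look at points near the box
instead of scanning every point. The example below uses a uniform grid
of buckets, chosen for brevity; it is not the structure the GRASS vector
library uses, and the class and method names are made up for
illustration.

```python
import math
from collections import defaultdict


class GridIndex:
    """Toy spatial index: points bucketed by fixed-size square cells."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)

    def _key(self, x, y):
        # Integer cell coordinates of the cell containing (x, y).
        return (math.floor(x / self.cell_size),
                math.floor(y / self.cell_size))

    def insert(self, point_id, x, y):
        self.cells[self._key(x, y)].append((point_id, x, y))

    def query(self, xmin, ymin, xmax, ymax):
        """Return ids of points inside the closed box.

        Only cells overlapping the box are visited, so the cost depends
        on the box size and local density, not on the total point count.
        """
        cx0, cy0 = self._key(xmin, ymin)
        cx1, cy1 = self._key(xmax, ymax)
        hits = []
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                for pid, x, y in self.cells.get((cx, cy), ()):
                    if xmin <= x <= xmax and ymin <= y <= ymax:
                        hits.append(pid)
        return hits
```

Without any index, the same query has to test every point in the file,
which is fine for a one-off pass but costly when a module repeats such
queries many times over millions of points.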


In general I'm fairly happy with the no-topology solution for lidar
data in GRASS, but a few targeted modules (e.g. v.info) really need to
be modified to deal with it.



Hamish


PS: we still need to hunt through the archives for Radim's posts on these
issues, which explain quite a bit.



      


