[Liblas-devel] Indexing for point clouds
Howard Butler
hobu.inc at gmail.com
Tue Apr 21 16:28:09 EDT 2009
I have been investigating and prototyping linking spatialindex with
libLAS, but I have a few questions of the practicality of this
approach. LiDAR point cloud data are typically very large (millions
to 100s of millions of points), contain attribute-like data such as
echo intensity, and are somewhat spatially sequential.
By spatially sequential, I mean that points near each other as they
are stored in the sequential LAS file format are frequently near each
other spatially (but not always). Many coders take advantage of this
property of the data by skipping data as they read it in -- naturally
thinning the volume for a small accuracy penalty. This is fine for
visualization, but in a warehousing scenario, tossing out data is a
big no-no. But we also want a warehouse that can be (somewhat) fast,
especially for spatial queries :)
A first naive attempt to use the DiskStorageManager to store an index
in parallel with a 32 million point (90mb) LAS file resulted in some
disappointing results. I used the default (70%) fill factor, 10 for
the index and leaf capacity, and star for the rtree variant. Nearly
4.5 hours of insertion later (I used a very small pagesize of 3, which
is the reason for the slow insert, so I could get a feel for the
minimum size required to make an index), the index of my 90mb, 32
million point LAS file turned into 517mb of .idx + and 190mb of .dat
file. This isn't going to be practical.
I need a plan B. Anyone have any ideas how to tackle this problem in
a way that balances space and time efficiency (though I'm not quite
sure what defines "balance" in this instance) for insert and query?
Should I start pursuing something like a KDB tree? Should I figure
out some sort of stripping/patching/tiling algorithm for the points to
lessen the indexing load (this tends to be the standard approach to
this problem)? Work out a different StorageManager approach that is a
clustered index of spatially clustered points?
Howard
PS sending to both lists because I hope to include spatial indexing
functionality in libLAS if I can find a reasonable approach, and if
any of the LiDAR folks have this one licked and would be willing to
share, I'd be happy to take it.
More information about the Liblas-devel
mailing list