[Liblas-devel] Indexing for point clouds

Howard Butler hobu.inc at gmail.com
Tue Apr 21 16:28:09 EDT 2009


I have been investigating and prototyping linking spatialindex with  
libLAS, but I have a few questions of the practicality of this  
approach.  LiDAR point cloud data are typically very large (millions  
to 100s of millions of points), contain attribute-like data such as  
echo intensity, and are somewhat spatially sequential.

By spatially sequential, I mean that points near each other as they  
are stored in the sequential LAS file format are frequently near each  
other spatially (but not always).  Many coders take advantage of this  
property of the data by skipping data as they read it in -- naturally  
thinning the volume for a small accuracy penalty.  This is fine for  
visualization, but in a warehousing scenario, tossing out data is a  
big no-no.  But we also want a warehouse that can be (somewhat) fast,  
especially for spatial queries :)

A first naive attempt to use the DiskStorageManager to store an index  
in parallel with a 32 million point (90mb) LAS file resulted in some  
disappointing results.  I used the default (70%) fill factor, 10 for  
the index and leaf capacity, and star for the rtree variant.  Nearly  
4.5 hours of insertion later (I used a very small pagesize of 3, which  
is the reason for the slow insert, so I could get a feel for the  
minimum size required to make an index), the index of my 90mb, 32  
million point LAS file turned into 517mb of .idx + and 190mb of .dat  
file.  This isn't going to be practical.

I need a plan B.  Anyone have any ideas how to tackle this problem in  
a way that balances space and time efficiency (though I'm not quite  
sure what defines "balance" in this instance) for insert and query?   
Should I start pursuing something like a KDB tree?  Should I figure  
out some sort of stripping/patching/tiling algorithm for the points to  
lessen the indexing load (this tends to be the standard approach to  
this problem)?  Work out a different StorageManager approach that is a  
clustered index of spatially clustered points?

Howard

PS sending to both lists because I hope to include spatial indexing  
functionality in libLAS if I can find a reasonable approach, and if  
any of the LiDAR folks have this one licked and would be willing to  
share, I'd be happy to take it.



More information about the Liblas-devel mailing list