[postgis-users] Massive Lidar Dataset Datatype Suggestions?

collin collin at socrates.berkeley.edu
Sat Nov 13 12:34:20 PST 2004


Greetings all,

I asked some questions about massive point datasets in April, before the 
implementation of LWGEOM. Then I fell off the face of the earth for a 
while.  Now I'm back working on this issue again and would love some input.

I have a LIDAR dataset of 473 million points with 2 point geometries 
(first & last returns), timestamp and 2 laser intensity returns (integer).

I am trying to figure out the best setup for storing, extracting and 
processing this dataset.  BTW, it is a smallish dataset. We will be 
processing 2 billion+ point projects in the near future.

I have compiled & installed PostgreSQL 8.0 beta 4, with the PostGIS 0.9 
release, GEOS 2.0.1, and PROJ 4.  This is on a Fedora Core 2, SMP 733 MHz, 
1 GB RAM, 160 GB HDD IntelliStation.

I am using HWGEOM with WKT right now, and managed to create a table with 
oid & the_geom for one point per return.
The upload took 48 hours and the table is roughly 85 GB.
The GiST indexing took ~80 hours and the index is ~35 GB.
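
For a rough sense of scale, the figures above can be turned into per-row numbers. This is only a back-of-envelope sketch: it assumes one row per record (473 million rows) and decimal gigabytes, and the function name per_row_stats is made up for illustration. If the table really holds one row per return, the row count doubles and the per-row figures halve.

```python
def per_row_stats(rows, table_gb, index_gb, load_hours, index_hours):
    """Back-of-envelope per-row storage and throughput figures."""
    gb = 1e9  # assumption: decimal gigabytes
    return {
        "table_bytes_per_row": table_gb * gb / rows,
        "index_bytes_per_row": index_gb * gb / rows,
        "load_rows_per_sec": rows / (load_hours * 3600),
        "index_rows_per_sec": rows / (index_hours * 3600),
    }


# Figures from this post; the row count is an assumption (one row per record).
stats = per_row_stats(rows=473_000_000, table_gb=85, index_gb=35,
                      load_hours=48, index_hours=80)
for name, value in stats.items():
    print(f"{name}: {value:,.1f}")
```

Under those assumptions the table works out to well over a hundred bytes per point, which is the overhead I would like to bring down.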

This is obviously non-optimal considering we now have WKB and LWGEOM to 
play with.  I couldn't get LWGEOM to install properly from the cvs 
extract, which is why I reverted to the 0.9 version.
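
To illustrate why WKT looks wasteful to me, here is a minimal sketch comparing the two encodings for a single 2D point. It packs the standard OGC WKB layout (byte-order flag, geometry type 1, two doubles) with Python's struct module. The coordinates and helper names are hypothetical, it ignores elevation, and it says nothing about how PostGIS actually lays out geometries on disk.

```python
import struct


def point_wkb(x, y):
    """Encode a 2D point as little-endian OGC WKB (byte order, type 1, x, y)."""
    return struct.pack("<BIdd", 1, 1, x, y)


def point_wkt(x, y):
    """Encode the same point as WKT text."""
    return f"POINT({x!r} {y!r})"


# Hypothetical coordinates, chosen only to show typical digit counts.
x, y = 565432.125, 4412345.875
wkb = point_wkb(x, y)
wkt = point_wkt(x, y)
print(len(wkb), "bytes of WKB")
print(len(wkt), "characters of WKT:", wkt)
```

The binary form has a fixed size per point, while the text form grows with the number of digits in each coordinate.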

So, any suggestions on how to get the full 9-column dataset uploaded 
with a more efficient data structure?  (Note: the current machine is just a 
test machine. Production will have a LOT more drive space.)

Also, I intend to perform a fair amount of point processing inside the 
database using either PL/pgSQL or a Java API.  Is this a bad idea?

Thanks for any input.

________________
Collin Bode
GIS Informatics Researcher
Power Lab, Integrative Biology
University of California, Berkeley


