[pgpointcloud] Columns and patch schema matching

Mon Dec 16 07:41:03 PST 2013

I am loading a bunch of files, and one of the errors I frequently see but don't completely understand is the following:

> Caught PDAL exception: ERROR:  column pcid (1) and patch pcid (2) are not consistent

I take it this means that the schema of the patch I'm currently loading (from file B) doesn't match the schema of the column I'm inserting data into (created by file A, I suppose)?  So the question becomes: how do I harmonize my directory of 35,000 files into a single schema that pgpointcloud can love? I see the writer has the ability to override the schema with a "pcid" option, but that points the gun at your own toes.
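To make my reading of that error concrete, here is a minimal Python sketch of the check I assume is happening. This is not pgpointcloud's code; `check_patch` and `PcidMismatch` are hypothetical names used only for illustration, and the only thing taken from the real system is the wording of the message.

```python
class PcidMismatch(ValueError):
    """Hypothetical error type standing in for the database error."""


def check_patch(column_pcid, patch_pcid):
    # Assumption: the column is tied to one schema id (pcid), and every
    # patch inserted into it must carry that same id.
    if column_pcid != patch_pcid:
        raise PcidMismatch(
            f"column pcid ({column_pcid}) and patch pcid ({patch_pcid}) "
            "are not consistent"
        )


# File A created the column with pcid 1, so its own patches pass.
check_patch(1, 1)

# File B produced a different schema (pcid 2), so its patches are rejected.
try:
    check_patch(1, 2)
except PcidMismatch as err:
    message = str(err)
```

If that reading is right, forcing the "pcid" option on the writer only silences the check; it does not make the bytes in the patch match the schema the column expects.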

In PDAL, there is a filter called "filters.selector"[1] which can mark which dimensions in the pipeline's schema should be kept, ignored, or created. It should be the responsibility of the "drivers.pgpointcloud.writer" to pack() both the schema and the data it is writing, so that ignored dimensions are removed before anything reaches the database. This is the only way I can think of to harmonize the layout of all 35,000 files into the same schema. To that end, I have added some code to PDAL to support this, and updated the pgpointcloud writer to use it.
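As a rough illustration of what packing means here, the following Python sketch drops ignored dimensions from both a schema and the raw point bytes. It is not the PDAL implementation: `Dim`, `pack_schema`, and `pack_points` are hypothetical names, the dimension list is made up, and the byte layout (little-endian, no padding) is an assumption chosen to keep the example small.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class Dim:
    name: str
    fmt: str               # struct format character, e.g. "d" for float64
    ignored: bool = False  # what a selector-style filter would mark


def pack_schema(dims):
    """Return only the dimensions that are not marked as ignored."""
    return [d for d in dims if not d.ignored]


def pack_points(dims, raw):
    """Re-encode raw point bytes so that ignored dimensions are dropped."""
    src = struct.Struct("<" + "".join(d.fmt for d in dims))
    kept = [i for i, d in enumerate(dims) if not d.ignored]
    dst = struct.Struct("<" + "".join(dims[i].fmt for i in kept))
    out = bytearray()
    for values in src.iter_unpack(raw):
        out += dst.pack(*(values[i] for i in kept))
    return bytes(out)


# Hypothetical file B: same layout as file A plus an extra GpsTime
# dimension, which has been marked as ignored.
file_b = [
    Dim("X", "d"),
    Dim("Y", "d"),
    Dim("Z", "d"),
    Dim("Intensity", "H"),
    Dim("GpsTime", "d", ignored=True),
]
raw_b = struct.pack("<dddHd", 1.0, 2.0, 3.0, 7, 99.5)

packed_schema = pack_schema(file_b)
packed_bytes = pack_points(file_b, raw_b)
```

After packing, file B's schema and point layout contain only X, Y, Z, and Intensity, so every file that is filtered down to the same dimension list ends up with the same layout regardless of what else it carried.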

In Oracle's point cloud storage, each patch has a reference to its own schema rather than the entire table pointing at a single schema (often a bunch of patches point at the same one, but that's an implementation detail). This is a difference that should be pointed out more clearly in documentation and examples. I will come up with an example that uses the selector filter and provide a pull request to incorporate it, with detail on why you should care.

Howard

[1] http://www.pointcloud.org/stages/filters.selector.html