[Liblas-devel] Specification for point dimensionality

Howard Butler hobu.inc at gmail.com
Thu Feb 25 11:10:39 EST 2010


All,

libLAS was recently updated to allow a user to add extra data to each point.  I think this is technically within the bounds of the spec, and a few files in the wild already do this (though none of their owners have let me put them in the public samples repository).  Essentially, the point format, which is specified as 0...4 (or 0...5 for 1.3, but we're not doing that right now), has a fixed width in bytes and a fixed set of stored dimensions (X, Y, Z, R, G, B, T, I, etc).  LASPoint::{Get|Set}ExtraData lets you attach a byte array to the point beyond what the header's point format specifies.  For example, if the point format in the header were 0, the point would have a nominal width of 20 bytes and would contain X, Y, Z, and everything else in the base format type.  If a libLAS user set the point format to 0 and the data record length to 40 in the header, they would have 20 extra bytes per point in which to store anything they want using {Get|Set}ExtraData.
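
To make the arithmetic concrete, here is a rough Python sketch of how a reader might separate the base fields from the extra bytes in one raw point record.  It does not use the libLAS API; split_point_record and BASE_POINT_SIZE are names made up for this illustration, and only the base record sizes for formats 0 through 3 are filled in.

```python
from typing import Tuple

# Base record sizes in bytes for LAS point formats 0-3.
BASE_POINT_SIZE = {0: 20, 1: 28, 2: 26, 3: 34}


def split_point_record(record: bytes, point_format: int) -> Tuple[bytes, bytes]:
    """Split one raw point record into (base fields, extra bytes).

    Hypothetical helper for illustration only; not part of libLAS.
    """
    base_size = BASE_POINT_SIZE[point_format]
    if len(record) < base_size:
        raise ValueError("record is shorter than the base point format")
    return record[:base_size], record[base_size:]


# Point format 0 with a data record length of 40 leaves 20 extra bytes per point.
record = bytes(20) + bytes(range(20))
base, extra = split_point_record(record, point_format=0)
print(len(base), len(extra))
```

Whatever ends up in the extra bytes is opaque at this level, which is exactly the problem the rest of this message is about.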

This development raises the question of how to describe extra dimensionality.  The LAS specification is deficient in that it requires bytes to be reserved for mandatory items even when they hold no actual data.  This bloats the format, and a developer must scan through the data to determine statistics about it.  It would be nice to have header information that describes the dimensions that exist in the file, whether they are used or not, and what their sizes are.

The Oracle Point Cloud work that I am currently doing highlights this issue even more.  OPC allows you to store up to 12 dimensions on the point data (in aligned 8-byte BLOB form) but provides no regularized way to describe the dimensions used.  I would like to propose a liblas.org VLR that contains an XML document describing the dimensions, their sizes, whether they are used, etc.  libLAS (and Oracle Point Cloud) would then be updated to interpret this information, but an unaware reader should still be able to work without knowing how to interpret it.
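
As a sketch of what carrying XML in a VLR could look like, the Python below packs a payload behind the standard 54-byte VLR header (reserved, user ID, record ID, record length after header, description).  The user ID "liblas", the record ID 7, and the build_vlr helper are placeholders I made up for illustration; nothing here has been assigned or implemented.

```python
import struct

# VLR header layout: reserved (2), user ID (16), record ID (2),
# record length after header (2), description (32) = 54 bytes, little-endian.
VLR_HEADER = struct.Struct("<H16sHH32s")


def build_vlr(user_id: str, record_id: int, description: str, payload: bytes) -> bytes:
    """Return header + payload for one VLR (hypothetical helper)."""
    if len(payload) > 0xFFFF:
        raise ValueError("a VLR payload is limited to 65535 bytes")
    header = VLR_HEADER.pack(
        0,                            # reserved
        user_id.encode("ascii"),      # null-padded to 16 bytes by struct
        record_id,
        len(payload),
        description.encode("ascii"),  # null-padded to 32 bytes by struct
    )
    return header + payload


# Placeholder user ID and record ID; both are assumptions, not assigned values.
xml = b'<Dimension name="Classification" size="1" type="byte" position="5" />'
vlr = build_vlr("liblas", 7, "Point dimension schema", xml)
print(len(vlr) - VLR_HEADER.size)
```

A reader that does not recognize the user ID and record ID pair can skip the record using the length field, which is what lets unaware software keep working.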

I propose each entry in the file have the following attributes:

* Name
* Description
* Position
* Size (in units of Type)
* Type (bits, bytes)
* Data interpretation type (integer, double, float, etc)

Additionally, I think it should be possible to nest entries.  For example, we should have something like:

<Dimension name="Sensor Attributes" size="1" type="byte" position="4">
 <Dimension name="Return Number" size="3" type="bit" position="0" />
 <Dimension name="Number of Returns" size="3" type="bit" position="1" />
 <Dimension name="Scan Direction" size="1" type="bit" position="2" />
 <Dimension name="Edge of Flight Line" size="1" type="bit" position="3" />
</Dimension>
<Dimension name="Classification" size="1" type="byte" position="5" interpretation="uchar" />
...

All this would be properly namespaced XML (liblas.org or something), using whatever existing standards we can find for describing data types, etc.  Maybe there's already a standard for something like this; I don't know.
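
To show how a reader might consume the nested example above, here is a rough Python sketch using the standard library's xml.etree.ElementTree.  It rests on several assumptions that are not part of the proposal: the wrapping Schema root element is invented so the fragment parses, position is treated as an ordinal among siblings, and bit fields are packed starting from the least significant bit.

```python
import xml.etree.ElementTree as ET

# The <Schema> root is a placeholder so the fragment is a well-formed document.
SCHEMA = """
<Schema>
  <Dimension name="Sensor Attributes" size="1" type="byte" position="4">
    <Dimension name="Return Number" size="3" type="bit" position="0" />
    <Dimension name="Number of Returns" size="3" type="bit" position="1" />
    <Dimension name="Scan Direction" size="1" type="bit" position="2" />
    <Dimension name="Edge of Flight Line" size="1" type="bit" position="3" />
  </Dimension>
</Schema>
"""


def decode_bit_fields(parent, value):
    """Unpack the nested bit dimensions of `parent` from an integer value.

    Assumes children are ordered by `position` and packed LSB first.
    """
    fields = {}
    offset = 0
    children = sorted(parent.findall("Dimension"),
                      key=lambda dim: int(dim.get("position")))
    for dim in children:
        width = int(dim.get("size"))
        fields[dim.get("name")] = (value >> offset) & ((1 << width) - 1)
        offset += width
    return fields


root = ET.fromstring(SCHEMA)
sensor = root.find("Dimension")
# 0b01010001: return 1 of 2, scan direction flag set, not at a flight line edge.
fields = decode_bit_fields(sensor, 0b01010001)
print(fields)
```

If position were meant to be an explicit bit offset rather than an ordinal, the running offset in decode_bit_fields would be replaced by the attribute value; the proposal would need to pin that down.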

What do you think?

Howard


