[pdal] Array Dimensions

Chris Foster chris.foster at roames.com.au
Mon Jul 1 07:54:07 PDT 2013


On 1 July 2013 23:34, Howard Butler <hobu.inc at gmail.com> wrote:
>> Sorry to bombard the list with so much mail, I'll try to hold off a bit after
>> this one.
>
> Not a problem at all. Keep emailing as you explore and find dumb stuff :)

Excellent, I will keep looking around :)

>> After a bit of hacking on the weekend I'm fairly confident that pdal will work
>> for us, with the possible exception of one major feature which appears to be
>> missing from PointBuffer: Dimensions which hold short fixed size arrays rather
>> than single elements.
>
> I hadn't thought of this.
>
> Were you planning to use PDAL for something other than data translation?

I have two uses in mind.  One is to extract chunks of data from a custom point
database which we have at work.  We already have a filtering system but pdal
is much slicker, particularly the pipeline serialization side of things, which
is a real pleasure to behold :)  Ultimately I've been writing sophisticated
filter chains where some stages require building spatial data structures from
the points.  The filters in my current code aren't composable, however, and you
can't add custom dimensions to communicate between stages.  I'm hoping that
using the pdal API will solve those problems.  It seems to me that requiring
spatial context in a filter goes a bit beyond what pdal was designed for, but
given you do everything in chunks I'm hoping it's possible without too much
effort.

The other use is as a reader for various point cloud formats (primarily .las
and .laz) for my point cloud viewer displaz (https://github.com/c42f/displaz).
I integrated initial (very basic) support for pdal on the weekend.  Displaz is
meant to be a Swiss Army knife for geospatial point cloud visualization, so it
needs the ability to read and display arbitrary point dimensions, and good
metadata support.  Currently I have a quick and dirty PointArray to hold the
data in memory, but I'd like to replace this with pdal's PointBuffer if
possible.  I really want flexible visualization of arbitrary point dimensions
whose names and types are not known at compile time.

>> The problem is that sometimes a Dimension logically contains more than one
>> element, and it doesn't make sense to name the elements independently.  For
>> example, suppose I want to write a filter which computes local geometric
>> properties using the eigenstructure of the local covariance matrix.  The
>> output from this is naturally represented as a 3-vector of eigenvalues and a
>> 3x3 matrix of eigenvectors per point.  Now, I could name each of these using
>> their own Dimension, but that introduces unnecessary overhead and the
>> resulting components don't make logical sense on their own.
>>
>
> You could kind of do this now with a dimension type
> pdal::dimension::UnsignedByte and an explicit size. Nothing useful other
> than yourself would be able to interpret it though.

Right.  Question: what's the distinction between UnsignedByte and
UnsignedInteger?  It's confusing to me to have both of these.

>> I'm certainly willing to write code to make this work, but I'm not sure how
>> large a job it would be, so I'd need some guidance about whether it's a
>> desired feature before I start anything.  If not, I may have to implement a
>> point buffer class of my own, but I like to avoid reinventing wheels where
>> possible.
>
> Thinking out loud, we'd need:
>
> * pdal::dimension::ArrayType

Arrayness seems orthogonal to the element type to me.  In the past I've
flagged arrays simply by having element count > 1.
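
A minimal sketch of that idea, assuming a hypothetical DimensionDesc struct
(this is not the actual pdal::Dimension class, and every name in it is invented
for illustration).  The element type stays a plain scalar tag, and a dimension
counts as an array only because its element count is greater than one:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical element type tag; deliberately has no separate "array" value.
enum class ElementType { SignedInteger, UnsignedInteger, Float };

// Hypothetical dimension description, NOT the real pdal::Dimension.
struct DimensionDesc {
    std::string name;
    ElementType type;
    std::size_t elementSize;   // bytes per element
    std::size_t elementCount;  // 1 for an ordinary scalar dimension

    DimensionDesc(std::string n, ElementType t,
                  std::size_t elemSize, std::size_t count = 1)
        : name(std::move(n)), type(t),
          elementSize(elemSize), elementCount(count)
    {
        assert(elemSize > 0 && count > 0);
    }

    // Arrayness is derived from the count rather than stored as a type.
    bool isArray() const { return elementCount > 1; }

    // Total bytes the dimension occupies in one point.
    std::size_t byteSize() const { return elementSize * elementCount; }
};
```

With this layout a 3x3 matrix of 4-byte floats is just a Float dimension with
an element count of 9 and a byte size of 36, and existing scalar dimensions
keep working with the default count of 1.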

> * pdal::Dimension::getElementCount() (or appropriate name)

Sounds good.

> * pdal::Dimension::getByteSize() needs to be made dynamic when the dimension type is array

I assume this doesn't come with a performance hit?  I'm just wondering if
there's code somewhere which habitually checks the byte size in a tight loop,
expecting it to be a simple memory access.

> * getField/setField *should* just work as-is.

Yes, I think so, provided it's reinterpreted as a boost::array or some
equivalent aggregate type.  We probably need a special version which returns a
pointer to the first element of the array for those cases where the element
count isn't known at compile time.  Or is there a better option?
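
The two access styles can be sketched as follows.  This is an illustration
only: RawPointBuffer, getField, setField and fieldBytes here are hypothetical
stand-ins with a simplified byte-offset interface, not the PointBuffer API, and
std::array is used in place of boost::array as an equivalent aggregate.  The
templated accessors cover the case where the element count is known at compile
time; fieldBytes returns a pointer to the first byte of the field for the case
where it is only known at run time.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical, simplified buffer: numPoints records of pointStride bytes.
class RawPointBuffer {
public:
    RawPointBuffer(std::size_t numPoints, std::size_t pointStride)
        : m_numPoints(numPoints), m_stride(pointStride),
          m_data(numPoints * pointStride) {}

    // Compile-time-sized access: T may be a scalar or std::array<Elem, N>.
    template <typename T>
    T getField(std::size_t pointIndex, std::size_t byteOffset) const {
        static_assert(std::is_trivially_copyable<T>::value,
                      "field type must be trivially copyable");
        assert(pointIndex < m_numPoints);
        assert(byteOffset + sizeof(T) <= m_stride);
        T value;
        // memcpy avoids alignment and aliasing problems with packed records.
        std::memcpy(&value, fieldBytes(pointIndex, byteOffset), sizeof(T));
        return value;
    }

    template <typename T>
    void setField(std::size_t pointIndex, std::size_t byteOffset,
                  const T& value) {
        static_assert(std::is_trivially_copyable<T>::value,
                      "field type must be trivially copyable");
        assert(pointIndex < m_numPoints);
        assert(byteOffset + sizeof(T) <= m_stride);
        std::memcpy(fieldBytes(pointIndex, byteOffset), &value, sizeof(T));
    }

    // Run-time-sized access: pointer to the first byte of the field.
    // The caller supplies the element size and count from the schema.
    const unsigned char* fieldBytes(std::size_t pointIndex,
                                    std::size_t byteOffset) const {
        return m_data.data() + pointIndex * m_stride + byteOffset;
    }
    unsigned char* fieldBytes(std::size_t pointIndex,
                              std::size_t byteOffset) {
        return m_data.data() + pointIndex * m_stride + byteOffset;
    }

private:
    std::size_t m_numPoints;
    std::size_t m_stride;
    std::vector<unsigned char> m_data;
};
```

Returning a byte pointer rather than a typed pointer is a deliberate choice in
this sketch: fields in a packed point record are not guaranteed to be aligned
for their element type, so the caller copies elements out with memcpy.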

Another more wacky thought which crosses my mind is that it would be helpful
to have aliases (e.g., "Red", "Blue" and "Green" dimensions alias fields of the
"Color" dimension, which is a three-element array per point).  This would be
rather handy for me since I actually don't care to read the color components
separately, but it might be too hard to get right.
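
One way the alias idea could work, sketched under assumptions: an alias is just
a (parent dimension, element index) pair that resolves to a byte offset inside
the parent array.  Schema, ArrayDim and Alias are hypothetical names made up
for this example, and the order of the components within "Color" is also an
assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical layout of an array dimension within one point record.
struct ArrayDim {
    std::size_t byteOffset;    // offset of the first element in the point
    std::size_t elementSize;   // bytes per element
    std::size_t elementCount;  // number of elements per point
};

// Hypothetical alias: a name for a single element of a parent dimension.
struct Alias {
    std::string parent;
    std::size_t elementIndex;
};

class Schema {
public:
    void addDimension(const std::string& name, const ArrayDim& dim) {
        m_dims[name] = dim;
    }

    void addAlias(const std::string& name, const std::string& parent,
                  std::size_t elementIndex) {
        std::map<std::string, ArrayDim>::const_iterator it = m_dims.find(parent);
        assert(it != m_dims.end());
        assert(elementIndex < it->second.elementCount);
        m_aliases[name] = Alias{parent, elementIndex};
    }

    // Resolves a dimension or alias name to a byte offset within a point.
    // Returns false and leaves offset untouched if the name is unknown.
    bool resolveOffset(const std::string& name, std::size_t& offset) const {
        std::map<std::string, ArrayDim>::const_iterator d = m_dims.find(name);
        if (d != m_dims.end()) {
            offset = d->second.byteOffset;
            return true;
        }
        std::map<std::string, Alias>::const_iterator a = m_aliases.find(name);
        if (a == m_aliases.end())
            return false;
        const ArrayDim& parent = m_dims.find(a->second.parent)->second;
        offset = parent.byteOffset + a->second.elementIndex * parent.elementSize;
        return true;
    }

private:
    std::map<std::string, ArrayDim> m_dims;
    std::map<std::string, Alias> m_aliases;
};
```

Because the alias resolves to the same bytes as the parent element, reading
"Green" and reading element 1 of "Color" touch the same storage; the sketch
does not address the harder questions such as removing a parent dimension that
still has aliases.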

> Please start a pull request with this effort, and I'll make sure to track
> its development. If you get stuck, don't hesitate to ask or catch me on IRC.

Great, I'll make a pull request once I have something concrete to share.

~Chris

