[pdal] questions about iterator and buffer semantics

Howard Butler hobu.inc at gmail.com
Mon Jul 1 06:22:20 PDT 2013


On Jun 30, 2013, at 7:54 PM, Chris Foster <chris.foster at roames.com.au> wrote:

> On 29 June 2013 16:16, Chris Foster <chris.foster at roames.com.au> wrote:
>> I've submitted a pull request which fixes the problem
>> (https://github.com/PDAL/PDAL/pull/146)

I want to note that the mosaic filter has not had much exercise, and there are very likely situations where it doesn't behave quite how we'd want. PDAL is still very much a work in progress, and there are plenty of pointy bits all about. I welcome any help you wish to contribute to file those pointy bits down into something less dangerous. While we/I try to make PDAL as generic as possible, my group uses it for some very specific translation operations, and these are going to be better optimized and understood than many of the other possible combinations.

That said, I would welcome any and all support you might be able to give. Feel free to break stuff (a little bit), and question things. The answer might simply be because I got it to work and didn't think hard enough about it after that :)

Thanks to Mateusz, we now have Travis builds for PDAL (https://travis-ci.org/PDAL/PDAL). We had a Jenkins instance before, but this is better integrated and handles pull requests.


> On further testing, I've found that disabling the dimension caching between
> calls to read() does have a measurable performance impact in some cases.  When
> reading an uncompressed las file in small chunks (buffer size of 1000) the
> patch above causes a 50% performance degradation or so.

One thing I want to implement is a mechanism for a Stage to advertise whether it changes the PointBuffer's schema. This would allow pipelines that are all simple reads to cache their dimension positions once.
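
Roughly what I have in mind, as a sketch only. None of this exists in PDAL today: the Stage class here is a stand-in rather than the real one, and modifiesSchema() and canCacheDimensionPositions() are hypothetical names made up for illustration.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Stand-in for a pipeline stage; NOT PDAL's actual Stage class.
class Stage {
public:
    virtual ~Stage() = default;
    // Hypothetical hook: returns true if this stage may add, remove, or
    // reorder dimensions in the buffer's schema between calls to read().
    virtual bool modifiesSchema() const { return false; }
};

// A plain reader: the schema is fixed once the file is opened.
class SimpleReader : public Stage {};

// A filter that rewrites the schema, so it overrides the hook.
class SchemaChangingFilter : public Stage {
public:
    bool modifiesSchema() const override { return true; }
};

// Cached dimension positions can be reused across read() calls only if
// no stage in the pipeline reports that it changes the schema.
bool canCacheDimensionPositions(
    const std::vector<std::unique_ptr<Stage>>& pipeline) {
    for (const auto& stage : pipeline) {
        assert(stage != nullptr);
        if (stage->modifiesSchema()) {
            return false;
        }
    }
    return true;
}
```

An iterator could ask this once when it is constructed and only fall back to re-resolving dimensions on every read() when the answer is false.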


> This is a bit surprising as it indicates the cost of dimension lookup in the
> schema is rather large, unless there's something else I'm missing.  I'll have
> to do some additional performance testing to try to get to the bottom of this.

The cost of dimension lookup per-point would be prohibitive. It is essentially a std::map (actually a boost::multi_index container) traversal to find a dimension.
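
To illustrate the difference, here is a toy comparison. It assumes a plain std::map keyed by dimension name as a simplified stand-in for the real schema container; ToySchema, ToyPoint, and both functions are hypothetical names, not PDAL API.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy schema: maps a dimension name to its column index in a point record.
// A simplified stand-in for the real schema container.
using ToySchema = std::map<std::string, std::size_t>;
using ToyPoint = std::vector<double>;

// Slow path: the dimension is looked up by name for every point, so each
// point pays for a tree traversal with string comparisons.
double sumPerPointLookup(const ToySchema& schema,
                         const std::vector<ToyPoint>& points,
                         const std::string& name) {
    double sum = 0.0;
    for (const auto& p : points) {
        sum += p.at(schema.at(name));
    }
    return sum;
}

// Fast path: the dimension is resolved once, then each point is indexed
// directly by the cached position.
double sumCachedLookup(const ToySchema& schema,
                       const std::vector<ToyPoint>& points,
                       const std::string& name) {
    const std::size_t pos = schema.at(name);  // single lookup
    double sum = 0.0;
    for (const auto& p : points) {
        assert(pos < p.size());
        sum += p[pos];
    }
    return sum;
}
```

Both functions return the same result; the second is only valid as long as the schema does not change between the lookup and the loop, which is why the caching question matters.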


