[pdal] Change the pdal container

Helimap Postmaster postmaster at helimap.ch
Wed Mar 16 08:12:47 PDT 2016

Comments inline:

On Wed, Mar 16, 2016 at 3:14 PM, Howard Butler <howard at hobu.co> wrote:

> > On Mar 16, 2016, at 7:16 AM, Helimap Postmaster <postmaster at helimap.ch> wrote:
> >
> > Hello Andrew,
> >
> > Thanks for digging in the issue! Unfortunately, I am not familiar yet
> > with the 'views' and 'tables' terminology... What I need after the read of
> > the las is a std::vector<float3> with the x,y,z coordinates of the las'
> > points.
>
> It's not very clear what you are actually asking for. Do you just want all
> of the XYZ data of an LAS file to be in a std::vector<float> after reading?
> If so, just read the data up and then copy it from the PointView using
> getFieldAs<float> and be on your way.

I want a single large std::vector<float3> holding the xyz coordinates of
the las (three floats, one per component), or a large
std::vector<*SomeLasPointThing*>. Copying millions of points again after
they have been read seems expensive; I'd rather store them in the right
layout directly, or at least std::move them.
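
To make the "fill once, then move" idea concrete, here is a minimal sketch.
It deliberately does not call any PDAL API: the reader is hidden behind a
hypothetical callback (PointSource), and Float3, collectXyz and Cloud are
illustrative names only. The points are written into the vector once, and
ownership is then handed over with std::move, so the point data itself is
not copied a second time.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical packed point type (not a PDAL type): three floats per point.
// Note: narrowing to float loses precision for large projected coordinates.
struct Float3 {
    float x, y, z;
};

// Hypothetical reader callback: writes the coordinates of point i.
using PointSource =
    std::function<void(std::size_t, double&, double&, double&)>;

// Fill the vector in a single pass. reserve() avoids reallocations while
// the points are appended.
std::vector<Float3> collectXyz(std::size_t count, const PointSource& source) {
    assert(static_cast<bool>(source));  // the callback must be callable
    std::vector<Float3> out;
    out.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        double x = 0.0, y = 0.0, z = 0.0;
        source(i, x, y, z);
        out.push_back({static_cast<float>(x), static_cast<float>(y),
                       static_cast<float>(z)});
    }
    return out;  // returned by move or elided, never deep-copied
}

// Hypothetical consumer that takes ownership of the buffer.
struct Cloud {
    std::vector<Float3> points;
    explicit Cloud(std::vector<Float3>&& pts) : points(std::move(pts)) {}
};
```

Moving the vector into Cloud only transfers the internal buffer pointer, so
the cost is independent of the number of points.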

> > I think there has been a little confusion with that. My las file is
> > millions of points that I need to colorize (in parallel). At the moment I
> > have it working for ascii xyzc to xyzrgb. I want to add las support, and
> > therefore I have to read the las cloud, colorize each pixel and save it.
>
> filters.colorization does exactly this. See
> http://www.pdal.io/stages/filters.colorization.html Running it over 20
> million points should only be seconds. It is actually most sensitive to
> GDAL's raster cache settings more than anything.

It's not a 1:1 colorization. I didn't go into detail: I am projecting each
point into the images using the images' intrinsic and extrinsic parameters,
and the images are geolocated. Thus, there is some computation involved. My
educated guess is that the filter does not do that, given that the camera
parameters are not part of its input.
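
As an illustration of the kind of computation meant here, the sketch below
projects a world point into an image with a plain pinhole model. This is an
assumption for illustration only: the thread does not state the actual
camera model, and lens distortion is ignored. Camera, Vec3, Pixel and
project are hypothetical names; R and t stand for the extrinsics
(world-to-camera), and fx, fy, cx, cy for the intrinsics.

```cpp
#include <array>

struct Vec3 { double x, y, z; };
struct Pixel { double u, v; };

// Hypothetical camera description (assumed pinhole model, no distortion).
struct Camera {
    std::array<double, 9> R;  // row-major rotation, world to camera
    Vec3 t;                   // translation: Xc = R * Xw + t
    double fx, fy;            // focal lengths in pixels
    double cx, cy;            // principal point in pixels
    int width, height;        // image size in pixels
};

// Returns true and fills `out` if the point lies in front of the camera
// and projects inside the image; otherwise returns false and leaves `out`
// untouched.
bool project(const Camera& cam, const Vec3& p, Pixel& out) {
    const std::array<double, 9>& R = cam.R;
    const double xc = R[0] * p.x + R[1] * p.y + R[2] * p.z + cam.t.x;
    const double yc = R[3] * p.x + R[4] * p.y + R[5] * p.z + cam.t.y;
    const double zc = R[6] * p.x + R[7] * p.y + R[8] * p.z + cam.t.z;
    if (zc <= 0.0) return false;  // behind the camera

    const double u = cam.fx * xc / zc + cam.cx;
    const double v = cam.fy * yc / zc + cam.cy;
    if (u < 0.0 || u >= cam.width || v < 0.0 || v >= cam.height) return false;

    out.u = u;
    out.v = v;
    return true;
}
```

A point would be colored from the first image for which project() succeeds;
how the candidate images are chosen is outside this sketch.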

> I would also note that I think your approach to parallelization is The
> Hard Way. LiDAR and point cloud workflows often start with chopping things
> up into tiles, virtual or real, and then processing those pieces
> individually. The data are frequently at rest ready to go for this
> scenario. While not always ideal for every processing task (how do you
> interpolate over edges?), it's a tried-and-true approach. PDAL right now
> expects that you are doing your fan-out at the process level (run over a
> bunch of tiles) rather than the intra-process one (run parallel over a
> single tile).

Is the tiling really that relevant here? Since I am colorizing a point
cloud and each point is independent, 'tiling' could also mean grabbing N
points per iteration. There is no blending necessary: take one point, find
the appropriate image, reproject to the image, get the pixel color, save
point+color. I'd like buffered reading for the I/O efficiency.
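
The "N points per iteration" idea can be sketched as follows. ChunkRange,
makeChunks and forEachInChunk are hypothetical names, and the sketch only
covers the partitioning; it runs sequentially. Because the points are
independent, each chunk could be handed to a separate worker without any
coordination between chunks.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Half-open index range [begin, end) into the point array.
struct ChunkRange {
    std::size_t begin;
    std::size_t end;
};

// Split `total` points into consecutive chunks of at most `chunkSize`.
std::vector<ChunkRange> makeChunks(std::size_t total, std::size_t chunkSize) {
    assert(chunkSize > 0);
    std::vector<ChunkRange> chunks;
    chunks.reserve((total + chunkSize - 1) / chunkSize);
    for (std::size_t b = 0; b < total; b += chunkSize) {
        chunks.push_back({b, std::min(b + chunkSize, total)});
    }
    return chunks;
}

// Apply a per-point function to every index of one chunk. `fn` stands in
// for "find image, reproject, get pixel color, save point+color".
template <typename Fn>
void forEachInChunk(const ChunkRange& range, Fn fn) {
    for (std::size_t i = range.begin; i < range.end; ++i) {
        fn(i);
    }
}
```

With buffered reading, a chunk would correspond to one buffer fill: read N
points, process them, write them out, and move on to the next range.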

I hadn't brought tiling into the mix because I didn't deem it relevant at
the time. But that's not really important because the las clouds usually fit
in memory, the only possible problem being the contiguous memory allocation.

I can add that in another application of ours we do tile. Good to know
that the usual way is to parallelize per tile rather than per pixel. My
reason for doing it The Hard Way was related to the number of points and
images loaded into memory. First, fewer cloud points are loaded at once (at
least the rgb cloud can be kept at tile size), and second, if I process
per pixel, fewer images will be loaded concurrently, since threads can share
images while coloring. With 15 GB of RAM, no more than 30 images should be

In this second application's context, loading and unloading images takes as
much time as coloring the clouds[1], which is why I was biased towards
keeping the processed points close to each other. But taking your comment
into consideration, it might work just as well parallelizing per chunk... I
would have to think about it, but we might be drifting into another
discussion?
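
One way to picture the locality argument is to group point indices by the
image that will color them, so that each image is loaded once per group
instead of being loaded and unloaded repeatedly. This is only a sketch under
assumptions: groupByImage is a hypothetical helper, and it presumes that an
image index per point (or -1 for "no image") has already been determined by
some earlier step.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// imageOfPoint[i] is the (hypothetical) index of the image chosen for
// point i, or a negative value if no image covers that point.
// Returns, for each image index, the list of point indices it colors.
std::map<int, std::vector<std::size_t>>
groupByImage(const std::vector<int>& imageOfPoint) {
    std::map<int, std::vector<std::size_t>> groups;
    for (std::size_t i = 0; i < imageOfPoint.size(); ++i) {
        const int image = imageOfPoint[i];
        if (image < 0) continue;  // point not covered by any image
        groups[image].push_back(i);
    }
    return groups;
}
```

Each group can then be processed with exactly one image resident in memory,
which bounds the number of concurrently loaded images by the number of
groups being worked on at the same time.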



[1] At the moment we aren't being smart about what we load from the image:
we load all of it, so there might be some improvement there.

> That's not to say it's impossible to do the latter, but it's just not the
> expected operating mode. As Andrew mentioned, you can implement your own
> storage to do things in whatever fashion you wish.
> Howard