[pdal] Does Entwine support distributed builds?

Connor Manning connor at hobu.co
Thu Jun 13 08:39:13 PDT 2019


Correct - that is not possible.

On Thu, Jun 13, 2019 at 10:16 AM Piero Toffanin <pt at masseranolabs.com>
wrote:

> Hey Connor,
>
> thanks for the reply. I have looked at the subset option and I think it
> would work well for the case where I have already computed all the models.
> For example if I have a folder with:
>
> 1.las
> 2.las
> ...
>
> Then I could spin up four machines and do:
>
> 1] entwine build -i 1.las 2.las --subset 1 4 -o out1
> 2] entwine build -i 1.las 2.las --subset 2 4 -o out2
> 3] entwine build -i 1.las 2.las --subset 3 4 -o out3
> 4] entwine build -i 1.las 2.las --subset 4 4 -o out4
>
> Then merge the results. I've noticed two things with this approach. First, as
> the number of input files increased, the memory and time required to
> build each subset seemed to increase as well (that's why I opted to use scan +
> build --run 1). Second, I need to wait for all point clouds to
> be available (both 1.las and 2.las must exist before I can start
> processing them).
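[The four subset invocations above follow a regular pattern: the same input list, a 1-based subset index out of a fixed total, and a distinct output directory per subset. A minimal sketch of generating them programmatically; the `subset_commands` helper is hypothetical and not part of Entwine itself:]

```python
def subset_commands(inputs, total, out_prefix="out"):
    """Build one 'entwine build' command line per spatial subset.

    Entwine's --subset option takes (index, total), with a 1-based index;
    every subset build sees the full input list but processes only its
    own spatial region.
    """
    files = " ".join(inputs)
    return [
        f"entwine build -i {files} --subset {i} {total} -o {out_prefix}{i}"
        for i in range(1, total + 1)
    ]

# Reproduce the four command lines from the example above.
for cmd in subset_commands(["1.las", "2.las"], 4):
    print(cmd)
```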
>
> I wanted to check whether it was possible to do something like this (on two
> separate machines):
>
> 1] entwine build -i 1.las -o out1
> 2] entwine build -i 2.las -o out2
>
> And then merge the resulting EPT indexes into a "global" one:
>
> entwine merge -i out1 out2 -o merged
>
> But I don't think it's possible, correct?
>
> -Piero
>
>
> On 6/13/19 10:43 AM, Connor Manning wrote:
>
> The `subset` option lets each iteration of the build handle a spatially
> distinct region; the results can be trivially merged afterward, which sounds
> like what you're after.  Another option could be simply to use multiple
> indexes - Potree can accept multiple input EPT sources, and a PDAL pipeline
> may have multiple EPT readers.
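[A PDAL pipeline reading from two EPT sources at once, as suggested above, might look like the following config fragment. This is a sketch, not from the thread: the `out1`/`out2` paths echo the hypothetical outputs discussed earlier, and an explicit `filters.merge` is used to combine the two readers before writing.]

```json
{
  "pipeline": [
    {"type": "readers.ept", "filename": "out1/ept.json"},
    {"type": "readers.ept", "filename": "out2/ept.json"},
    {"type": "filters.merge"},
    {"type": "writers.las", "filename": "merged.las"}
  ]
}
```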
>
> On Thu, Jun 13, 2019 at 6:46 AM Piero Toffanin <pt at masseranolabs.com>
> wrote:
>
> Hi there,
>
> I have a question regarding the usage of Entwine and was hoping somebody
> could help me. The use case is merging point clouds that have been
> generated on different machines. Each of these point clouds is part of the
> same final dataset. Entwine works great with the current workflow:
>
> entwine scan -i a.las b.las ... -o output/
>
> for i in {a, b, ... }
>
>     entwine build -i output/scan.json -o output/ --run 1
>
> The "--run 1" is done to lower memory usage. On small datasets the runtime
> is excellent, but with more models it starts to increase quite a
> bit. I'm looking specifically for ways to speed up the
> generation of the EPT index. In particular, since I generate the various
> LAS files on different machines, I was wondering if there is a way to let
> each machine contribute its part of the index from the individual LAS files
> (with the index mapped to a network location), or if a workflow is supported
> in which each machine builds its own EPT index and all EPT
> indexes are then merged into one? I don't think this is possible, but wanted to check.
>
> Thank you for any help,
>
> -Piero
>
>
> _______________________________________________
> pdal mailing list
> pdal at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/pdal
>
> --
>
> *Piero Toffanin*
> Drone Solutions Engineer
>
> masseranolabs.com <https://www.masseranolabs.com>
> piero.dev <https://www.piero.dev>
>
>
>

