[pdal] Does Entwine support distributed builds?

Piero Toffanin pt at masseranolabs.com
Thu Jun 13 08:56:58 PDT 2019


Thanks, suspected that was the case but wanted to confirm.

In regard to building subsets, is there a performance advantage to running 
"entwine scan" first vs. passing the input files directly to "entwine build" 
(or is scan just a utility that simplifies finding datasets within a 
folder)?

Are there any tips or tricks that I should be aware of in terms of 
memory usage when building using subset? For example, is it memory 
efficient to do:

entwine build -i 1.las 2.las [...] 399.las 400.las --subset 1 64 -o out1

?

As compared to perhaps running 400 times:

entwine build -i 1.las 2.las [...] 399.las 400.las --subset 1 64 -o out1 --run 1

?
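
To make the comparison concrete, here is a rough dry-run sketch of the two variants. It only prints the commands rather than running them, and it uses only the flags that already appear in this thread (-i, -o, --subset, --run). The helper names subset_cmds and run_one_cmds are made up for illustration, and the second helper assumes, as in the scan + build loop from the original message, that each "--run 1" invocation handles one more input file.

```shell
#!/bin/sh
# Dry run: prints entwine commands instead of executing them.

# Hypothetical helper: one build command per subset, numbered 1..TOTAL.
# Usage: subset_cmds TOTAL FILE...
subset_cmds() {
    total=$1; shift
    id=1
    while [ "$id" -le "$total" ]; do
        echo "entwine build -i $* --subset $id $total -o out$id"
        id=$((id + 1))
    done
}

# Hypothetical helper: the same subset, repeated once per input file
# with --run 1 (assumption: each invocation adds one more file).
# Usage: run_one_cmds ID TOTAL FILE...
run_one_cmds() {
    id=$1; total=$2; shift 2
    for _file in "$@"; do
        echo "entwine build -i $* --subset $id $total -o out$id --run 1"
    done
}
```

For example, "subset_cmds 4 1.las 2.las" prints the four per-machine commands from the earlier message, and "run_one_cmds 1 64 1.las 2.las" prints the --run 1 command once per input file.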

Sorry for all the questions!

On 6/13/19 11:39 AM, Connor Manning wrote:
> Correct - that is not possible.
>
> On Thu, Jun 13, 2019 at 10:16 AM Piero Toffanin <pt at masseranolabs.com> wrote:
>
>     Hey Connor,
>
>     thanks for the reply. I have looked at the subset option and I
>     think it would work well for the case where I have already
>     computed all the models. For example if I have a folder with:
>
>     1.las
>     2.las
>     ...
>
>     Then I could spin four machines and do:
>
>     1] entwine build -i 1.las 2.las --subset 1 4 -o out1
>     2] entwine build -i 1.las 2.las --subset 2 4 -o out2
>     3] entwine build -i 1.las 2.las --subset 3 4 -o out3
>     4] entwine build -i 1.las 2.las --subset 4 4 -o out4
>
>     Then merge the results. I've noticed two things with this. The
>     first is that as the number of input files increased, the memory
>     and time required to create each subset seemed to increase as well
>     (that's why I opted to use scan + build --run 1). The second is
>     that I need to wait for all point clouds to be available (both
>     1.las and 2.las need to be available before I can start processing
>     them).
>
>     I wanted to rule out whether it was possible to do something like
>     (on two separate machines):
>
>     1] entwine build -i 1.las -o out1
>     2] entwine build -i 2.las -o out2
>
>     And then merge the resulting EPT indexes into a "global" one:
>
>     entwine merge -i out1 out2 -o merged
>
>     But I don't think it's possible, correct?
>
>     -Piero
>
>
>
>     On 6/13/19 10:43 AM, Connor Manning wrote:
>>     The `subset` option lets each iteration of the build run a
>>     spatially distinct region, which can be trivially merged
>>     afterward, which sounds like what you're after.  Another option
>>     could be to simply use multiple indexes - potree can accept
>>     multiple input EPT sources, and a PDAL pipeline may have multiple
>>     EPT readers.
>>
>>     On Thu, Jun 13, 2019 at 6:46 AM Piero Toffanin <pt at masseranolabs.com> wrote:
>>
>>         Hi there,
>>
>>         I have a question regarding the usage of Entwine and was
>>         hoping somebody could help me? The use case is merging point
>>         clouds that have been generated on different machines. Each
>>         of these point clouds is part of the same final dataset.
>>         Entwine works great with the current workflow:
>>
>>         entwine scan -i a.las b.las ... -o output/
>>
>>         for i in {a, b, ... }
>>
>>             entwine build -i output/scan.json -o output/ --run 1
>>
>>         The "--run 1" is done to lower the memory usage. On small
>>         datasets runtime is excellent, but with more models the
>>         runtime starts to increase quite a bit. I'm looking
>>         specifically to see if there are ways to speed the generation
>>         of the EPT index. In particular, since I generate the various
>>         LAS files on different machines, I was wondering if there was
>>         a way to let each machine contribute its part of the index
>>         from the individual LAS files (such index mapped to a network
>>         location) or if a workflow is supported in which each machine
>>         can build its own EPT index and then merge all EPT indexes
>>         into one? I don't think this is possible, but wanted to check.
>>
>>         Thank you for any help,
>>
>>         -Piero
>>
>>
>>         _______________________________________________
>>         pdal mailing list
>>         pdal at lists.osgeo.org <mailto:pdal at lists.osgeo.org>
>>         https://lists.osgeo.org/mailman/listinfo/pdal
>>
>     -- 
>
>     *Piero Toffanin*
>     Drone Solutions Engineer
>
>     masseranolabs.com <https://www.masseranolabs.com>
>     piero.dev <https://www.piero.dev>
>
>
-- 

*Piero Toffanin*
Drone Solutions Engineer

masseranolabs.com <https://www.masseranolabs.com>
piero.dev <https://www.piero.dev>

