[pdal] Does Entwine support distributed builds?
Piero Toffanin
pt at masseranolabs.com
Thu Jun 13 08:16:31 PDT 2019
Hey Connor,
thanks for the reply. I have looked at the subset option and I think it
would work well for the case where I have already computed all the
models. For example if I have a folder with:
1.las
2.las
...
Then I could spin up four machines and do:
1] entwine build -i 1.las 2.las --subset 1 4 -o out1
2] entwine build -i 1.las 2.las --subset 2 4 -o out2
3] entwine build -i 1.las 2.las --subset 3 4 -o out3
4] entwine build -i 1.las 2.las --subset 4 4 -o out4
Then merge the results. I've noticed two things with this approach.
First, as the number of input files increases, the memory and time
required to create each subset seem to increase as well (that's why I
opted for scan + build --run 1). Second, I need to wait for all point
clouds to be available (both 1.las and 2.las must exist before I can
start processing them).
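As a sketch, the four invocations above could be generated by a small
script and handed to whatever dispatches jobs to the four machines. It
only prints the commands rather than running them; the input files and
out1..out4 directories are the hypothetical ones from the example:

```shell
# Sketch only: print one "entwine build --subset" command per machine
# instead of executing it, so a scheduler can dispatch them. Inputs
# (1.las, 2.las) and outputs (out1..out4) are the example names above.
TOTAL=4
for n in $(seq 1 "$TOTAL"); do
    echo "entwine build -i 1.las 2.las --subset $n $TOTAL -o out$n"
done
```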
I wanted to check whether it was possible to do something like (on
two separate machines):
1] entwine build -i 1.las -o out1
2] entwine build -i 2.las -o out2
And then merge the resulting EPT indexes into a "global" one:
entwine merge -i out1 out2 -o merged
But I don't think it's possible, correct?
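Regarding the multiple-indexes option Connor mentions below: as far as
I understand, a PDAL pipeline can read two such indexes at once,
roughly like this sketch (out1/ and out2/ are the hypothetical
per-machine indexes from above, and the writer filename is made up):

    [
        { "type": "readers.ept", "filename": "out1/ept.json" },
        { "type": "readers.ept", "filename": "out2/ept.json" },
        { "type": "writers.las", "filename": "combined.las" }
    ]

If I read the docs right, PDAL merges the points from parallel readers
before the writer stage, so this would treat the two indexes as one
dataset without ever merging them on disk.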
-Piero
On 6/13/19 10:43 AM, Connor Manning wrote:
> The `subset` option lets each iteration of the build run a spatially
> distinct region, which can be trivially merged afterward, which sounds
> like what you're after. Another option could be to simply use
> multiple indexes - potree can accept multiple input EPT sources, and a
> PDAL pipeline may have multiple EPT readers.
>
> On Thu, Jun 13, 2019 at 6:46 AM Piero Toffanin <pt at masseranolabs.com
> <mailto:pt at masseranolabs.com>> wrote:
>
> Hi there,
>
> I have a question regarding the usage of Entwine and was hoping
> somebody could help me. The use case is merging point clouds that
> have been generated on different machines. Each of these point
> clouds is part of the same final dataset. Entwine works great with
> the current workflow:
>
> entwine scan -i a.las b.las ... -o output/
>
> for i in a b ...
> do
>     entwine build -i output/scan.json -o output/ --run 1
> done
>
> The "--run 1" is done to lower the memory usage. On small datasets
> runtime is excellent, but with more models the runtime starts to
> increase quite a bit. I'm looking specifically to see if there are
> ways to speed up the generation of the EPT index. In particular,
> since I generate the various LAS files on different machines, I
> was wondering if there was a way to let each machine contribute
> its part of the index from the individual LAS files (such index
> mapped to a network location) or if a workflow is supported in
> which each machine can build its own EPT index and then merge all
> EPT indexes into one? I don't think this is possible, but wanted
> to check.
>
> Thank you for any help,
>
> -Piero
>
>
> _______________________________________________
> pdal mailing list
> pdal at lists.osgeo.org <mailto:pdal at lists.osgeo.org>
> https://lists.osgeo.org/mailman/listinfo/pdal
>
--
*Piero Toffanin*
Drone Solutions Engineer
masseranolabs.com <https://www.masseranolabs.com>
piero.dev <https://www.piero.dev>