<div dir="ltr">Hi Howard:<br><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Dec 28, 2022 at 12:45, Howard Butler (&lt;<a href="mailto:howard@hobu.co">howard@hobu.co</a>&gt;) wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>

<br>

> On Dec 27, 2022, at 9:28 PM, Ulises Ibarra <<a href="mailto:ulisesmartinibarra@gmail.com" target="_blank">ulisesmartinibarra@gmail.com</a>> wrote:<br>

> <br>

> In total I have 47 scans of that piece of rainforest. A question: what processing time would you expect for one large file containing all 47 scans? Could it take 24 hours x 47 files = 1128 hours?<br>

<br>

The AWS spot rate for a c7g.2xlarge in the Oregon region is $0.142 per hour. That's 8 vCPUs and 16 GB of RAM. Naively splitting your 1128 compute hours over that (1128/8 * 0.142) brings the total cost to about $20.02. <br>
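<div>As a rough check of the arithmetic above, the estimate can be reproduced in a few lines of Python. This is only a sketch: the variable and function names are illustrative, and it assumes each scan runs single-threaded on its own vCPU at a constant spot rate, which real jobs will only approximate.</div>
<pre>
```python
# Figures taken from the messages in this thread; names are illustrative.
HOURS_PER_SCAN = 24              # observed runtime for one scan
SCAN_COUNT = 47                  # scans of the rainforest plot
VCPUS_PER_INSTANCE = 8           # c7g.2xlarge
SPOT_RATE_USD_PER_HOUR = 0.142   # quoted spot rate, Oregon region


def estimate_cost(hours_per_scan, scan_count, vcpus, rate_per_hour):
    """Return (compute_hours, instance_hours, cost_usd) for a naive even split.

    Assumes one single-threaded job per vCPU and a constant hourly rate.
    """
    compute_hours = hours_per_scan * scan_count
    instance_hours = compute_hours / vcpus
    cost_usd = instance_hours * rate_per_hour
    return compute_hours, instance_hours, cost_usd


compute_hours, instance_hours, cost = estimate_cost(
    HOURS_PER_SCAN, SCAN_COUNT, VCPUS_PER_INSTANCE, SPOT_RATE_USD_PER_HOUR
)
print(f"{compute_hours} compute hours, {instance_hours:.0f} instance hours, "
      f"about ${cost:.2f}")
```
</pre>
<div>Changing the constants shows how the estimate moves if a smaller sampling radius lengthens the per-scan runtime or if the spot rate drifts.</div>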

<br>

PDAL purposefully does not split up data or try to internally optimize the execution of pipelines, because their performance is extremely sensitive to the filters used and their configurations. It is on users to divide and conquer on their own with PDAL. For filters.litree, that means breaking the data up and trying to find the filters.sample.radius setting that gives you good enough results without blowing up memory or computation time.<br></blockquote><div><br></div><div>
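<div>As a rough sketch of the divide-and-conquer approach described above, the script below builds one small PDAL pipeline per scan so that the scans can be processed independently. The file names (scan_01.laz and so on) are hypothetical, the 2 cm radius is only the value mentioned in this thread and assumes coordinates in meters, and the input is assumed to already carry the HeightAboveGround dimension that filters.litree reads. It is not a tuned or verified configuration.</div>
<pre>
```python
import json

SCAN_COUNT = 47          # number of scans mentioned in the thread
SAMPLE_RADIUS_M = 0.02   # 2 cm; an assumed starting point, tune per dataset


def build_pipeline(input_path, output_path, sample_radius):
    """Return a PDAL pipeline description (a plain dict) for a single scan."""
    return {
        "pipeline": [
            {"type": "readers.las", "filename": input_path},
            # Thin the cloud first so filters.litree sees fewer points.
            {"type": "filters.sample", "radius": sample_radius},
            # Assumes HeightAboveGround is already present in the input.
            {"type": "filters.litree"},
            # Keep the extra dimensions (such as the tree IDs) in the output.
            {"type": "writers.las", "filename": output_path,
             "extra_dims": "all"},
        ]
    }


# One pipeline per scan, keyed by the (hypothetical) JSON file name.
pipelines = {}
for i in range(1, SCAN_COUNT + 1):
    name = f"scan_{i:02d}"
    pipelines[name + ".json"] = build_pipeline(
        name + ".laz", name + "_trees.laz", SAMPLE_RADIUS_M
    )

print(json.dumps(pipelines["scan_01.json"], indent=2))
```
</pre>
<div>Each pipeline could then be written to its own JSON file and run with "pdal pipeline", one process per core, while the sampling radius is adjusted until results and runtime are both acceptable.</div>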

<span class="gmail-HwtZe" lang="en"><span class="gmail-jCAhz gmail-ChMk0b"><span class="gmail-ryNqvb">I thought it was my job to send filters.litree a file that had already been suitably subsampled, for example to a single point per sphere of radius 2 cm.</span></span> <span class="gmail-jCAhz gmail-ChMk0b"><span class="gmail-ryNqvb">I will have to read the paper or the litree code to understand, if I can, what is going on there.</span></span></span>  <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

Your pipeline with filters.litree is obviously not very efficient, but the cost in effort to optimize it for one-time compute jobs far outstrips the cost of parallelizing the computation in the cloud somewhere and being done with it. That math obviously changes if you need to process the entire rainforest with filters.litree :)<br></blockquote><div> </div><div>

<span class="gmail-HwtZe" lang="en"><span class="gmail-jCAhz gmail-ChMk0b"><span class="gmail-ryNqvb">Thank you Howard.</span></span></span></div><div><span class="gmail-HwtZe" lang="en"><span class="gmail-jCAhz gmail-ChMk0b"><span class="gmail-ryNqvb">Happy new year!<br></span></span></span>


</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

Howard<br>


</blockquote></div></div>