[pdal] looping multiple bounding coordinates in PDAL Pipeline

Jason McVay jasonmcvay09 at gmail.com
Tue Dec 10 10:51:56 PST 2019


Thank you Nicolas!

Jason McVay

MS Geography, Virginia Tech
BA Environmental Studies, University of Montana
www.linkedin.com/in/jasonmcvay86/
https://twitter.com/jasonmcvay

*"May your trails be crooked, winding, lonesome, dangerous, leading to the
most amazing view"*
- Ed Abbey


On Tue, Dec 10, 2019 at 1:44 PM Nicolas Cadieux <
nicolas.cadieux at archeotec.ca> wrote:

> Hi,
>
> This will print the bounds for each object in a shapefile.
>
> Nicolas
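
The script attached to this message is not preserved in the archive. As a rough stand-in (not Nicolas's code), here is a minimal sketch of the same idea for a GeoJSON file, using only the Python standard library; a shapefile would need something like fiona or geopandas instead. The function names and the `id` property are assumptions for illustration, and each feature is assumed to have a non-empty geometry with a `coordinates` member.

```python
import json


def _coords(node):
    """Yield (x, y) pairs from arbitrarily nested GeoJSON coordinate arrays."""
    if node and isinstance(node[0], (int, float)):
        yield node[0], node[1]
    else:
        for child in node:
            yield from _coords(child)


def feature_bounds(feature):
    """Return (minx, miny, maxx, maxy) for one GeoJSON feature."""
    xs, ys = zip(*_coords(feature["geometry"]["coordinates"]))
    return min(xs), min(ys), max(xs), max(ys)


def print_bounds(geojson_text):
    """Print the (assumed) id property and bounds of every feature."""
    for feature in json.loads(geojson_text)["features"]:
        props = feature.get("properties") or {}
        print(props.get("id"), feature_bounds(feature))
```

Calling `print_bounds(open("aois.geojson").read())` (a hypothetical file name) would print one line per feature; the tuples can then be fed to whatever builds the PDAL bounds string.
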
> On 2019-12-10 9:49 a.m., Jason McVay wrote:
>
> Hi Adam, and Nicolas,
>
> I think I understand conceptually how using shapely/geopandas to loop in
> coordinates to a pdal pipeline should work. But if you can pass along an
> example that would be greatly appreciated!
>
> Thanks again,
>
> Jason McVay
>
>
> On Mon, Dec 9, 2019 at 6:09 PM Nicolas Cadieux <
> nicolas.cadieux at archeotec.ca> wrote:
>
>> Hi,
>> I will send him a similar python loop tomorrow for inspiration.  Did not
>> have time to look at this today.
>> Nicolas
>>
>> On Dec 9, 2019, at 17:36, adam steer <adam.d.steer at gmail.com> wrote:
>>
>> 
>> Hi Jason
>>
>> Weighing in late here, it’s possible to cobble together
>> fiona/shapely/pdal to loop through a bunch of polygons (or process them in
>> parallel) and do what you need. It’s a task that’s on my list of things to
>> do when I get time :)
>>
>> That way you can assemble a processing pipeline which goes straight from
>> some geometries to data, without waiting for the new PR.
>>
>> Cheers,
>>
>> Adam
>>
>>
>>
>> On Tue, 10 Dec 2019 at 07:42, Jason McVay <jasonmcvay09 at gmail.com> wrote:
>>
>>> Thanks Howard! I think this is the way to go. I would be interested in
>>> exploring the pull request version as well, but I may have to wait until
>>> after the holiday break to get to that.
>>> Jason McVay
>>>
>>>
>>> On Mon, Dec 9, 2019 at 8:36 AM Howard Butler <howard at hobu.co> wrote:
>>>
>>>>
>>>>
>>>> On Dec 8, 2019, at 7:09 PM, Jason McVay <jasonmcvay09 at gmail.com> wrote:
>>>>
>>>> I'm looking for some advice on the best way/how to loop in thousands of
>>>> bounding coordinates into a pdal pipeline.
>>>>
>>>> I have a csv (and a geojson) of several thousand min/max x/y and a
>>>> unique ID. The AOI's are not very big, so the pipeline runs quickly, but
>>>> there are a lot of AOIs to capture! I'm querying an entwine dataset, the
>>>> extent of which is national, so I'm limiting the data with a bounding box
>>>> of each AOI.
>>>>
>>>> My pipeline currently runs the HAG and Ferry Z filters, then uses
>>>> writers.gdal to make a GeoTIFF at 1m resolution. It works perfectly when I
>>>> manually enter in a set of test coordinates. How can I scale this to loop
>>>> and update the bounds automatically?
>>>>
>>>> I'm running this locally on a MacBook Pro.
>>>>
>>>> Thank you, any advice is appreciated!
>>>>
>>>>
>>>> Jason,
>>>>
>>>> PDAL doesn't multithread or operate in a parallel fashion for you. You
>>>> must use external tools to do this yourself. I have had good success using
>>>> GNU parallel or xargs on bash along with the Python multiprocessing library
>>>> to achieve that.
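
For the Python route, a minimal sketch of driving many `pdal pipeline` runs from one script. It uses a thread pool from `concurrent.futures` rather than the multiprocessing library mentioned above, on the assumption that each job is an external `pdal` process, so threads are enough to keep several running at once. The helper names, the (minx, miny, maxx, maxy) tuple order, and the default worker count are assumptions for illustration; `pipe.json` and the EPT URL are placeholders.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def pdal_args(bounds, out_tif, pipeline="pipe.json",
              ept_url="ept://http://path/to/location"):
    """Build the argument list for one `pdal pipeline` run.

    bounds is (minx, miny, maxx, maxy); PDAL expects ([xmin, xmax], [ymin, ymax]).
    """
    minx, miny, maxx, maxy = bounds
    return [
        "pdal", "pipeline", pipeline,
        f"--readers.ept.filename={ept_url}",
        f"--readers.ept.bounds=([{minx}, {maxx}], [{miny}, {maxy}])",
        f"--writers.gdal.filename={out_tif}",
    ]


def run_all(jobs, workers=4, runner=subprocess.run):
    """Run every (bounds, out_tif) job; return the exit codes in job order."""
    def run_one(job):
        bounds, out_tif = job
        # Passing a list (no shell) avoids quoting problems with the bounds.
        return runner(pdal_args(bounds, out_tif)).returncode

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_one, jobs))
```

Something like `run_all([((minx, miny, maxx, maxy), "hag_mean_<id>.tif"), ...], workers=8)` would then process the whole list; non-zero exit codes in the result mark AOIs that need a second look.
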
>>>>
>>>> Your scenario would seem to fit that model quite well. Here's a GNU
>>>> parallel example. In short, use your favorite scripting language (or
>>>> sed/awk/cat) to write a script that contains all of the job entries you
>>>> need to run (bounds entries are all the same in my example, but you should
>>>> get the point):
>>>>
>>>> pdal pipeline pipe.json --readers.ept.filename="ept://http://path/to/location" --readers.ept.bounds="([-10063436.56, -10060190.36], [5038996.16, 5043062.79])" --writers.gdal.filename="hag_mean_henry_co.tif"
>>>> pdal pipeline pipe.json --readers.ept.filename="ept://http://path/to/location" --readers.ept.bounds="([-10063436.56, -10060190.36], [5038996.16, 5043062.79])" --writers.gdal.filename="hag_mean_howard_co.tif"
>>>> pdal pipeline pipe.json --readers.ept.filename="ept://http://path/to/location" --readers.ept.bounds="([-10063436.56, -10060190.36], [5038996.16, 5043062.79])" --writers.gdal.filename="hag_mean_james_co.tif"
>>>> pdal pipeline pipe.json --readers.ept.filename="ept://http://path/to/location" --readers.ept.bounds="([-10063436.56, -10060190.36], [5038996.16, 5043062.79])" --writers.gdal.filename="hag_mean_mike_co.tif"
>>>>
>>>>
>>>>
>>>> Then run that script:
>>>>
>>>> parallel -j 16 < jobs.txt
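
A minimal sketch of the "write a script that contains all of the job entries" step, in Python, producing a jobs.txt with one command per line for GNU parallel. The CSV column names (id, minx, miny, maxx, maxy), the output name pattern hag_mean_<id>.tif, and the helper names are assumptions for illustration; pipe.json and the EPT URL placeholder are taken from the example above.

```python
import csv
import shlex

EPT_URL = "ept://http://path/to/location"  # placeholder, as in the example


def job_line(row, pipeline="pipe.json", ept_url=EPT_URL):
    """Build one shell-quoted `pdal pipeline` command from a row of bounds."""
    # PDAL bounds are written as ([xmin, xmax], [ymin, ymax]).
    bounds = "([{minx}, {maxx}], [{miny}, {maxy}])".format(**row)
    args = [
        "pdal", "pipeline", pipeline,
        "--readers.ept.filename=" + ept_url,
        "--readers.ept.bounds=" + bounds,
        "--writers.gdal.filename=hag_mean_{}.tif".format(row["id"]),
    ]
    return " ".join(shlex.quote(a) for a in args)


def job_lines(csv_lines):
    """Yield one command line per CSV record (the first line is the header)."""
    for row in csv.DictReader(csv_lines):
        yield job_line(row)


def write_jobs(csv_path, jobs_path="jobs.txt"):
    """Write the job file that `parallel -j N < jobs.txt` will consume."""
    with open(csv_path, newline="") as src, open(jobs_path, "w") as dst:
        for line in job_lines(src):
            dst.write(line + "\n")
```

After `write_jobs("aois.csv")` (a hypothetical file name), the resulting jobs.txt can be passed to parallel exactly as shown above.
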
>>>>
>>>>
>>>> Filtering EPT resources with boundaries is a common desire. I recently
>>>> added a pull request to master (not yet released) that allows you to
>>>> specify filtering (for faster query) and cropping (eliminating an extra
>>>> stage specification) for EPT resources. See
>>>> https://github.com/PDAL/PDAL/pull/2771#issue-323371431
>>>> The goal with the approach in the pull request is to not have to change format
>>>> of the bounding geometries to text simply to feed them into a pipeline. We
>>>> may add similar capability to other drivers if it is indeed useful in other
>>>> contexts.
>>>>
>>>> With the PR, you could express your query boundaries as an OGR query
>>>> and then iterate through your EPT resources. The current PR implementation
>>>> doesn't "split" by the polygons, however. We might need to add the same
>>>> capability to filters.crop to achieve that. Feedback is appreciated so we
>>>> can learn how people wish to use this.
>>>>
>>>> Howard
>>>>
>>>> _______________________________________________
>>> pdal mailing list
>>> pdal at lists.osgeo.org
>>> https://lists.osgeo.org/mailman/listinfo/pdal
>>
>>
>>
>> --
>> Dr. Adam Steer
>> http://spatialised.net
>> https://www.researchgate.net/profile/Adam_Steer
>> http://au.linkedin.com/in/adamsteer
>> http://orcid.org/0000-0003-0046-7236
>> +61 427 091 712 ::  @adamdsteer
>>
>> Suits are bad for business: http://www.spatialised.net/business-penguins/
>>
>>