[pdal] looping multiple bounding coordinates in PDAL Pipeline

Nicolas Cadieux nicolas.cadieux at archeotec.ca
Mon Dec 9 15:09:11 PST 2019


Hi,
I will send him a similar Python loop tomorrow for inspiration. I did not have time to look at this today.
Nicolas 

> On 9 Dec 2019 at 17:36, adam steer <adam.d.steer at gmail.com> wrote:
> 
> 
> Hi Jason
> 
> Weighing in late here, it’s possible to cobble together fiona/shapely/pdal to loop through a bunch of polygons (or process them in parallel) and do what you need. It’s a task that’s on my list of things to do when I get time :)
> 
> That way you can assemble a processing pipeline that goes straight from some geometries to data, without waiting for the new PR.
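A minimal sketch of the fiona/shapely/pdal loop described above. The GeoJSON file name, the `id` property, the EPT URL, and `pipe.json` are assumptions for illustration, not details from the thread:

```python
import subprocess

def pdal_bounds(minx, miny, maxx, maxy):
    """PDAL expects bounds as ([xmin, xmax], [ymin, ymax])."""
    return f"([{minx}, {maxx}], [{miny}, {maxy}])"

def process_aois(geojson="aois.geojson",
                 ept="ept://http://path/to/location"):
    """Run one pdal pipeline per polygon, cropping EPT to each bbox."""
    import fiona                      # pip install fiona shapely
    from shapely.geometry import shape

    with fiona.open(geojson) as src:
        for feature in src:
            minx, miny, maxx, maxy = shape(feature["geometry"]).bounds
            uid = feature["properties"]["id"]   # assumed unique-ID field
            subprocess.run([
                "pdal", "pipeline", "pipe.json",
                f"--readers.ept.filename={ept}",
                f"--readers.ept.bounds={pdal_bounds(minx, miny, maxx, maxy)}",
                f"--writers.gdal.filename=hag_mean_{uid}.tif",
            ], check=True)
```

The same loop parallelizes naturally by handing each feature's command to a worker pool instead of running it inline.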
> 
> Cheers,
> 
> Adam
> 
> 
> 
>> On Tue, 10 Dec 2019 at 07:42, Jason McVay <jasonmcvay09 at gmail.com> wrote:
>> Thanks Howard! I think this is the way to go. I would be interested in exploring the pull request version as well, but I may have to wait until after the holiday break to get to that.
>> Jason McVay
>> 
>> MS Geography, Virginia Tech
>> BA Environmental Studies, University of Montana
>> www.linkedin.com/in/jasonmcvay86/
>> https://twitter.com/jasonmcvay
>> 
>> "May your trails be crooked, winding, lonesome, dangerous, leading to the most amazing view"
>> - Ed Abbey
>> 
>> 
>>> On Mon, Dec 9, 2019 at 8:36 AM Howard Butler <howard at hobu.co> wrote:
>>> 
>>> 
>>>> On Dec 8, 2019, at 7:09 PM, Jason McVay <jasonmcvay09 at gmail.com> wrote:
>>>> 
>>>> I'm looking for some advice on the best way to loop thousands of bounding coordinates into a PDAL pipeline.
>>>> 
>>>> I have a csv (and a geojson) of several thousand min/max x/y pairs, each with a unique ID. The AOIs are not very big, so the pipeline runs quickly, but there are a lot of AOIs to capture! I'm querying an entwine dataset whose extent is national, so I'm limiting the data with a bounding box for each AOI.
>>>> 
>>>> My pipeline currently runs the HAG and Ferry Z filters, then uses writers.gdal to make a GeoTIFF at 1 m resolution. It works perfectly when I manually enter a set of test coordinates. How can I scale this to loop and update the bounds automatically?
>>>> 
>>>> I'm running this locally on a MacBook Pro.
>>>> 
>>>> Thank you, any advice is appreciated!
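For reference, a pipeline matching that description might look like the sketch below (option names follow the PDAL driver documentation; the filenames and bounds are placeholders, typically overridden per AOI on the command line):

```json
{
  "pipeline": [
    {
      "type": "readers.ept",
      "filename": "ept://http://path/to/location",
      "bounds": "([xmin, xmax], [ymin, ymax])"
    },
    { "type": "filters.hag" },
    { "type": "filters.ferry", "dimensions": "HeightAboveGround=>Z" },
    {
      "type": "writers.gdal",
      "filename": "hag_mean.tif",
      "resolution": 1.0,
      "output_type": "mean"
    }
  ]
}
```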
>>> 
>>> Jason,
>>> 
>>> PDAL doesn't multithread or operate in a parallel fashion for you; you must use external tools to do this yourself. I have had good success using GNU parallel or xargs in bash, along with the Python multiprocessing library, to achieve that.
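A sketch of the multiprocessing route mentioned above: fan the same `pdal pipeline` invocations out over a worker pool. The commands themselves are placeholders you would build per AOI:

```python
import shlex
import subprocess
from multiprocessing import Pool

def run_job(cmd):
    """Run one shell-style command; return (command, exit code)."""
    return cmd, subprocess.run(shlex.split(cmd)).returncode

def run_all(commands, workers=8):
    """Execute the commands across a pool, preserving input order."""
    with Pool(workers) as pool:
        return pool.map(run_job, commands)
```

`workers` plays the same role as the `-j` flag to GNU parallel below.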
>>> 
>>> Your scenario would seem to fit that model quite well. Here's a GNU parallel example. In short, use your favorite scripting language (or sed/awk/cat) to write a script that contains all of the job entries you need to run (bounds entries are all the same in my example, but you should get the point):
>>> 
>>>> pdal pipeline pipe.json --readers.ept.filename="ept://http://path/to/location" --readers.ept.bounds="([-10063436.56, -10060190.36], [5038996.16, 5043062.79])" --writers.gdal.filename="hag_mean_henry_co.tif"
>>>> pdal pipeline pipe.json --readers.ept.filename="ept://http://path/to/location" --readers.ept.bounds="([-10063436.56, -10060190.36], [5038996.16, 5043062.79])" --writers.gdal.filename="hag_mean_howard_co.tif"
>>>> pdal pipeline pipe.json --readers.ept.filename="ept://http://path/to/location" --readers.ept.bounds="([-10063436.56, -10060190.36], [5038996.16, 5043062.79])" --writers.gdal.filename="hag_mean_james_co.tif"
>>>> pdal pipeline pipe.json --readers.ept.filename="ept://http://path/to/location" --readers.ept.bounds="([-10063436.56, -10060190.36], [5038996.16, 5043062.79])" --writers.gdal.filename="hag_mean_mike_co.tif"
>>> 
>>> 
>>> Then run that script:
>>> 
>>>> parallel -j 16 < jobs.txt
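The jobs file itself can be generated straight from the CSV of bounds. A hedged sketch; the column names (`uid`, `minx`, `miny`, `maxx`, `maxy`) and file paths are assumptions about Jason's data, not details from the thread:

```python
import csv

def job_line(uid, minx, miny, maxx, maxy,
             ept="ept://http://path/to/location"):
    """One 'pdal pipeline' invocation with bounds and output filled in."""
    bounds = f"([{minx}, {maxx}], [{miny}, {maxy}])"
    return (f'pdal pipeline pipe.json '
            f'--readers.ept.filename="{ept}" '
            f'--readers.ept.bounds="{bounds}" '
            f'--writers.gdal.filename="hag_mean_{uid}.tif"')

def write_jobs(csv_path="aois.csv", out_path="jobs.txt"):
    """Emit one command per CSV row, ready for 'parallel < jobs.txt'."""
    with open(csv_path, newline="") as f, open(out_path, "w") as out:
        for row in csv.DictReader(f):
            out.write(job_line(row["uid"], row["minx"], row["miny"],
                               row["maxx"], row["maxy"]) + "\n")
```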
>>> 
>>> Filtering EPT resources with boundaries is a common desire. I recently added a pull request against master (not yet released) that allows you to specify filtering (for faster queries) and cropping (eliminating an extra stage specification) for EPT resources. See https://github.com/PDAL/PDAL/pull/2771#issue-323371431 The goal of the approach in the pull request is to avoid converting the bounding geometries to text simply to feed them into a pipeline. We may add similar capability to other drivers if it proves useful in other contexts.
>>> 
>>> With the PR, you could express your query boundaries as an OGR query and then iterate through your EPT resources. The current PR implementation doesn't "split" by the polygons, however. We might need to add the same capability to filters.crop to achieve that. Feedback is appreciated so we can learn how people wish to use this.
>>> 
>>> Howard
>>> 
>> _______________________________________________
>> pdal mailing list
>> pdal at lists.osgeo.org
>> https://lists.osgeo.org/mailman/listinfo/pdal
> 
> 
> -- 
> Dr. Adam Steer
> http://spatialised.net
> https://www.researchgate.net/profile/Adam_Steer
> http://au.linkedin.com/in/adamsteer
> http://orcid.org/0000-0003-0046-7236
> +61 427 091 712 ::  @adamdsteer
> 
> Suits are bad for business: http://www.spatialised.net/business-penguins/