[pdal] looping multiple bounding coordinates in PDAL Pipeline

Nicolas Cadieux nicolas.cadieux at archeotec.ca
Tue Dec 10 10:10:33 PST 2019


Hi,

I am sending this small script as an inspiration.  This is a 
single-threaded the python program that

 1. Scans a directory for a list of .tif file and makes a list of those
    tif files.
 2. Creates an output directory
 3. Build a gdal command line (export the tif to a xyz file) inside the
    function called "gdal_translate2XYZ".
     1. shell = subprocess.Popen(r'c:\OSGeo4W64\OSGeo4W.bat',
        stdin=subprocess.PIPE)
     2. shell.communicate(gdal_translateCmd)
     3. These two above line basically send the command to the OSGeo4W
        shell. (Install Qgis if you don't have it).
 4. Runs this command in a loop (the loop is created by using "map" or
    "p.map" function that operates on the list of tif files)
 5. If you comment out the (if __name__ == '__main__': ) section, the
    commands will run in parallel depending on the size of the pool
    variable.  pool = 6 = 6 parallel commands.  It is easier to debug if
    you  develop in single thread.

The idea is to build your script using python and to run it calling PDAL 
from python.  The Fiona python library with the Shapely library or the 
more complete geopandas library can help manipulate the shapefile.

Hope this helps!

Nicolas

https://fiona.readthedocs.io/en/stable/

https://shapely.readthedocs.io/en/stable/manual.html

http://geopandas.org/


On 2019-12-08 8:09 p.m., Jason McVay wrote:
> I'm looking for some advice on the best way/how to loop in thousands 
> of bounding coordinates into a pdal pipeline.
>
> I have a csv (and a geojson) of several thousand min/max x/y and a 
> unique ID. The AOI's are not very big, so the pipeline runs quickly, 
> but there are a lot of AOIs to capture! I'm querying an entwine 
> dataset, the extent of which is national, so I'm limiting the data 
> with a bounding box of each AOI.
>
> My pipeline currently runs HAG and Ferry Z filter, then uses the 
> gdal.writer to make a GeoTiff at 1m resolution. It works perfectly 
> when I manually enter in a set of test coordinates. How can I scale 
> this to loop and update the bounds automatically?
>
> I'm running this locally on a MacBook Pro.
>
> Thank you, any advice is appreciated!
>
>
> Jason McVay
>
> MS Geography, Virginia Tech
> BA Environmental Studies, University of Montana
> www.linkedin.com/in/jasonmcvay86/ 
> <http://www.linkedin.com/in/jasonmcvay86/>
>
>
> _______________________________________________
> pdal mailing list
> pdal at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/pdal
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/pdal/attachments/20191210/7b0a0742/attachment-0001.html>
-------------- next part --------------
# -*- coding: utf-8 -*-
"""
Created on Thu Jan 17 14:40:41 2019
@author: Nicolas Cadieux
njacadieux.gitlab at gmail.com
"""
import os
import subprocess
from osgeo import gdal
from multiprocessing import Pool
import multiprocessing
import time
from collections import defaultdict
import gdalnumeric

#variable a changer dans script
startTime = time.time()
scanDirectory_and_Subdirectories = 'no' #no or yes
extensionFiltre = '.tif'
strFileFilter = '' # Mettre quelques chose si on veux trouver un string
inputPath = r'C:\temp'


def createDirectory(path):
    if os.path.isdir(path):
        pass
    else:
        try: os.mkdir(path)
        except (WindowsError):
            print 'WindowsError'
            
def gdal_translate2XYZ(inFilePath_Name):

    inFilePath = os.path.dirname(inFilePath_Name)
    inFileName = os.path.basename(inFilePath_Name)
    outFileName = inFileName [0:-4] + '.xyz'
    multipleOutPath = os.path.join(inFilePath,'xyz_output')
    outFilePath_Name = os.path.join(multipleOutPath, outFileName)
#    targetEPSG = sourceEPSG
    outputnoData = str(-32768)

    #create one output Directories per directory or a single Global Output directory
    createDirectory(multipleOutPath)
    #createDirectory()
    
    gdal_translateCmd = \
    'gdal_translate'\
    +' -a_nodata '+outputnoData\
    +' -of XYZ'\
    +' '+inFilePath_Name\
    +' '+outFilePath_Name+'\n'
    
    shell = subprocess.Popen(r'c:\OSGeo4W64\OSGeo4W.bat', stdin=subprocess.PIPE)
    shell.communicate(gdal_translateCmd)
#    return inFileName, 'Done'
#    return gdal_translateCmd

############################################################___MAIN___#####################    
shortNameList = []

for (dirPath, subDirectoriesList, fileNamesList) in os.walk(inputPath):
    for files in fileNamesList:
        if files.lower().endswith(extensionFiltre) and strFileFilter.lower() in files.lower():# and files.lower().startswith("variable")
            shortList = os.path.join(dirPath, files)
            shortNameList.append(shortList)

    if scanDirectory_and_Subdirectories == 'no':
        break #this break will exit the loop and limite to 1 directory only
    elif scanDirectory_and_Subdirectories == 'yes':
        pass
    else:
        raw_input(("You must choose '"'yes'"' or '"'no'"' in the scanDirectory_and_Subdirectories variable"))


#############NONE MULTITREADED OR TESTING############## 


map(gdal_translate2XYZ, shortNameList) #Single thread
print 'done in testing mode'
        
#Uncomment for multithearded
#if __name__ == '__main__':
#    p = Pool(6)
#    for x in p.imap_unordered(gdal_translate2XYZ, shortNameList):
#        print x
#    p.close()
#    p.join()
#    stopTime = time.time()
#    totalTime = (stopTime-startTime)/60
#    print 'done: ',totalTime, 'minutes'


More information about the pdal mailing list