[pdal] PDAL Python3 issues

Howard Butler howard at hobu.co
Mon Jan 23 06:29:28 PST 2017


> On Jan 20, 2017, at 4:27 PM, Jean-Francois Prieur <jfprieur at gmail.com> wrote:
> 
> 
> I think there is a small typo in the line
> 
> pipeline = pdal.Pipeline(pipeline)
> should be
> pipeline = pdal.Pipeline(json)

Filed. https://github.com/PDAL/PDAL/issues/1476
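For reference, a minimal sketch of the corrected usage. The stage names and filenames below are assumptions for illustration, not from the thread; the pdal calls are commented out since they need a built extension and an actual LAS file:

```python
import json

# Build a pipeline spec as a JSON string (hypothetical reader/writer stages).
pipeline_json = json.dumps({
    "pipeline": [
        "input.las",
        {"type": "writers.las", "filename": "output.las"},
    ]
})

# The fix: pass the JSON string to pdal.Pipeline, not the not-yet-defined
# name `pipeline`:
# import pdal
# pipeline = pdal.Pipeline(pipeline_json)
# pipeline.validate()
# count = pipeline.execute()
# arrays = pipeline.arrays
```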

> 
> When I try to execute the script, I get the following errors
> 
> >>> pipeline = pdal.Pipeline(json)
> >>> pipeline.validate()
> Warning 1: Cannot find pcs.csv
> True
> >>> pipeline.loglevel = 9
> >>> count = pipeline.execute()
> >>> arrays = pipeline.arrays
> RuntimeError: _ARRAY_API is not PyCObject object
> Segmentation fault

Hmm. I have tested the Python extension on both Python 2 and Python 3, and the Python extensions are built and tested as part of the Travis continuous integration tests [1]. I'm a bit stumped by this particular issue, and I have never seen behavior like this before. Two wild guesses: there's a mix-up between the NumPy headers used at build time and the NumPy version actually installed, or the extension was somehow compiled against a Python 2.x NumPy but is running under a Python 3.x runtime.


[1] https://travis-ci.org/PDAL/PDAL/jobs/193471435#L3786

> For the first warning, I have my GDAL_DATA path set and the pcs.csv file is there
> $ sudo find  / -name pcs.csv -type f
> /usr/share/gdal/2.1/pcs.csv
> 
> $ echo $GDAL_DATA
> /usr/share/gdal/2.1/

Can you set CPL_DEBUG=ON and PROJ_DEBUG=ON in your environment before running?
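If it's easier to do from Python, a sketch of setting those variables for a child PDAL process (the GDAL_DATA path is the one you reported above; the commented `pdal pipeline` invocation is illustrative):

```python
import os

# Enable GDAL/PROJ debug output in the environment handed to the child
# process, without mutating this interpreter's own environment.
debug_env = dict(os.environ,
                 CPL_DEBUG="ON",
                 PROJ_DEBUG="ON",
                 GDAL_DATA="/usr/share/gdal/2.1/")

# import subprocess
# subprocess.run(["pdal", "pipeline", "pipeline.json"], env=debug_env)
```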

> I have installed gdal, pdal, python3-gdal, python3-numpy, python3-pdal so not too sure why the arrays command fails.

Is there a python3-pdal package now? 

> Any help is appreciated, trying to replace liblas as we have memory usage problems with it. When we read multiple LAS files (open and close thousands of LAS files) with liblas the memory just runs out eventually, even with a close() statement. Happens on both windows and linux (thought it was a windows dll problem perhaps). Need to solve this with PDAL and am pretty close ;)

A description of your workflow might also help. The Python extension is really about making it convenient for people to access the point data of a particular PDAL-readable file. A common workflow we use is building up a pipeline in Python or Javascript and then pushing it off to `pdal pipeline` for execution (with some kind of process task-queuing engine). Reading lots of data up into the Python process is likely to be fraught.
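That build-it-in-Python, run-it-with-the-CLI pattern might look something like this sketch. The helper names and the stage list are hypothetical; only the `pdal pipeline <file>` invocation comes from the workflow described above:

```python
import json
import subprocess
import tempfile

def build_pipeline_spec(stages):
    """Serialize a list of pipeline stages into PDAL's JSON pipeline format."""
    return json.dumps({"pipeline": stages}, indent=2)

def run_with_pdal_cli(stages):
    """Write the spec to a temp file and hand it to `pdal pipeline`, so the
    point data never has to live inside the Python process."""
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
        f.write(build_pipeline_spec(stages))
        path = f.name
    return subprocess.run(["pdal", "pipeline", path])

# Hypothetical usage; filenames are placeholders:
# run_with_pdal_cli(["input.las",
#                    {"type": "writers.las", "filename": "out.las"}])
```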

Howard


More information about the pdal mailing list