[pdal] PLang status

Wed Mar 14 12:23:48 EDT 2012

> > I plan to add support for adding a script to the PipelineXML system.
> 
> Being able to reference external .py files would be helpful here too, especially since Python's syntax is whitespace-sensitive and XML
> is not, unless you use CDATA sections.

Yes, my intent was to do it via a filename (probably resolved relative to the directory containing the XML pipeline; I think that's our convention?).  I'm not cool enough to parse CDATA successfully.
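
To make the path handling concrete, here is a rough sketch of the lookup in Python. The function name resolve_script_path and the file names are made up for illustration, and the "relative to the pipeline's directory" rule is only the convention I am assuming above, not something already implemented:

```python
from pathlib import Path

def resolve_script_path(pipeline_xml, script_ref):
    """Resolve a script reference found in a pipeline XML file.

    Hypothetical helper: absolute references are used as-is, relative
    ones are taken relative to the directory holding the XML file.
    """
    script = Path(script_ref)
    if script.is_absolute():
        return script
    return Path(pipeline_xml).parent / script

# Example with made-up file names.
resolved = resolve_script_path("pipelines/crop.xml", "scripts/scale.py")
```

With those made-up names the script would be looked up under the pipelines directory rather than the current working directory.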

> We will definitely need scoped names. The rules for fetching Dimension names from Schema are:
> [...]

OK, will do.  It's really as simple as just adding another name to the dictionary...

> Is ins['X'] an array of dereferencable numpy pointers to the X dimension for the PointBuffer, or are data copied?
> Is ins a reference and outs a copy?

Yes, exactly:

ins['X'] is a numpy array which points directly into the PointBuffer's bytes, so creating it copies no data and is essentially free.

outs['Y'] is a numpy array which is owned by Python, so each element Y[i] needs to be copied to the output PointBuffer, where each Y[i] corresponds to one field of one point and Y.size == PointBuffer.getNumPoints(). The output has to be owned by Python because, in the worst case, numpy makes array copies behind our backs. I hope eventually to figure out a way to make this process faster.
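
The view-versus-copy distinction can be sketched in plain numpy. This is only an illustration: the bytearray and the two-field point layout are stand-ins I made up for a PointBuffer, not the actual PDAL data structures:

```python
import numpy as np

# Hypothetical stand-in for a PointBuffer: 4 points, each with a
# float64 X field and a float64 Y field, stored as raw bytes.
point_dtype = np.dtype([("X", "<f8"), ("Y", "<f8")])
raw = bytearray(4 * point_dtype.itemsize)

# frombuffer wraps the existing bytes; nothing is copied.
points = np.frombuffer(raw, dtype=point_dtype)
points["X"][:] = [1.0, 2.0, 3.0, 4.0]

# Input side: a view that shares memory with the buffer.
ins = {"X": points["X"]}
input_is_view = np.shares_memory(ins["X"], points)

# Output side: arithmetic allocates a new array owned by Python.
outs = {"Y": ins["X"] * 2.0}
output_is_view = np.shares_memory(outs["Y"], points)

# So the result has to be copied back, one value per point.
assert outs["Y"].size == len(points)
points["Y"][:] = outs["Y"]
```

Here input_is_view comes out True and output_is_view comes out False, which is the asymmetry described above: the input costs nothing, while the output needs one copy per field per point.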

> > Note the ability to put comments and printfs in the script.
> What happens is pure eval()'d Python, right? I would expect that any valid Python script for your environment should be good.

Correct: we compile the script source into bytecode once, then execute it multiple times (once for each PointBuffer). I was just emphasizing the point that we have a completely open and arbitrary scripting engine now.
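
In pure Python terms, the compile-once, run-many pattern looks roughly like the sketch below. It uses the built-in compile() and exec() to show the idea; the real code goes through the embedded interpreter, and the script text, run_on_buffer, and the sample values are made up for illustration:

```python
import numpy as np

# A made-up user script; comments and print calls are allowed in it.
script_source = """
# add one to every X value
outs['Y'] = ins['X'] + 1.0
"""

# Compile the source to a code object once, up front.
code = compile(script_source, "<pipeline script>", "exec")

def run_on_buffer(x_values):
    """Run the precompiled script against one buffer's worth of data."""
    ins = {"X": np.asarray(x_values, dtype=np.float64)}
    outs = {}
    # The script sees only the names we hand it: ins and outs.
    exec(code, {"ins": ins, "outs": outs})
    return outs["Y"]

# Execute the same code object once per (hypothetical) PointBuffer.
results = [run_on_buffer(chunk) for chunk in ([1.0, 2.0], [10.0, 20.0, 30.0])]
```

The parsing cost is paid once, and each buffer only pays for executing the bytecode.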

-mpg