[pdal] Scheming

Mon Jul 11 15:22:50 EDT 2011

I think the key intent was that, inside the actual read function, I wanted
a fixed structure so that looking up a field never costs anything worse
than O(1).

The introduction of field-enums-as-ints was intended to accomplish this.
But I agree, there is a problem with a given field being used more than
once, or a given field being re-used with a different data type (we did
consider adding the datatype along with the "name", but that makes the enum
worse than it is).
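
To make the enum-as-int lookup and the reuse problem concrete, here is a
minimal C++ sketch. It assumes a schema that keeps one table slot per field
enum. SketchSchema, addDimension and indexOf are hypothetical names, and
m_dimensions/m_indexTable only borrow the member names mentioned in the
quoted message below; this is not the actual pdal code.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; this is not the actual pdal code.
enum Field { Field_X = 0, Field_Y, Field_Z, Field_COUNT };
enum DataType { Int32, Double };

struct Dimension {
    Field field;
    DataType type;
};

class SketchSchema {
public:
    SketchSchema() { m_indexTable.fill(-1); }  // -1 means "field not present"

    // Appends the dimension. The table has one slot per Field, so a later
    // dimension with the same Field replaces the earlier mapping.
    void addDimension(const Dimension& d) {
        m_dimensions.push_back(d);
        m_indexTable[d.field] = static_cast<int>(m_dimensions.size()) - 1;
    }

    // O(1): a plain array access, no search.
    int indexOf(Field f) const { return m_indexTable[f]; }

    std::size_t dimensionCount() const { return m_dimensions.size(); }

private:
    std::vector<Dimension> m_dimensions;
    std::array<int, Field_COUNT> m_indexTable;
};
```

Adding X as Int32 and later X as Double leaves both dimensions in the
vector, but the table can only point at one of them, which is the
field-reuse problem described above.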

(Another piece of the puzzle to keep in mind is this: a point, consisting
of N different fields, is given to a stage.  As far as that stage is
concerned, it likely will have no idea what the various fields actually are
-- it will typically only want to touch a few of them, if it can "find" the
kind of field it knows how to deal with.  For example, the crop filter
only understands X,Y,Z and will treat the rest as pass-throughs.)

One approach might be to do this: before the main read-point loop in a
stage, find the fields of interest and store them in an object.  This may be
worse than O(1), but it doesn't matter if it is outside the loop.  Then,
during the loop, only that new object is used which stores the indexes in a
way that can be found in O(1).  Look at the variables fieldX,fieldY,fieldZ
in CropFilter::processBuffer -- I did it there just for notational
convenience, but that's the idea I'm thinking of, although we would sort of
provide an infrastructure for it.
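
A minimal C++ sketch of that approach, under stated assumptions: fields
are found by name with a linear search, and a point is just a vector of
doubles. Dimension, XyzIndexes, findField, resolveXyz and cropPoints are
all hypothetical names for illustration, not the real pdal API or the
actual CropFilter::processBuffer.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical types for illustration; not the real pdal API.
struct Dimension {
    std::string name;
};

// Holds the positions a stage cares about, resolved once per buffer.
struct XyzIndexes {
    std::size_t x, y, z;
};

// Linear search: worse than O(1), but it runs outside the point loop.
std::size_t findField(const std::vector<Dimension>& dims,
                      const std::string& name) {
    for (std::size_t i = 0; i < dims.size(); ++i)
        if (dims[i].name == name) return i;
    throw std::runtime_error("field not found: " + name);
}

XyzIndexes resolveXyz(const std::vector<Dimension>& dims) {
    return XyzIndexes{findField(dims, "X"), findField(dims, "Y"),
                      findField(dims, "Z")};
}

using Point = std::vector<double>;  // one value per dimension

// Keeps points whose X, Y and Z all fall inside [lo, hi]; the other
// fields are never inspected and simply pass through.
std::vector<Point> cropPoints(const std::vector<Dimension>& dims,
                              const std::vector<Point>& points,
                              double lo, double hi) {
    const XyzIndexes idx = resolveXyz(dims);  // lookup cost paid once
    std::vector<Point> kept;
    for (const Point& p : points) {
        // Inside the loop: direct indexing only, O(1) per field.
        const bool inside = p[idx.x] >= lo && p[idx.x] <= hi &&
                            p[idx.y] >= lo && p[idx.y] <= hi &&
                            p[idx.z] >= lo && p[idx.z] <= hi;
        if (inside) kept.push_back(p);
    }
    return kept;
}
```

Because the stage resolves its own positions up front, it does not depend
on a schema-wide one-slot-per-field table while it is iterating.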

-mpg

> -----Original Message-----
> From: pdal-bounces at lists.osgeo.org [mailto:pdal-bounces at lists.osgeo.org]
> On Behalf Of Howard Butler
> Sent: Sunday, July 03, 2011 12:32 PM
> To: pdal at lists.osgeo.org
> Subject: [pdal] Scheming
> 
> 
> On Jul 2, 2011, at 7:39 AM, Howard Butler wrote:
> 
> > The Schema's dimension management needs to be addressed now.  Consider:
> >
> > - Scaling filter goes ints -> doubles.  This currently adds XYZ dims as
> > doubles and attempts to remove XYZ int dimensions by marking the
> > m_indexTable
> > - Reprojection filter happens
> > - descaling filter goes doubles -> ints.  This adds XYZ dims back as
> > ints.  Again the deletion doesn't quite work though.
> >
> > The net effect of this is our schema ends up with three sets of XYZ dims
> > in the m_dimensions vector, but only one Field_X is valid in the
> > m_indexTable.
> >
> > Ideas? Do we need m_indexTable at all?
> 
> 
> Moving this to the list...
> 
> Some other questions:
> 
> - It seems SchemaLayout exists only to calculate the cumulative size of
> all of the DimensionLayouts, but the order of the DimensionLayouts (to be
> added to the SchemaLayout) is implicitly determined by the dimension order
> as they are added to the pdal::Schema's vector.  Can you clarify the
> intent of separating Schema/SchemaLayout and Dimension/DimensionLayout?
> I'm a bit confused as to the advantages of keeping these separate as
> opposed to information that is updated on-the-fly as Dimensions are
> created or added to a Schema object.
> 
> - Do we want to bake in XML serialization of schema/dimension into those
> classes or keep them separate as we have them now?
> 
> Howard