[libpc] Re: Dimension/Schema discussion
Howard Butler
hobu.inc at gmail.com
Mon May 9 12:05:52 EDT 2011
On May 9, 2011, at 11:01 AM, Michael P. Gerlek wrote:
> I agree -- this is something I've been aware of, but not thought through to
> any solutions yet. What we have doesn't scale very well, and now that
> we've got more formats/features coming online, the problem is becoming
> clearer. I guess you're just the first one to hit it hard enough to decide
> we need to fix it :-)
>
> If we went with your "multi-index lookup, but called outside the PointBuffer
> loop" concept, one thing I'd like to see is we put back the concept of
> "readBegin/readEnd", which wrap the set of "read PointBuffer" calls. In
> this readBegin function, then, we could put the multi-index lookup code, so
> it would be truly only once per full read operation, as opposed to once per
> chunk of the full read.
Agreed.
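
To make the readBegin/readEnd idea concrete, here is a rough sketch of how the name lookups could happen once per full read rather than once per chunk. Everything in it is hypothetical: Schema, Reader, slotOf and readChunk are stand-ins invented for illustration and are not the current libpc classes, and a chunk is modelled as plain rows of doubles rather than a real PointBuffer.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a schema with a name-based index.
struct Schema {
    std::map<std::string, std::size_t> slotByName;
    mutable std::size_t lookups = 0;  // counts name lookups, for illustration only

    std::size_t slotOf(const std::string& name) const {
        ++lookups;
        auto it = slotByName.find(name);
        if (it == slotByName.end())
            throw std::runtime_error("no such dimension: " + name);
        return it->second;
    }
};

// Hypothetical reader: one row per point, one column per dimension slot.
class Reader {
public:
    explicit Reader(const Schema& schema) : m_schema(schema) {}

    // Called once per full read: resolve names to slots here.
    void readBegin() {
        m_xSlot = m_schema.slotOf("X");
        m_ySlot = m_schema.slotOf("Y");
        m_active = true;
    }

    // Called once per chunk: uses only the cached slots, no name lookups.
    double readChunk(const std::vector<std::vector<double>>& chunk) const {
        assert(m_active && "readBegin() must be called first");
        double sum = 0.0;
        for (const auto& point : chunk)
            sum += point[m_xSlot] + point[m_ySlot];
        return sum;
    }

    // Called once after the last chunk.
    void readEnd() { m_active = false; }

private:
    const Schema& m_schema;
    std::size_t m_xSlot = 0;
    std::size_t m_ySlot = 0;
    bool m_active = false;
};
```

With this shape, the number of map lookups stays at one per dimension no matter how many chunks are read between readBegin and readEnd.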
>
> Making this change would break a lot of code, but the breakage would be
> largely "syntactic" and we've got good unit tests, so I'd not be at all
> concerned about trying it.
Agreed. It'll bust a bunch of stuff but AFAIK, we're our only users at this point. Unit tests should preserve our sanity.
>
> Unfortunately, I'm booked up all today (including the chunked zip code), am
> out of the office all day tomorrow, and have some other things on my plate
> for the next couple days after that.
>
> How much of a block is this for you right now? Do you want to just dive in
> now, or do you want to bounce some ideas around for a while, or wait for me
> to do it later on, or..?
Not such a blocker for me at the moment, but I'm going to add BAG, TerraSolid .bin, and start cribbing up a Spirit-based text parser in the not too distant future here (next month or so). I wasn't trying to put this on your plate, and I'd be willing to take it on, but I wanted to get some rationale and pushback on it since my picture of it probably isn't as clear as yours.
>
> -mpg
>
>
>> -----Original Message-----
>> From: Howard Butler [mailto:hobu.inc at gmail.com]
>> Sent: Monday, May 09, 2011 8:43 AM
>> To: Michael Gerlek
>> Cc: libpc at lists.osgeo.org
>> Subject: Dimension/Schema discussion
>>
>> Michael,
>>
>> An immediate thing I noticed while implementing a non-LAS-conforming
>> driver Friday is the desire to name my dimensions appropriately. The names
>> of course are set in a static array right now, and there isn't any way to
>> override them.
>>
>> My question is can you describe how what we have now is different than
>> what I had been cooking in libLAS? How was this design arrived at?
>>
>> In libLAS, I had a boost::multi_index that included a random lookup and a
>> name-based lookup. Because of libLAS' point-at-a-time nature, this meant
>> doing map lookups in the critical path, which killed performance. But this
>> wouldn't be necessary with our PointBuffer-based approach, and dimension
>> positions within a schema could continue to be fetched from outside the
>> loop and passed into methods that would expect a position and a width to
>> fetch.
>>
>> We maintain an array of names, and a dimension type that is used to do
>> random lookups in the dimensions array of Schema. The slots are named to
>> avoid doing so in the critical path, and we do our dimension name ->
>> dimension slot lookup outside loops where we're fetching data. Could we
>> not continue to do that with the SchemaLayout maintaining the multi_index?
>>
>> My complaint is that as we add more and more data providers, the
>> Dimension Type system we've currently outlined is going to start feeling like
>> quite a straitjacket. The QFIT driver really needs two "X" dimensions
>> because there's an X dimension of the sensor and X dimension of the
>> measurement, per-point. So these are logically X, if we're to follow the
>> dimension type system, but they can't be modeled that way because we
>> can't have two X's.
>>
>> What could happen if we threw out the dimension type system we have
>> now? What would be the consequences? What if the user "marked" their
>> XYZ dimensions (the first three would be assumed to be XYZ if no
>> dimensions are marked) for use with bounds/windows/etc?
>>
>> Sorry for the still-a-bit-incompletely-formed-thoughts,
>>
>> Howard
>
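
As a rough illustration of the "marked XYZ" idea from the original message, the sketch below drops the fixed dimension type and identifies dimensions by name, with an optional role mark used only for bounds/windows. All names here (Role, Dimension, Schema, add, indexOf, xyz, and the "SensorX"-style dimension names) are hypothetical and chosen for illustration; this is not the libpc API, and the real QFIT dimension names may differ. The linear name search is a simplification standing in for a multi_index.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical role mark; Role::None means "just a named dimension".
enum class Role { None, X, Y, Z };

struct Dimension {
    std::string name;
    Role role = Role::None;
};

class Schema {
public:
    // Appends a dimension and returns its slot.
    std::size_t add(const std::string& name, Role role = Role::None) {
        m_dims.push_back(Dimension{name, role});
        return m_dims.size() - 1;
    }

    // Name -> slot lookup, intended to be called outside the point loop.
    std::size_t indexOf(const std::string& name) const {
        for (std::size_t i = 0; i < m_dims.size(); ++i)
            if (m_dims[i].name == name) return i;
        throw std::out_of_range("no such dimension: " + name);
    }

    // Slots to use for bounds/windows: the marked dimensions if any are
    // marked, otherwise the first three dimensions.
    std::array<std::size_t, 3> xyz() const {
        const std::size_t unset = m_dims.size();
        std::array<std::size_t, 3> slots = {{unset, unset, unset}};
        for (std::size_t i = 0; i < m_dims.size(); ++i) {
            if (m_dims[i].role == Role::X) slots[0] = i;
            else if (m_dims[i].role == Role::Y) slots[1] = i;
            else if (m_dims[i].role == Role::Z) slots[2] = i;
        }
        const bool noneMarked =
            slots[0] == unset && slots[1] == unset && slots[2] == unset;
        if (noneMarked) {
            if (m_dims.size() < 3)
                throw std::logic_error("need at least three dimensions");
            return {{0, 1, 2}};
        }
        if (slots[0] == unset || slots[1] == unset || slots[2] == unset)
            throw std::logic_error("X, Y and Z must all be marked");
        return slots;
    }

private:
    std::vector<Dimension> m_dims;
};
```

Under this scheme a QFIT-style schema could carry both a sensor X and a measurement X as ordinary named dimensions, and mark only the measurement triple for spatial operations.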