[Liblas-devel] Thoughts on a messy bug
Howard Butler
hobu.inc at gmail.com
Tue Jul 28 11:53:07 EDT 2009
All,
The LizardTech folks found a particularly nasty bug, although I'm not
sure they quite grasped how miserable it was <http://liblas.org/ticket/136
>. The problem comes down both an interpretation of the LAS
specification and some lazy programming on my part.
The interpretation resided in the length of the LAS header. We read
the spec to mean that a LAS header was the required bits (227 bytes) +
the VLRs + (2 pad bytes if the minor version was 1.0). The spec is
unclear if you want to write extra space in the header (accounting for
the pad bytes as necessary) and always depending upon the DataOffset
to determine where to start reading/writing points. Because of this
assumption, my lazy ass used the dataoffset as the bit to do
accounting for writing VLRs. Additionally, I was using this shorthand
accounting to pass information into the reader/writer about where to
start reading/writing. This meant that things could get out of sync
and it was impossible to account for header pad space.
At any rate, there are many files out in the wild that *do* have extra
space at the end of the VLRs before the beginning of the points. We
need to be able to read them no matter what, and it would be nice to
be able to write in such a fashion as well. I have a patch that
implements a fix for #136 and supports writing extra padding when the
dataoffset is greater than the space the VLRs would require (and
accounting for 2 pad bytes for minor version 1.0). I wrote this
against the 1.2 branch, and all tests pass.
The hesitation involves:
1) the patch adds a new method to the C/Python API because it is now
necessary to be able to manually set the dataoffset. No C++ API has
changed, however. Do we release a minor release with a C API addition?
2) the patch changes nearly 800 lines of code and feels quite
substantial (but much simpler, frankly)
3) the patch represents a significant behavioral change, especially
when writing files. The user can now set the dataoffset to whatever
they wish, and when VLR writing happens, it checks if it has enough
space. If it doesn't, the writer expands the header for them to fit.
If the user sets the dataoffset so large that there's no problem
fitting the VLRs in, \0's are written in between the end of the VLRs
to the dataoffset byte.
My questions are:
a) should we release this fix plus the rest of the 1.2 stuff as 1.2.1
late this week <http://liblas.org/milestone/1.2.1>
b) should this constitute a major release (I would propose to call it
1.4 so as to not confuse with the LAS spec due to no 1.3 support in
trunk currently).
Howard
More information about the Liblas-devel
mailing list