[postgis-devel] LWGEOM -- initial version ready for testing
David Blasby
dblasby at refractions.net
Fri May 7 09:40:07 PDT 2004
Mark Cave-Ayland wrote:
> Ah I see, although this wasn't what I was thinking when I wrote the
> email! I was thinking about that since a point has its own bounding box
> then would it be a waste of space to specify the same information
> contained with the point from the bounding box.
True, but there is still a real cost to looking at the
geometry, checking to see if it's a point, then constructing the BOX3D,
then converting it to a BOX2DFLOAT4. It's not much per call, but it adds
up for the nested queries.
> My only concern would that LWGEOM would fail an OGC regression test that
> used data up to the full precision of a double. I also know that some of
> our lat/long datasets can have precision because they were geocoded
> against hi-resolution raster imagery. Although with LWGEOM, I thought
> that internally everything would still be a double except for the
> bounding boxes in the GiST index? Then again, thinking about it now, I
> suppose we would need the 'true' double precision bounding box stored in
> the actual geometry for the RECHECK operator.
None of the operators are in the OGC spec. The ones I'm talking about
are "&&", and the almost-never used ones like '<&'. Personally, I'd
like to see all but && and maybe contains/contained removed.
If you were to do an intersects(g1,g2), you'll be using the actual
double-precision coordinates.
The idea of the operators is to do two-stage queries:
SELECT ... FROM <table> WHERE g && <geometry>
AND <actual function>;
As long as the first stage (&&) gives you correct results, the 2nd stage
will always work. The way I construct the box2d, the && operator will
never give you a false negative.
>>You should find that it performs a wee bit slower than
>>postgis, but it
>>takes *significantly* less space.
>
>
> Interesting. So while it may be slightly slower to begin with, it should
> still scale better for the reason that there is less data to pull off
> disk, no?
Well - this depends on how well your system disk cache is working. It's
really hard to test true speed because it takes a LONG time to be
reasonably sure the cache is empty (know any way to flush the cache
under linux?).
I'm just in the process of creating a table with 3 geometry columns in
it - a 2-point LINESTRING and two points. There's going to be 17,000,000
entries in it.
Under PostGIS, the table would take 7.4 GB. Under LWGEOM, it only takes
a little over 1 GB. You can imagine how much faster a sequential scan
would go on the LWGEOM. For index searches, it's much more likely that
the disk cache would actually help.
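To put those sizes in per-row terms, a quick back-of-the-envelope
calculation in Python. It assumes 1 GB means 10^9 bytes and treats "a
little over 1 GB" as exactly 1 GB, so the LWGEOM figure is a lower bound.
The function name bytes_per_row is made up for this example.

```python
ROWS = 17_000_000   # number of entries quoted above
GB = 10**9          # assumption: decimal gigabytes


def bytes_per_row(total_gb, rows=ROWS):
    """Average on-disk bytes per row for a table of the given total size."""
    return total_gb * GB / rows


postgis = bytes_per_row(7.4)   # table size under PostGIS
lwgeom = bytes_per_row(1.0)    # lower bound for the LWGEOM table

print(round(postgis), round(lwgeom), round(postgis / lwgeom, 1))
```

That works out to roughly 435 bytes per row under PostGIS against about
59 under LWGEOM, for three small geometries per row.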
If the LWGEOM stuff was a little more tested, I'd use it - it takes 4
days of processing to make the table, so I don't want to screw it up.
dave