[postgis-users] Re: Re bug with GIST indexes

Mon Mar 17 09:45:04 PST 2003

>
>
>>I agree with you 100% - but also remember that the operators 
>>(like "&&") 
>>work on bounding boxes *not* the actual geometries.  Pretty soon (as 
>>soon as GEOS gets integrated), we'll see routines like:
>>
>>SELECT * FROM data WHERE data.mygeom && <geom> and 
>>intersects(data.geom, 
>><geom>)
>>
>>Meaning, use the index for a parital (BOX) solution, then the actual 
>>function to get the correct solution (GEOMETRY).
>>    
>>
>
>I don't disagree with the format of the query - for the example above I
>can see why that would be completely valid. But in the case of the @
>operator it's verbal definition would have to be written as 'roughly
>indicates whether A contains B' and then having to verify with another
>function just seems plain wrong...
>  
>

The operators (like "@" and "&&") only work on the BOUNDING BOX of  the 
geometries, so they are finding a partial soultion.   The new GEOS 
functions are full solutions to the problem.  For example A "@" B is 
true if one bounding box is inside the other - contains(A,B) is true if 
one geometry is inside the other.

All the operators are fast (and indexable) approximations to underlying 
predicates.

I do agree, that the operators should give the correct (ie. no extras) 
solution.  I think this bug is do to the code in the consistent() function.

>>I've heard that they will be query re-write rules in postgresql 7.4 - 
>>have you looked into any of them?  I'm hoping that a user query like:
>>
>>SELECT * FROM data WHERE intersects(data.geom, <geom>)
>>can get auto-magically turned into an indexible statement 
>>like: SELECT * FROM data WHERE data.mygeom && <geom> and 
>>intersects(data.geom, 
>><geom>)
>>    
>>
>
>No, although that does look interesting, espectially if it could be
>controlled by entries that could be inserted by PostGIS into the system
>tables.
>
I'd love to find out more about this, its really powerfull to have 
something like:

SELECT * FROM <table>  WHERE distance(table.geom, <geom>) < 100

automagically turned into something like this that will use the index:

SELECT * FROM <table> WHERE distance(table.geom, <geom>) < 100 AND 
table.geom && expand_bbox(<geom>,100)

>Yes, this would make sense. However, what is the support for PostGIS 3D
>spaces like? Should the BOX3D be deprecated for BOX2D? It doesn't look
>difficult to extend the R-Tree code into 3 dimensions if people would
>want to use spatially indexed 3D queries.... But looks like JTS doesn't
>do 3D :(
>  
>
Yes - this is unfortunate that JTS doesnt really much is 3D space.  Most 
GIS is actually 2.5D anyways (like JTS).  To
have full 3d support, you need to be able to define volumes instead of 
polygons (areas) - this is really advanced stuff!

For example, imagine a square in 3d, on the XY plane.  Now take one of 
the corners and move its vertex straight up.  The
resulting shape is very difficult to describe!

>Hmmm, I see what you mean.... But this is actually the beginning of a
>very cunning plan. How about modifying Compress() to convert geometries
>to BOX2Ds by taking their bounding box, and modifying Decompress() to
>convert BOX2Ds back into BBOXONLY geometries?
>
>This then would mean that the index entries would be stored on the disk
>as BOX2Ds taking minimal space, while it would still be possible for the
>Consistent() routines to call functions that take two geometries as a
>parameter, plus it gets rid of the problem above where you would be
>storing excess information if you were directly storing the GEOMETRY
>type. The only thing I can see here is that every BOX2D would still need
>to have a constant SRID stored on disk which seems wrong given that the
>value will be constant for every index... Is there no way we could
>extract this information from the geometry_columns table given the
>information in the GiST entry structure? But I guess this would slow
>things down a bit?
>  
>
Ah - I see where you're comming from!
An alternative would be to have all the consistent() functions take 
BOX2Ds, and have the
geometry implementations convert the geometry to BOX2D and call the 
BOX2D versions.
I say this because the its very likely that the bbox_inside() functions 
will be called much more
often from the index than directly.  It seems like a fair amout of work to:
    1. pull BOX2D from index
    2. ensure SRID match
    3.  convert BOX2D -> GEOMETRY
    4.  hand off to GEOMETRY bbox function

Instead of
   1. pull BOX2D from index
   2. ensure SRID match
   3. hand-off to BOX2D function

It does suck to have to put the SRID in every BOX2D when its completely 
redundant.  To bad you couldnt
have meta-information stored at the top level of the index so you could 
verify a SRID match before the
index search even begins to traverse.

I think its a bit risky to try to put the information in the 
geometry_columns table.  
I know I have a bee-in-my-bonnet about auto-rewriting queries, but:

SELECT * FROM table WHERE table.geom && <geom>

converted to:

SELECT * FROM table WHERE table.geom && <geom> and 
assert_consistent_srid('table',<geom>)

where assert_consistent_srid() would verify that the geometry_columns 
entry's srid is the same as the <geom> SRID.  Of course, what do you do 
if there is no entry in the geometry_columns table?

dave