[postgis-users] Improving Performance

Thu Feb 19 13:14:26 PST 2004

Ralph Mason wrote:

> I think it wouldn't be too difficult to benchmark the bounding box being 
> floats / doubles / gone altogether.  The optimum setting is likely to be 
> based on the actual machine it's running on.  I wouldn't be surprised if 
> no bounding box was actually the fastest on a modern fast processor with 
> a good sized cache.

You could be right - for simple geometries with just a few points, 
constructing the bounding box 'when required' could be more efficient 
than storing it with every geometry.

This is certainly true for index searches (since the bounding box is 
pre-computed and stored in the index).  This is the normal use for 
bounding box operations.  The only other real uses for them are (1) 
the "funny" bbox operations (contains/within), which aren't used very 
often, and (2) envelope() function calls.  The impact shouldn't be too bad.

Unfortunately, for sequential scans, we would have to compute the 
bounding box for all the geometries in a table for each query.  This is 
a fairly high-cost operation - it's unlikely that the CPU time spent 
computing these will be lower than the disk access time for loading the 
extra 16 bytes/geometry.
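
To make the trade-off concrete, here is a minimal Python sketch (not PostGIS code) of building a bounding box on demand from a geometry's points, next to the size of a stored box. It assumes the stored box is four single-precision floats, which is one way to arrive at 16 bytes per geometry; the function name compute_bbox is made up for illustration.

```python
import struct

def compute_bbox(points):
    """Return (xmin, ymin, xmax, ymax) for a non-empty sequence of (x, y) pairs."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))

# Assumption: a stored 2d box is four single-precision floats.
STORED_BBOX_BYTES = struct.calcsize("<4f")

line = [(0.0, 0.0), (2.0, 5.0), (-1.0, 3.0)]
bbox = compute_bbox(line)  # one pass over every point, repeated for each query
```

The on-demand cost grows with the number of points, while the stored box costs a fixed number of bytes per row, which is why the answer differs between small and large geometries.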

> I don't think it needs to lead to a proliferation of types.  Just 
> another type geometry_2d or something like that.  I am also in favor of 
> removing the projection, so that functions working with 2d geometries 
> don't need to consider it.
> It would be interesting to know for sure, but I suspect that most users 
> of postgis are using 2d geometries and all their data is in one 
> projection.  Meaning that a faster smaller 2d type would probably make 
> up the bulk of the use, with the full geometry being used to massage 
> data into the smaller type.

This is why I'm promoting the WKB version.  The WKB version supports 
both 2d and 3d points.  The OGC Simple Features for SQL specification 
has the full definition of WKB, but here's a 2d point and a 3d point:

2d point (25 bytes):
<int32>  // postgresql variable-length datatype overhead
<byte>   // endian flag
<int32>  // type ("2d point")
<double> // "X"
<double> // "Y"

3d point (33 bytes):
<int32>  // postgresql variable-length datatype overhead
<byte>   // endian flag
<int32>  // type ("3d point")
<double> // "X"
<double> // "Y"
<double> // "Z"
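
As a quick check of the byte counts, the two layouts can be packed with Python's struct module. This is only a sketch: the 2d point type code 1 is the OGC value, but the 3d type code used here (1001) is an assumption, since the layout above does not say which code a 3d point would get. The 4-byte PostgreSQL overhead is added separately.

```python
import struct

WKB_POINT = 1        # OGC WKB type code for a 2d point
WKB_POINT_Z = 1001   # assumption: the layout above does not name the 3d type code
VARLENA_HEADER = 4   # postgresql variable-length datatype overhead (int32)

def point_wkb(x, y, z=None, type_3d=WKB_POINT_Z):
    """Encode a point as little-endian WKB: endian flag, type, coordinates."""
    if z is None:
        return struct.pack("<BIdd", 1, WKB_POINT, x, y)
    return struct.pack("<BIddd", 1, type_3d, x, y, z)

p2 = point_wkb(1.0, 2.0)
p3 = point_wkb(1.0, 2.0, 3.0)
print(len(p2) + VARLENA_HEADER, len(p3) + VARLENA_HEADER)
```

Run as-is, this prints 25 33, matching the sizes listed above.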

The definitions of things like linestring, polygon, multipoint, 
multilinestring, multipolygon, and geometrycollection are pretty much 
straightforward.
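
For example, a 2d linestring in WKB is the endian flag, the type code (2), a point count, and then the raw X/Y doubles. A rough Python sketch of encoding and decoding one follows; the function names are hypothetical, and the PostgreSQL length overhead is left out.

```python
import struct

WKB_LINESTRING = 2  # OGC WKB type code for a 2d linestring

def linestring_wkb(points):
    """Encode (x, y) pairs as little-endian WKB: flag, type, count, coordinates."""
    out = struct.pack("<BII", 1, WKB_LINESTRING, len(points))
    for x, y in points:
        out += struct.pack("<dd", x, y)
    return out

def linestring_from_wkb(data):
    """Decode a 2d linestring, honouring the endian flag in the first byte."""
    endian = "<" if data[0] == 1 else ">"
    wkb_type, n = struct.unpack_from(endian + "II", data, 1)
    if wkb_type != WKB_LINESTRING:
        raise ValueError("not a 2d linestring")
    coords = struct.unpack_from(f"{endian}{2 * n}d", data, 9)
    return list(zip(coords[0::2], coords[1::2]))

wkb = linestring_wkb([(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)])
```

The encoded size is 9 bytes of header plus 16 bytes per point, so there is no per-point type or flag overhead.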

If we do full support for WKB, then you'll be able to store 2d and 3d 
geometries natively and efficiently!

NOTE: I haven't put an SRID (int32) in these structures.

Logically, you should still be able to do something like this:

SELECT    asBinary(Transform(setSRID(<wkb>,111), 222)) ;

This will convert the WKB to a GEOMETRY and give it an SRID of 111.  The 
geometry would then be transformed to SRID 222.  Then it's converted back 
to WKB.

You'll also be able to do things like:

SELECT asBinary(  intersection(<WKB 1>, <WKB 2>)) ;

> Anyway I am keen to support any effort, with time and or a donation.

Excellent!

dave