[postgis-users] question on gist performance

Mark Cave-Ayland mark.cave-ayland at ilande.co.uk
Fri Sep 14 03:24:01 PDT 2007


On Thu, 2007-09-13 at 15:26 +0200, Stefan Zweig wrote:
> Mark,
> 
> thank you very much for your response. As far as I understand, pre-computed bounding boxes are stored (by default) to each Geometry object (wherever that is) to "make bounding box based queries faster", see
> 
> http://main.merlin.com.ua/doc/postgis/docs/ch06.html#hasbbox
> http://main.merlin.com.ua/doc/postgis/docs/ch06.html#addbbox
> 
> A full scan wouldn't be necessary if hasbbox(the_geom) returns true, which would make *much* sense here. Obviously this cached information is not evaluated when determining the bounding box while RECHECKing? Is this a "bug"?

Sure. But the bounding box is stored as part of the heap geometry, so
for every index match PostgreSQL must load the entire geometry into
memory first and decompress it before it can get to the bounding box
info. If you have a geometry with several thousand points, the idea of
the BBOX cache is to ensure that SELECT extent(my_big_geom) returns in a
fixed amount of time, rather than having to linearly scan the entire set
of points within the geometry.

I'm not saying there definitely isn't a bug - did you try removing the
RECHECK clause as per my previous email? If that doesn't solve the
problem then send a test case and I will look into it.

> Mark, I have another question. I want to find out the *fastest* way to retrieve the binary content of Geometry objects to perform conversions later on. I found that SELECT the_geom returns a Geometry object in the hexewkb format and does not require any conversion operations by the database. Now, some questions occurred:
> 
> a) Each selected Geometry object seems always to be copied into memory before one can access it, even static object data. Is there any "clean" way to access the raw data stored in the associated table files?

If you want to delve into PostgreSQL innards, you could open the files
directly and read the contents - but this really isn't recommended for
beginners. Perhaps using a cursor would be useful to you?

> b) hexewkb data must always be converted to bytes first in order to get integers or doubles. Is there any way to store geometric data directly as ewkb or wkb (btw, this would save 50% disc space)?

All geometries are stored internally as binary - they are just converted
to/from HEXEWKB when accepting input from the user or displaying it back
to the user.

> c) double precision is not necessary to store coordinates if you know the SRID and the required accuracy. Is there any way to store geometric data in a format other than double, for example, float or int?

No is the quick answer to this.

> d) How can I find out what disc files belong to a table, and how much disc space a particular table needs?

You'll probably find this reference handy:
http://www.postgresql.org/docs/8.2/interactive/functions-admin.html.


ATB,

Mark.

-- 
ILande - Open Source Consultancy
http://www.ilande.co.uk





More information about the postgis-users mailing list