[postgis-users] Documenting performance tip

strk at refractions.net strk at refractions.net
Fri Feb 4 01:02:57 PST 2005


Markus, thanks for the insight. 
A side-effect of your suggestion is that users will end up
actually using box2d type in their tables.
I think they should be warned that those column won't retain
their original precision between dump/reloads.
More so now that box2d canonical ascii output has been changed
to use default precision (%g).
What do you think ?

--strk;

On Thu, Feb 03, 2005 at 02:11:24PM +0100, Markus Schaber wrote:
> Hi, Strk,
> 
> strk at refractions.net schrieb:
> 
> > Beside writers skills this needs full understanding of the problem.
> > If you can document the problem and the tip it would be helpful.
> 
> Okay, I'll give it a try:
> 
> - snip -
> 
> *** Performance tip for small tables of large geometries ***
> 
> * Problem Description *
> 
> Current PostgreSQL versions (including 8.0) suffer from a query
> optimizer weakness regarding TOAST tables. TOAST tables are a kind of
> "extension room" used to store large (in the sense of data size) values
> that do not fit into normal data pages (like long texts, images or
> complex geometries with lots of vertices), see
> http://www.postgresql.org/docs/8.0/static/storage-toast.html for more
> information).
> 
> The problem appears if you happen to have a table with rather large
> geometries, but not too much rows of them (like a table containing the
> boundaries of all european countries in high resolution). Then the table
> itsself is small, but it uses lots of TOAST space. In our example case,
> the table itsself had about 80 rows and used only 3 data pages, but the
> TOAST table used 8225 pages.
> 
> Now issue a query where you use the geometry operator && to search for a
> bounding box that matches only very few of those rows. Now the query
> optimizer sees that the table has only 3 pages and 80 rows. He estimates
> that a sequential scan on such a small table is much faster than using
> an index. And so he decides to ignore the GIST index. Usually, this
> estimation is correct. But in our case, the && operator has to fetch
> every geometry from disk to compare the bounding boxes, thus reading all
> TOAST pages, too.
> 
> To see whether your suffer from this bug, use the "EXPLAIN ANALYZE"
> postgresql command. For more information and the technical details, you
> can read the thread on the postgres performance mailing list:
> http://archives.postgresql.org/pgsql-performance/2005-02/msg00030.php
> 
> * Workarounds *
> 
> The PostgreSQL people are trying to solve this issue by making the query
> estimation TOAST-aware. For now, here are two workarounds:
> 
> The first workaround is to force the query planner to use the index.
> Send "SET enable_seqscan TO off;" to the server before issuing the
> query. This basically forces the query planner to avoid sequential scans
> whenever possible. So it uses the GIST index as usual. But this flag has
> to be set on every connection, and it causes the query planner to make
> misestimations in other cases, so you should "SET enable_seqscan TO on;"
> after the query.
> 
> The second workaround is to make the sequential scan as fast as the
> query planner thinks. This can be achieved by creating an additional
> column that "caches" the bbox, and matching against this. In our
> example, the commands are like:
> 
> SELECT addGeometryColumn('mydb','mytable','bbox','4326','GEOMETRY','2');
> UPDATE mytable set bbox = setsrid(box3d(geom_column)::geometry, 4326);
> 
> Now change your query to use the && operator against bbox instead of
> geom_column, like:
> 
> SELECT geom_column FROM mytable WHERE bbox && setsrid('BOX3D(0 0,1
> 1)'::box3d,4326);
> 
> Of yourse, if you change or add rows to mytable, you have to keep the
> bbox "in sync". The most transparent way to do this would be triggers,
> but you also can modify your application to keep the bbox column current
> or run the UPDATE query above after every modification.
> 
> - snap -
> 
> The question is where to place it. I think adding it to paragraph 4.5.2
> in the doc would bloat it too much, so introducing a 4.5.3 might be the
> best.
> 
> Markus
> --
> markus schaber | dipl. informatiker
> logi-track ag | rennweg 14-16 | ch 8001 zürich
> phone +41-43-888 62 52 | fax +41-43-888 62 53
> mailto:schabios at logi-track.com | www.logi-track.com



> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users




More information about the postgis-users mailing list