[postgis-devel] Implement New Geometry Index Method?

Sandro Santilli strk at keybit.net
Wed May 21 02:13:10 PDT 2014


On Wed, May 21, 2014 at 09:02:57PM +1200, Jeremy Palmer wrote:
> Hi PostGIS Devs,
> 
> I have a daily process that I would like to optimise. The process includes a spatial intersects join between a small administration boundaries table for New Zealand that contains large geometries (100,000+ points per polygon) and a large dataset with small line strings.
> 
> The first iteration of SQL I developed for the process I used standard join (WHERE ST_Interesects(admin.geom, other.geom)) and it took 4 hours to complete. No spatial index was used. Not acceptable.
> 
> Based on a recommendation from the PostGIS manual (http://postgis.net/docs/manual-1.5/ch06.html#id368055) I then created a bounding box column for each of the admin boundaries, built an index for this column and managed to reduce the processing time to 40mins. Spatial index was used.  This was still too slow.
> 
> I then found an article here http://www.kindle-maps.com/blog/splitting-a-large-polygon-into-a-grid.html and split the large admin boundaries into grid cells (5kmx5km) then created a spatial join query using the new gridded boundaries. This resulted in a query time 20secs. Much better! One of the side effects was the gridding process took about 10mins to generate. However the admin boundaries only change once a year, so it's not a large problem.
> 
> On reflection this gridding processing seems like a replication of a core database feature - indexing. Would it not be best to implement a new index method to do this type of spatial grid indexing? This would allow for simplified creation, maintenance, size and access(?). It would also provide a great core feature for PostGIS. If PostGIS Devs think this is a good idea I have funding to make this happen.

I agree it would be good to havea "gridding" function in PostGIS, but I don't
really see it as being part of "indexing". Gridding produces better indexes
but also augment the number of records so it takes some user design to decide
how to layout the data.

One thing to keep in mind is that splitting on a regular grid would
necessarely introduce new vertices which in turn would slighly move the
original edges and possibly introduce collapses/invalidities. 

--strk;



More information about the postgis-devel mailing list