[postgis-users] How to improve performance of ST_Within in a small table with large geometries

Paul Ramsey pramsey at opengeo.org
Wed May 13 08:08:17 PDT 2009


Make the big things smaller.

- Break the multi-polygons into single polygons
- Cut the single polygons up into even smaller things using a grid
- Parallelize by having one query check for containment with one
subset of polygons and others check for containment on a
second/third/fourth (this approach might break down if you get I/O
bound and start thrashing, but to the extent that the operation
remains CPU bound it will help)

P.

2009/5/13 Jorge Arévalo <jorge.arevalo at gmail.com>:
> Hello,
>
> As I said in these threads:
>
> http://postgis.refractions.net/pipermail/postgis-users/2009-May/023369.html
> http://postgis.refractions.net/pipermail/postgis-users/2009-May/023415.html
>
> We're working with a table of large geometries (multipolygons) and small
> number of rows. We want to check in which multipolygon of this table are the
> points of another table. This another table has ~8 million of points. The
> SQL code is:
>
> --loop for all polygons. T1 has polygons, but we use the center, a point, to
> do the Within
> for i in select ogc_fid from T1 LOOP
>     update T1
>     set field_to_update = (select updating_field  from T2 where
> ST_Within(ST_Centroid(T1.wkb_geometry), the_geom) LIMIT 1 OFFSET 0)
>     where ogc_fid = i;
> END LOOP
> --end of loop
>
> (We have to update one table with a field of the another one, but the
> important part is the ST_Within part)
>
> In the worst case (point is in the last multipolygon) the query takes 1sec.
> In the best case, takes 16msec. If we take into account the worst case only
> (this is, all the queries will take 1sec), we're talking about 93 days... We
> need to improve the performance of this.  So, we tried:
>
> - Force using index instead of sequential scan (SET enable_seqscan TO off).
> It didn't improve the speed
> - Use bounding box of multipolygons instead the multipolygons (ST_Envelope).
> We can't do this because bounding box doesn't take into account
> InnerBoundaries inside multipolygons, and we have multipolygons with "holes"
> filled of other multipolygons
> - Clustering of indexes. It didn't increase the speed
> - Force_2D(). It didn't increase the speed significantly
>
> We're using PostGIS 1.3.5 with GEOS and PROJ support, over PostgreSQL 8.1.1.
> Our operating system is Suse Linux Enterprise. The hardware is an Intel XEON
> 2.5 GHZ (64 bits) and 4GB RAM.
>
> How could we improve the performance of this? 93 days are too much time. Is
> it possible to take advantage of the 4-core processor, for example?
>
> Thanks in advance
> Best regards
>
> Jorge
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
>



More information about the postgis-users mailing list