[postgis-users] Query optimization using ST_DWithin
"João G."
joaofgo at gmail.com
Mon Jul 6 15:28:55 PDT 2009
Hi!
I’m using a land use/land cover dataset with roughly 93000 polygons for
purposes of ecological research with the following query:
SELECT a.gid AS a_gid,
b.gid AS b_gid,
b.gridcode AS b_class,
st_area(b.the_geom) AS b_area,
st_perimeter(b.the_geom) AS b_perimeter,
st_distance(a.the_geom,b.the_geom) AS patch_dist,
st_distance(st_centroid(a.the_geom),st_centroid(b.the_geom)) AS c2c_dist
INTO public.cantanhez03_ftable_fd0100
FROM public.lulc_cantanhez_c03 AS a,
public.lulc_cantanhez_c03 AS b
WHERE st_dwithin(a.the_geom, b.the_geom, 100) AND a.gid <> b.gid
ORDER BY a_gid, b_gid, b_class
The general idea is to retrieve for each patch/polygon the set of
patches that are within a range of 100 m excluding the initial/focal
patch from the analysis (i.e.: a.gid <> b.gid) and some area, perimeter
and distance measurements. The geometry column is indexed with GisT
(CREATE INDEX "lulc_cantanhez_c03_idx1" ON "public"."lulc_cantanhez_c03"
USING gist ("the_geom" "public"."gist_geometry_ops");). Some geometries
extend throughout the area with a large number of vertices (84121,
25214, 9873, 9485, 8991, 7340, 5219, 5066, 4653, 4439, 4275, 4267, 3620,
3556, 3288,…) which translates in this cases into a high number of
vicinal patches. I’m using POSTGIS="1.3.6" GEOS="3.1.0-CAPI-1.5.0"
PROJ="Rel. 4.6.1, 21 August 2008" USE_STATS; PostgreSQL 8.2.13 on
i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2 (mingw-special) on
a Win32 XP Pro development machine.
This query can take 12 to 15H on a Pentium D 3.2GHz with 2 GB RAM to
fully process the dataset. Is there a way to optimize/improve this? What
should be the approach: server parameters optimization, query redesign,
tilling, sub-setting?
Thanks in advance:
João G.
More information about the postgis-users
mailing list