[postgis-users] geomunion revisited....

Kevin Neufeld kneufeld at refractions.net
Mon Dec 17 13:58:43 PST 2007


Very nice, Craig.  This is definitely the better approach when 
attempting to union a large set of geometries instead of the naive 
aggregate approach st_union currently employs.  Hopefully we can add 
something similar in PostGIS soon.
Cheers,
-- Kevin

-------------
Kevin Neufeld
Software Developer
Refractions Research Inc.
300-1207 Douglas St.
Victoria, B.C., V8W 2E7

Phone: (250) 383-3022
Email: kneufeld at refractions.net


Craig de Stigter wrote:
> I had to Union geometries with a high degree of overlap, and the 
> ST_Union function was far too slow (union on 18,000 polygons took over 
> two hours.)
>
> So I wrote a pl/pgsql implementation of CascadedUnion, which will tide 
> me over till PostGIS gets its own CascadedUnion.
>
> The first function is a convenience function that just discovers the 
> initial extent of the table and calls the second function.
> The second function is recursive and splits the extent until there are 
> fewer than a certain number of points, calling ST_Union on such subsets.
>
> So for example:
>
>     SELECT CascadedUnion('public', 'mytable', 'the_geom', 1000);
>
> will split the bounding box recursively until there are fewer than 
> 1000 geometries in each box, and call ST_Union on each of these subsets.
>
> For my (highly overlapping) data set, this does in 18 seconds what a 
> plain ST_Union took more than two hours to do. For non-overlapping 
> datasets it will work out slower.
>
> Heres the code:
>
>  
>
>     CREATE OR REPLACE FUNCTION CascadedUnion(character varying,
>     character varying, character varying, integer) 
>
>     RETURNS geometry AS
>     $BODY$
>     DECLARE
>         curs1 refcursor;
>         bounds BOX2D;
>     BEGIN
>         OPEN curs1 FOR EXECUTE 'SELECT extent(' || $3 || ') FROM ' ||
>     $1 || '.' || $2;
>         FETCH curs1 INTO bounds;
>         RETURN CascadedUnion($1, $2, $3, $4, bounds);
>     END;
>     $BODY$
>     LANGUAGE 'plpgsql' VOLATILE;
>
>     CREATE OR REPLACE FUNCTION CascadedUnion(character varying,
>     character varying, character varying, integer, box3d)
>     RETURNS geometry AS
>     $BODY$
>     DECLARE
>         end_geom GEOMETRY;
>         bounds BOX3D;
>         bounds1 BOX3D;
>         bounds2 BOX3D;
>         numfeatures integer;
>         curs1 refcursor;
>         curs2 refcursor;
>         max_x real;
>         max_y real;
>         min_x real;
>         min_y real;
>         result1 geometry;
>         result2 geometry;
>         emptygeom geometry;
>     BEGIN
>         max_x := xmax($5);
>         min_x := xmin($5);
>         max_y := ymax($5);
>         min_y := ymin($5);
>
>         select into bounds $5;
>         OPEN curs1 FOR EXECUTE 'SELECT count(*) FROM ' || $1 || '.' ||
>     $2 || ' WHERE ' || $3 || ' && setsrid(box2d(''BOX(' || min_x || '
>     ' || min_y || ',' || max_x || ' ' || max_y || ')''), srid(' || $3
>     || ')) AND intersects(centroid(' || $3 || '),
>     setsrid(box2d(''BOX(' || min_x || ' ' || min_y || ',' || max_x ||
>     ' ' || max_y || ')''), srid(' || $3 || ')))';
>         FETCH curs1 INTO numfeatures;
>
>         IF numfeatures < $4 THEN
>             OPEN curs2 FOR EXECUTE 'select st_union(' || $3 || ') from
>     ' || $1 || '.' || $2 || ' where ' || $3 || ' &&
>     setsrid(box2d(''BOX(' || min_x || ' ' || min_y || ',' || max_x ||
>     ' ' || max_y || ')''), srid(' || $3 || ')) AND
>     intersects(centroid(' || $3 || '), setsrid(box2d(''BOX(' || min_x
>     || ' ' || min_y || ',' || max_x || ' ' || max_y || ')''), srid('
>     || $3 || ')))';
>             FETCH curs2 INTO end_geom;
>         ELSE
>             --split in longest dimension
>             IF max_x - min_x > max_y - min_y THEN
>                 bounds1 := makebox3d(makepoint(min_x, min_y),
>     makepoint((max_x+min_x)/2, max_y));
>                 bounds2 := makebox3d(makepoint((max_x + min_x)/2,
>     min_y), makepoint(max_x, max_y));
>             ELSE
>                 bounds1 := makebox3d(makepoint(min_x, min_y),
>     makepoint(max_x, (max_y+min_y)/2));
>                 bounds2 := makebox3d(makepoint(min_x,
>     (max_y+min_y)/2), makepoint(max_x, max_y));
>             END IF;
>
>             --recurse
>             SELECT INTO result1 CascadedUnion($1, $2, $3, $4, bounds1);
>             SELECT INTO result2 CascadedUnion($1, $2, $3, $4, bounds2);
>             SELECT INTO emptygeom
>     geomfromtext('GEOMETRYCOLLECTION(EMPTY)', find_srid($1, $2, $3));
>             SELECT INTO end_geom st_union(coalesce(result1,
>     emptygeom), coalesce(result2, emptygeom));
>         END IF;
>         RETURN end_geom;
>
>     END;
>     $BODY$
>     LANGUAGE 'plpgsql' VOLATILE;
>
>
>
> Regards
> Craig
>
> One Track Mind Ltd.
> PO Box 1604, Shortland St, Auckland, New Zealand
> Phone +64-9-966 0433 Fax +64-9-969 0045
> Web http://www.onetrackmind.co.nz
> ------------------------------------------------------------------------
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>   



More information about the postgis-users mailing list