[postgis-users] ST_Difference lots of lines with lots of polygons

Kevin Neufeld kneufeld at refractions.net
Fri Dec 18 20:52:52 PST 2009


Hmm.  Another use case for having topology in PostGIS ... if only :)

Well, I think my approach would be to iterate through your polygons and 
do as you suggest, use ST_Difference, but use a collection of all the 
lines that overlap the polygon and the polygon as parameters.

1. First identity all the lines that potentially overlap a polygon
CREATE TABLE overlapping_contours AS
SELECT a.id AS poly_id, b.id AS cont_id, b.geom
FROM poly a, contours b
WHERE a.geom && b.geom;

2. Remove all original contour lines so you don't get duplicates later
DELETE FROM contours WHERE id IN (SELECT cont_id FROM overlapping_contours);

3. Create multilinestring of the contours around every polygon (to be 
used as input into ST_Difference)
CREATE TABLE grouped_contours AS
SELECT poly_id, ST_Collect(geom) AS geom
FROM overlapping_contours
GROUP BY poly_id

4. Subtract the polys from the multilinestrings
UPDATE grouped_contours a
SET geom = ST_Difference(a.geom, b.geom)
FROM polys b
WHERE a.poly_id = b.id;

5. Expand and insert the mlines back into the contours table
INSERT INTO contours (geom)
SELECT (ST_Dump(geom)).geom
FROM grouped_contours;

None of this code is tested, and obviously you'll need to add indexes to 
your intermediate tables, but it might be a start.  Note that PostgreSQL 
does not use more than one CPU at a time per query.  I could see this 
taking quite a long time.  To speed things up, you might try 
partitioning your polygons into 4 logical groups and run things 
concurrently.  IE, in step one, add the filter AND a.id < 50000.  Then, 
at the same time, run another query using AND a.id >=50000 AND a.id < 
100000.  Just a thought.

Cheers,
Kevin

Christy Nieman wrote:
> Hello,
>
> I have been having difficulty figuring out how I might go about doing 
> a difference of many lines and polygons.  For context, I have many (> 
> 3,000,000) contour lines that I want to difference based on many (> 
> 180,000) lakes.  My goal is to remove the sections of the lines that 
> are within the polygons.  I know that I can use ST_Difference(aLine, 
> aPolygon) to do this for one feature, but, what would the syntax look 
> like to go though all my line and all my polygons and return just the 
> lines/parts of lines that are outside of the polygons (and do this in 
> as quickly and resource-friendly way as possible)?  The machine I am 
> working on is a relatively well-powered desktop (quad core, 6GB RAM).
>
> Thanks for any insight/suggestions,
>
> Christy



More information about the postgis-users mailing list