[postgis-users] Overlap still not works

Andreas Laggner andreas.laggner at fal.de
Fri Nov 16 00:24:32 PST 2007


Hi,

I try to overlap huge files (up to 8 million polygons in a single file) 
since two months now. Two approaches were used, they both worked fine 
with small files (about some hundred polygons each file), but none ran 
through with the bigger ones! Right now i have two tables, the first one 
"sg7rep" with only 883396 rows and the second one "gembrd" with only 
15009 rows (these are the smallest ones from my sample).

The approach from Regina Obe. The problems always accur in the last part 
where we try to catch the remaining pieces from the intersection (that 
did not intersect) with the function difference. If i use 
ST_UNION(the_geom) i get this Error:
TopologyException: found non-noded intersection between 3.59058e+06 
5.25145e+06, 3.59058e+06 5.25145e+06 and 3.59058e+06 5.25145e+06, 
3.59058e+06 5.25145e+06 3.59058e+06 5.25145e+06
psql:/disk1/samba/PostGis/uba_sg7_gem_regina_stunion.sql:242: ERROR:  
GEOS union() threw an error!

I also tried to use st_buffer(st_collect(the_geom),0.0) instead of 
ST_UNION(the_geom) and the Server terminated abnormal. In both cases i 
receive exactly 981149 rows before the error accurs!

In the result you can see in evidence that the first three parts from 
this query worked fine: I have all polygons from both tables that do not 
intersect at all and i have the intersection parts from polygons that 
intersects - just the remaining pieces from the intersection are missing.

create table uba_sg7_gem_stunion
(sg7gid integer,
  sg7id integer,
  ffh_name character varying(80),
  ffh_land character varying(3),
  ffh smallint,
  ffh_id smallint,
  spa_name character varying(80),
  spa_land character varying(3),
  spa smallint,
  spa_id smallint,
  lsg_name character varying(80),
  lsg smallint,
  lsg_id smallint,
  nsg_name character varying(80),
  nsg smallint,
  nsg_id smallint,
  bio_name character varying(80),
  bio smallint,
  bio_id smallint,
  ntp_name character varying(80),
  ntp smallint,
  ntp_id smallint,
  np_name character varying(80),
  np smallint,
  np_id smallint,

  gemgid integer,
  gemid integer,
  idgem10 character(10),
  idgem8z integer,
  idgem8 character(8));

select 
addgeometrycolumn('','uba_sg7_gem_stunion','the_geom','31467','MULTIPOLYGON',2);
alter table uba_sg7_gem_stunion drop constraint enforce_geotype_the_geom;

INSERT INTO uba_sg7_gem_stunion

-- Get sg7rep polygons that do not intersect any gembrd
SELECT
sg7rep.gid,
sg7rep.sg7id,
sg7rep.ffh_name,
sg7rep.ffh_land,
sg7rep.ffh,
sg7rep.ffh_id,
sg7rep.spa_name,
sg7rep.spa_land,
sg7rep.spa,
sg7rep.spa_id,
sg7rep.lsg_name,
sg7rep.lsg,
sg7rep.lsg_id,
sg7rep.nsg_name,
sg7rep.nsg,
sg7rep.nsg_id,
sg7rep.bio_name,
sg7rep.bio,
sg7rep.bio_id,
sg7rep.ntp_name,
sg7rep.ntp,
sg7rep.ntp_id,
sg7rep.np_name,
sg7rep.np,
sg7rep.np_id,

null as gid,
null as gemid,
null as idgem10,
null as idgem8z,
null as idgem8,

sg7rep.the_geom

FROM sg7rep LEFT JOIN gembrd ON 
ST_Intersects(sg7rep.the_geom,gembrd.the_geom)
WHERE gembrd.the_geom is NULL
UNION

--Get gembrd polygons that do not intersect any sg7reps
SELECT
null as gid,
null as sg7id,
null as ffh_name,
null as ffh_land,
null as ffh,
null as ffh_id,
null as spa_name,
null as spa_land,
null as spa,
null as spa_id,
null as lsg_name,
null as lsg,
null as lsg_id,
null as nsg_name,
null as nsg,
null as nsg_id,
null as bio_name,
null as bio,
null as bio_id,
null as ntp_name,
null as ntp,
null as ntp_id,
null as np_name,
null as np,
null as np_id,

gembrd.gid,
gembrd.gemid,
gembrd.idgem10,
gembrd.idgem8z,
gembrd.idgem8,

gembrd.the_geom

FROM gembrd LEFT JOIN sg7rep ON 
ST_Intersects(sg7rep.the_geom,gembrd.the_geom)
WHERE sg7rep.the_geom IS NULL
UNION

--Get gembrd,sg7rep polygons that intersect, but only return that 
portion of
--the polygon that intersects
SELECT
sg7rep.gid,
sg7rep.sg7id,
sg7rep.ffh_name,
sg7rep.ffh_land,
sg7rep.ffh,
sg7rep.ffh_id,
sg7rep.spa_name,
sg7rep.spa_land,
sg7rep.spa,
sg7rep.spa_id,
sg7rep.lsg_name,
sg7rep.lsg,
sg7rep.lsg_id,
sg7rep.nsg_name,
sg7rep.nsg,
sg7rep.nsg_id,
sg7rep.bio_name,
sg7rep.bio,
sg7rep.bio_id,
sg7rep.ntp_name,
sg7rep.ntp,
sg7rep.ntp_id,
sg7rep.np_name,
sg7rep.np,
sg7rep.np_id,

gembrd.gid,
gembrd.gemid,
gembrd.idgem10,
gembrd.idgem8z,
gembrd.idgem8,

ST_Intersection(sg7rep.the_geom,gembrd.the_geom) As the_geom
FROM gembrd INNER JOIN sg7rep ON 
ST_Intersects(sg7rep.the_geom,gembrd.the_geom);

/**Get the remaining pieces - In theory all that should be left are 
parts of polygons
 that intersect but where these parts are not part of the intersecting 
regions
(e.g. we haven't accounted for them yet - which is
 why we are joining with our already built table). 
 I think you can do without the F_ID, S_IDs and group by which I am 
assuming
to be primary keys of S and F tables but I think your join will be 
slower since
you will be comparing against one big geometry.**/
INSERT INTO uba_sg7_gem_stunion
SELECT
sg7rep.gid,
sg7rep.sg7id,
sg7rep.ffh_name,
sg7rep.ffh_land,
sg7rep.ffh,
sg7rep.ffh_id,
sg7rep.spa_name,
sg7rep.spa_land,
sg7rep.spa,
sg7rep.spa_id,
sg7rep.lsg_name,
sg7rep.lsg,
sg7rep.lsg_id,
sg7rep.nsg_name,
sg7rep.nsg,
sg7rep.nsg_id,
sg7rep.bio_name,
sg7rep.bio,
sg7rep.bio_id,
sg7rep.ntp_name,
sg7rep.ntp,
sg7rep.ntp_id,
sg7rep.np_name,
sg7rep.np,
sg7rep.np_id,

null as gid,
null as gemid,
null as idgem10,
null as idgem8z,
null as idgem8,

ST_Difference(sg7rep.the_geom, uba_sg7_gem_stunion.the_geom) As the_geom
FROM sg7rep INNER JOIN (SELECT sg7id, ST_UNION(the_geom) As the_geom
    FROM uba_sg7_gem_stunion GROUP BY sg7id) As uba_sg7_gem_stunion ON 
sg7rep.sg7id = uba_sg7_gem_stunion.sg7id
    AND ST_Intersects(sg7rep.the_geom, uba_sg7_gem_stunion.the_geom)
WHERE 
ST_IsEmpty(ST_Difference(sg7rep.the_geom,uba_sg7_gem_stunion.the_geom)) 
= false
UNION

SELECT
null as gid,
null as sg7id,
null as ffh_name,
null as ffh_land,
null as ffh,
null as ffh_id,
null as spa_name,
null as spa_land,
null as spa,
null as spa_id,
null as lsg_name,
null as lsg,
null as lsg_id,
null as nsg_name,
null as nsg,
null as nsg_id,
null as bio_name,
null as bio,
null as bio_id,
null as ntp_name,
null as ntp,
null as ntp_id,
null as np_name,
null as np,
null as np_id,

gembrd.gid,
gembrd.gemid,
gembrd.idgem10,
gembrd.idgem8z,
gembrd.idgem8,

ST_Difference(gembrd.the_geom,uba_sg7_gem_stunion.the_geom) As the_geom
FROM gembrd INNER JOIN  (SELECT gemid, ST_UNION(the_geom) As the_geom
    FROM uba_sg7_gem_stunion GROUP BY gemid) As uba_sg7_gem_stunion ON 
gembrd.gemid = uba_sg7_gem_stunion.gemid
    AND ST_Intersects(gembrd.the_geom,uba_sg7_gem_stunion.the_geom)
WHERE ST_IsEmpty(ST_Difference(gembrd.the_geom, 
uba_sg7_gem_stunion.the_geom)) = false;


The second approach comes from Kevin 
(http://postgis.refractions.net/support/wiki/index.php?ExamplesOverlayTables). 
It runs for 24 hours now and still working on the first part (CREATE 
TABLE all_lines) - while Reginas approach finished with the error after 
2 hours. So even if Kevins approach will work it is much to slow!

Any ideas what else i can do to get my results???

cheers      Andreas


-- 
Dipl. Geoökologe Andreas Laggner
Institut für Ländliche Räume (LR)
Bundesforschungsanstalt für Landwirtschaft (FAL)

Institute of Rural Studies
Federal Agricultural Research Centre (FAL)

Bundesallee 50
D-38116 Braunschweig

Tel.: (+49) (0)531 596 5515
Fax: (+49) (0)531 596 5599
E-mail: andreas.laggner at fal.de
Homepage: http://www.lr.fal.de/ 




More information about the postgis-users mailing list