[postgis-users] Union of 7 datasets

Andreas Laggner andreas.laggner at fal.de
Thu Aug 30 05:00:50 PDT 2007


Hi regina,

Am i right when i think that this Mail only reaches you if i klick on 
"reply" and not as usual on "reply all"??
i just do not know how to clear my debt  :-)
so tell me if i can do something for you.......

Greetings      Andreas


Obe, Regina schrieb:
>
> Andreas,
>
> You would use the SQL UNION predicate like shown below (actually 
> slight correction - it is speedier to use UNION ALL especially when 
> you know there will not be dupiclates since it saves the processing of 
> sorting to get a distinct UNION does an implicit distinct)  - so I 
> have corrected below.
>
> -  - I happened to insert comments in between which may have confused 
> you, but you should be able to run the whole thing as one statement or 
> if you prefer because of speed issues run each insert separately.
>
> So  would be
> INSERT INTO sometable(field1, field2,field3,field4, newgeom)
>  SELECT udo.field1, udo.field2, kai.field3, kai.field4, 
> geomunion(udo.the_geom, kai.the_geom) AS newgeom
>   FROM udo INNER JOIN kai ON (udo.the_geom && kai.the_geom AND 
> intersects(udo.the_geom, kai.the_geom))
>   UNION ALL
>  -- the second select gives you udos that have no kais - your 15
>   SELECT udo.field1, udo.field2, NULL As field3, NULL As field4, 
> udo.the_geom AS newgeom
>   FROM udo LEFT JOIN kai ON (udo.the_geom && kai.the_geom AND 
> intersects(udo.the_geom, kai.the_geom))
>   WHERE kai.the_geom IS NULL
>  UNION ALL
>  -- and the 3rd gives you kais that have no udos. - your 40
>   SELECT NULL As field1, null As field2, kai.field3, kai.field4, 
> kai.the_geom AS newgeom
>   FROM kai LEFT JOIN udo ON (udo.the_geom && kai.the_geom AND 
> intersects(udo.the_geom, kai.the_geom))
>   WHERE udo.the_geom IS NULL;
>
>  
>
> ----If FULL JOIN were to work (which in theory it should, but doesn't 
> seem to with Postgis functions  (HINT HINT: would be nice if 
> that worked and can be easily fixed (but sadly I think the issue is 
> deeper than Postgis and Geos) - you could write the above much simpler as)
>
> INSERT INTO sometable(field1, field2,field3,field4, newgeom)
>  SELECT udo.field1, udo.field2, kai.field3, kai.field4, 
> COALESCE(geomunion(udo.the_geom, kai.the_geom), udo.the_geom, 
> kai.the_geom) AS newgeom
>   FROM udo FULL JOIN kai ON (udo.the_geom && kai.the_geom AND 
> intersects(udo.the_geom, kai.the_geom))
>
> Now if you have a lot of these and the tables are very similar in 
> nature and named in a predictable way then the way I usually handle it 
> is to write a pgsql function that dynamically generates the SQL 
> statement to execute either via a FOR loop and then executes the built 
> SQL or set of SQL statements or you could do a similar thing in some 
> scripted language like perl or php.
>
> I take it SQL and pgsql and all that is fairly new to you so it might 
> be worthwhile (even though its a lot of typing) to do it the long cut 
> and paste way if nothing more than an exercise to get a feel of how 
> this all works and visualize the patterns at play.
>
> I don't get the sense that I comprehend your full problem.   
> Unfortunately I don't have any experience with ArcGIS/ArcView ways of 
> doing things, so I'm not quite sure if there is an equivalent way in 
> PostGIS/PostgreSQL  world of doing the same kind of thing and what 
> exactly that thing is you are doing in ArcGIS.
>
> Union has 3 meanings in PostGIS/PostgreSQL (actually stuff your 
> favorite spatial/DB here - all non-trivial spatial relational dbs 
> behave more or less the same)  
>
> 1) unioning of record sets (standard ANSI SQL UNION/UNION ALL) - which 
> is simply a way of stringing together a bunch of selects into a 
> single result set as shown above
>
> 2)  unioning of 2 geometry fields like shown above with geomunion
>
> 3) Aggregate  geomunion - aggregate variant of the above geomunion 
> function that groups and unions a whole setof geometries together but 
> requires you are grouping by some field or set of fields).
>
> I must also mention there is collect (non-aggregate and aggregate 
> function) which often times is just as effective as the geomunion and 
> in general much faster processor wise.
>
> Hope that helps,
>
> Regina
>
> -----Original Message-----
> From: postgis-users-bounces at postgis.refractions.net 
> [mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of 
> Andreas Laggner
> Sent: Tuesday, August 28, 2007 9:13 AM
> To: PostGIS Users Discussion
> Subject: Re: [postgis-users] Union of 7 datasets
>
> Obe, Regina schrieb:
> >  Sounds like you would have to go with a full join type thing with 
> workaround I described below (last example).  So If I understand you 
> correctly then something like this - 
> >
> > --the first select gives you those records in both tables your 25 udokai
> >  SELECT udo.field1, udo.field2, kai.field3, kai.field4, 
> geomunion(udo.the_geom, kai.the_geom) AS newgeom
> >  FROM udo INNER JOIN kai ON (udo.the_geom && kai.the_geom AND 
> intersects(udo.the_geom, kai.the_geom))
> >  UNION
> > -- the second select gives you udos that have no kais - your 15
> >  SELECT udo.field1, udo.field2, NULL As field3, NULL As field4, 
> udo.the_geom AS newgeom
> >  FROM udo LEFT JOIN kai ON (udo.the_geom && kai.the_geom AND 
> intersects(udo.the_geom, kai.the_geom))
> >  WHERE kai.the_geom IS NULL
> >  UNION
> > -- and the 3rd gives you kais that have no udos. - your 40
> >  SELECT nul As field1, null As field2, kai.field3, kai.field4, 
> kai.the_geom AS newgeom
> >  FROM kai LEFT JOIN udo ON (udo.the_geom && kai.the_geom AND 
> intersects(udo.the_geom, kai.the_geom))
> >  WHERE udo.the_geom IS NULL
> >
> > Unioned together you should get
> > 25 + 15 + 40 = ? 80
> >
> > Hope that helps,
> > Regina
> >
> >  
> UFF - that seems to be a complex question, a comprehensive operation and
> a lot to type too!!! Thanks a million - i would have spent many time to
> find out this query by myself! Ok..... how can i perform your "UNION": I
> would just insert my selects one after another in one new table - will
> that work well?
> I want to aggregate around 20 datasets this way!? Do you know a method
> to operate with more datasets (for example 7) in a effektiv manner and
> not to do this three selects 6 times?
> AND: Am i totally wrong with my aims? It seems to me that is an exotic
> think to do with PostGis, but our projects/problems require to aggregate
> datasets in almost all cases (around 90%) and that is exactly what one
> of the basic functions "Union" in ArcView already carried out ten years
> ago. So i would exspect there is a function to perform this operation
> easier!? .....i just wonder.....
>
>
> cheers      Andreas
>
> > -----Original Message-----
> > From: postgis-users-bounces at postgis.refractions.net 
> [mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of 
> Andreas Laggner
> > Sent: Tuesday, August 28, 2007 7:28 AM
> > To: PostGIS Users Discussion
> > Subject: Re: [postgis-users] Union of 7 datasets
> >
> > Obe, Regina schrieb:
> >  
> >> BASIC TRICK:  If you want to get all records with no matching 
> including those that match - put what you would normally put in your 
> WHERE clause in the JOIN clause and use a LEFT JOIN.
> >>    
> > ok - it seems to me i need some coaching......Yes, I want to get all
> > records with no matching including those that match, but i also want to
> > dissect the polygons that matches. I will try to explain again just to
> > be sure you understood my aims: I have 20 polygons in udo and 50 in kai.
> > 10 from kai are intersecting 5 from udo to 25 new polygons i call udokai
> > (because they have attributes from udo AND kai). My result should have:
> > 15 polygons with attributes only from udo, 40 polygons with attributes
> > only from kai AND 25 with attributes from udo and kai! In most cases i
> > want to do such a operation because i want to aggregate (spatial
> > correct) different datasets!
> > Which example from you fits best for this aim? I did not find any
> > information on how left join works on the postgis or postgresql 
> reference...
> >  
> >> 
> >>
> >> Unfortunately as I have come across before if you need an either or 
> (if in table 1 or table 2 - ideally you would use a FULL JOIN but for 
> some reason Postgres chokes when you use postgis functions in the FULL 
> JOIN clause for the cases I have tried).  In that case you need a 
> workaround using a set of UNIONS.
> >> 
> --------------------------------------------------------------------------
> >> Simplest case - get all records in g1 one or union of g1 and g2 
> that intersect
> >> NOTE: COALESCE is an ANSI SQL function that will return the first 
> non-null - when you do a geomunion of a geometry and null you get null 
> which is why we need COALESCE
> >>
> >> SELECT g1.field1, g1.field2, COALESCE(geomunion(g1.the_geom, 
> g2.the_geom), g1.the_geom) AS newgeom
> >> FROM g1 LEFT JOIN g2 ON (g1.the_geom && g2.the_geom AND 
> intersects(g1.the_geom, g2.the_geom))
> >>
> >> 
> -----------------------------------------------------------------------------------------
> >> Either of case - get all geometries in g1 or g2 or union if there 
> is a match - workaround for full joins not working right
> >>
> >> SELECT g1.field1, g1.field2, geomunion(g1.the_geom, g2.the_geom) AS 
> newgeom
> >> FROM g1 INNER JOIN g2 ON (g1.the_geom && g2.the_geom AND 
> intersects(g1.the_geom, g2.the_geom))
> >> UNION
> >> SELECT g1.field1, g1.field2, g1.the_geom AS newgeom
> >> FROM g1 LEFT JOIN g2 ON (g1.the_geom && g2.the_geom AND 
> intersects(g1.the_geom, g2.the_geom))
> >> WHERE g2.the_geom IS NULL
> >> UNION
> >> SELECT g1.field1, g1.field2, g2.the_geom AS newgeom
> >> FROM g2 LEFT JOIN g1 ON (g1.the_geom && g2.the_geom AND 
> intersects(g1.the_geom, g2.the_geom))
> >> WHERE g1.the_geom IS NULL
> >>
> >>
> >>
> >> Hope that helps,
> >> Regina
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> -----Original Message-----
> >> From: postgis-users-bounces at postgis.refractions.net 
> [mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of 
> Andreas Laggner
> >> Sent: Thursday, August 23, 2007 11:32 AM
> >> To: PostGIS Users Discussion
> >> Subject: Re: [postgis-users] Union of 7 datasets
> >>
> >> Obe, Regina schrieb:
> >>  
> >>    
> >>> Oh the g1 g2 .. was just for example - I don't actually call my 
> tables meaningless names like that. 
> >>>
> >>> You should be doing a join on something or have a where clause 
> unless one of your tables has only one record.  Otherwise you are 
> doing what is called a CROSS JOIN (cartesian product)  which gives you 
> an nxm records where n is the number of records in your first table 
> and m is the number in second table.  This is generally a big NO NO.  
> In certain rare cases you do want to do something like that, but is 
> usually the exception.
> >>>
> >>>  
> >>>    
> >>>      
> >> I think the records in my targed table must be added (more or less) and
> >> not multiplied! My Aim is a table that contains the areas of all the 7
> >> sourcetables and the information which refuges are inside and wich not.
> >> Perhaps i must use the intersection!? If i do my query with a gist like
> >> this: where t1.the_geom && t2.the_geom; than the operation is very fast
> >> (about one minute) but i only have the Polygons covered by BOTH
> >> datasets, and i want to have as well those, which are covered by one
> >> dataset only!! But my operation without the where clause runs for 4
> >> hours now - that shows me there is something wrong  ;-)
> >>  
> >>    
> >>> Its hard for me to tell if you need a cartesian product in this 
> case since I'm not quite sure what for example nature and biosphere 
> represent.  I would guess that is wrong and you should first figure 
> out which sets of say nature records you need to geomunion with 
> biosphere and then join by that field or set of fields.
> >>>
> >>> It would help a bit if you could provide some sample questions you 
> expect to answer with your statistical analysis.  My guess is you may 
> be better off with more than one table.
> >>>
> >>>  
> >>>    
> >>>      
> >> Sample question: give me all areas (all polygons) from germany where
> >> landuse=arable land and soils=good and precipitation>600 and any (of 7)
> >> reserves and so on.......
> >> I need the values in my table to calculate the potential yield or other
> >> things...
> >> And i want to analyse such questions with a statistical software (SAS),
> >> so it seems to me i need one table to import in SAS (or to query from
> >> SAS directly to the postgresql).
> >>
> >> Thanks for your help, i will be back in my office in 
> Monday.......Andreas
> >>
> >>  
> >>    
> >>> Which structure is best really boils down to what questions you 
> hope to answer because one approach may make one question easy and 
> fast and another question slow and cumbersome.
> >>>
> >>> Hope that helps,
> >>> Regina
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> -----Original Message-----
> >>> From: postgis-users-bounces at postgis.refractions.net 
> [mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of 
> Andreas Laggner
> >>> Sent: Thursday, August 23, 2007 10:04 AM
> >>> To: PostGIS Users Discussion
> >>> Subject: Re: [postgis-users] Union of 7 datasets
> >>>
> >>> Obe, Regina schrieb:
> >>>  
> >>>    
> >>>      
> >>>> Andreas,
> >>>>
> >>>> It would help to know what your table structure looks like and 
> why do you want to put them all in a single geometry?
> >>>>  
> >>>>    
> >>>>      
> >>>>        
> >>> My table structures are a little bit different. I want to have 
> them in a
> >>> single geometry to intersect them with other data and built a large
> >>> table to run statistics over it (production site analysis over 
> germany).
> >>>  
> >>>    
> >>>      
> >>>> I'm imaging you are you doing something like
> >>>>
> >>>> SELECT g1.somefield, geomunion(geomunion(g1.the_geom, 
> g2.the_geom), g3.the_geom)
> >>>> FROM g1 INNER JOIN g2 on g1.somefield = g2.somefield INNER JOIN 
> g3 on g2.somefield = g3.somefield
> >>>> GROUP BY g1.somefield
> >>>>
> >>>>  
> >>>>    
> >>>>      
> >>>>        
> >>> That´s an interesting method with inner join..why go you call your
> >>> tables g1. g2. and so on?
> >>> That´s my method i am using right now (geomunion 1 to 3 from 6), seems
> >>> to be a pedestrian method :-(
> >>>
> >>> create table natura2000
> >>>     (ffh_name character varying(80), ffh_land character 
> varying(3), ffh
> >>> smallint, ffh_id smallint,
> >>>      spa_name character varying(80), spa_land character 
> varying(3), spa
> >>> smallint, spa_id smallint) with oids;
> >>> select
> >>> 
> addgeometrycolumn('','natura2000','the_geom','31467','MULTIPOLYGON',2);
> >>> alter table natura2000 drop constraint enforce_geotype_the_geom;
> >>> insert into natura2000
> >>>     select
> >>> 
> t1.ffh_name,t1.ffh_land,t1.ffh,t1.ffh_id,t2.spa_name,t2.spa_land,t2.spa,t2.spa_id,
> >>>             geomunion(t1.the_geom, t2.the_geom)
> >>>             from ffh_rep t1, spa_rep t2;
> >>>
> >>> create table sg71
> >>>     (ffh_name character varying(80), ffh_land character 
> varying(3), ffh
> >>> smallint, ffh_id smallint,
> >>>      spa_name character varying(80), spa_land character 
> varying(3), spa
> >>> smallint, spa_id smallint,
> >>>      bio_name character varying(70), bio smallint, bio_id 
> smallint) with
> >>> oids;
> >>> select 
> addgeometrycolumn('','sg71','the_geom','31467','MULTIPOLYGON',2);
> >>> alter table sg71 drop constraint enforce_geotype_the_geom;
> >>> insert into sg71
> >>>     select t1.ffh_name, t1.ffh_land, t1.ffh, t1.ffh_id, t1.spa_name,
> >>> t1.spa_land, t1.spa, t1.spa_id,
> >>>             t2.name,t2.bio,t2.bio_id,geomunion(t1.the_geom, 
> t2.the_geom)
> >>>             from natura2000 t1, biosphere t2;
> >>>
> >>> create table sg72
> >>>        (ffh_name character varying(80), ffh_land character varying(3),
> >>> ffh smallint, ffh_id smallint,
> >>>     spa_name character varying(80), spa_land character varying(3), spa
> >>> smallint, spa_id smallint,
> >>>     bio_name character varying(70), bio smallint, bio_id smallint,
> >>>     np_name character varying(60), np smallint, np_id smallint) 
> with oids;
> >>> select 
> addgeometrycolumn('','sg72','the_geom','31467','MULTIPOLYGON',2);
> >>> alter table sg72 drop constraint enforce_geotype_the_geom;
> >>> insert into sg72
> >>>     select t1.ffh_name, t1.ffh_land, t1.ffh, t1.ffh_id, t1.spa_name,
> >>> t1.spa_land, t1.spa, t1.spa_id,
> >>>             t1.bio_name,t1.bio,t1.bio_id,t2.np_name,t2.np,t2.np_id,
> >>>             geomunion(t1.the_geom, t2.the_geom)
> >>>             from sg71 t1, np t2;
> >>> AND SO ON......
> >>>  
> >>>    
> >>>      
> >>>> or
> >>>>
> >>>> SELECT g1.somefield, geomunion(gt.the_geom)
> >>>> FROM (SELECT somefield, the_geom FROM g1 UNION SELECT somefield, 
> the_geom FROM g2 ...) gt
> >>>> GROUP BY gt.somefield
> >>>>
> >>>>
> >>>> If I have 7 different tables that have pretty much the same 
> structure, but for logistical or other technical reasons (such as each 
> has additional attributes distinct from one another), I need to keep 
> them as separate tables, then I usually use inherited tables for that. 
> That way when I need to join all datasets at once, I can simply query 
> the parent table and it will automatically drill down to the child 
> tables. Not sure if that helps more than it confuses your situation.
> >>>>
> >>>> Then instead of the above I can simply do
> >>>>
> >>>> SELEG myparenttable.somefield, geomunion(myparenttable.the_geom)
> >>>> FROM myparenttable
> >>>> GROUP by gh.somefield
> >>>>
> >>>>
> >>>>  
> >>>>    
> >>>>      
> >>>>        
> >>> ok - i have to think about your suggestions......that´s my second week
> >>> with postgis.
> >>> Can you tell me from my SQL-Statements which method will be best? So i
> >>> try to understand that one.....
> >>>
> >>> Thanks for your reply!!!
> >>>  
> >>>    
> >>>      
> >>>> Hope that helps,
> >>>> Regina
> >>>>
> >>>> -----Original Message-----
> >>>> From: postgis-users-bounces at postgis.refractions.net 
> [mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of 
> Andreas Laggner
> >>>> Sent: Thursday, August 23, 2007 9:11 AM
> >>>> To: PostGis_Mailinglist
> >>>> Subject: [postgis-users] Union of 7 datasets
> >>>>
> >>>> Hi users,
> >>>>
> >>>> i want to put together 7 datasets to have all the different 
> refuges in
> >>>> one table (and in one geometry). Am i doing right with 6 times 
> geomunion
> >>>> (that´s much to type with all the attributes) or is there a more
> >>>> effective way?
> >>>>
> >>>> cheers Andreas
> >>>>
> >>>>  
> >>>>    
> >>>>      
> >>>>        
> >>>  
> >>>    
> >>>      
> >>  
> >>    
> >
> >
> >  
>
>
> --
> Dipl. Geoökologe Andreas Laggner
> Institut für Ländliche Räume (LR)
> Bundesforschungsanstalt für Landwirtschaft (FAL)
>
> Institute of Rural Studies
> Federal Agricultural Research Centre (FAL)
>
> Bundesallee 50
> D-38116 Braunschweig
>
> Tel.: (+49) (0)531 596 5515
> Fax: (+49) (0)531 596 5599
> E-mail: andreas.laggner at fal.de
> Homepage: http://www.lr.fal.de/
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
> ------------------------------------------------------------------------
>
> *The substance of this message, including any attachments, may be 
> confidential, legally privileged and/or exempt from disclosure 
> pursuant to Massachusetts law. It is intended solely for the 
> addressee. If you received this in error, please contact the sender 
> and delete the material from any computer. *
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>   


-- 
Dipl. Geoökologe Andreas Laggner
Institut für Ländliche Räume (LR)
Bundesforschungsanstalt für Landwirtschaft (FAL)

Institute of Rural Studies
Federal Agricultural Research Centre (FAL)

Bundesallee 50
D-38116 Braunschweig

Tel.: (+49) (0)531 596 5515
Fax: (+49) (0)531 596 5599
E-mail: andreas.laggner at fal.de
Homepage: http://www.lr.fal.de/ 




More information about the postgis-users mailing list