[postgis-users] Union of 7 datasets
Andreas Laggner
andreas.laggner at fal.de
Thu Aug 30 05:00:50 PDT 2007
Hi regina,
Am i right when i think that this Mail only reaches you if i klick on
"reply" and not as usual on "reply all"??
i just do not know how to clear my debt :-)
so tell me if i can do something for you.......
Greetings Andreas
Obe, Regina schrieb:
>
> Andreas,
>
> You would use the SQL UNION predicate like shown below (actually
> slight correction - it is speedier to use UNION ALL especially when
> you know there will not be dupiclates since it saves the processing of
> sorting to get a distinct UNION does an implicit distinct) - so I
> have corrected below.
>
> - - I happened to insert comments in between which may have confused
> you, but you should be able to run the whole thing as one statement or
> if you prefer because of speed issues run each insert separately.
>
> So would be
> INSERT INTO sometable(field1, field2,field3,field4, newgeom)
> SELECT udo.field1, udo.field2, kai.field3, kai.field4,
> geomunion(udo.the_geom, kai.the_geom) AS newgeom
> FROM udo INNER JOIN kai ON (udo.the_geom && kai.the_geom AND
> intersects(udo.the_geom, kai.the_geom))
> UNION ALL
> -- the second select gives you udos that have no kais - your 15
> SELECT udo.field1, udo.field2, NULL As field3, NULL As field4,
> udo.the_geom AS newgeom
> FROM udo LEFT JOIN kai ON (udo.the_geom && kai.the_geom AND
> intersects(udo.the_geom, kai.the_geom))
> WHERE kai.the_geom IS NULL
> UNION ALL
> -- and the 3rd gives you kais that have no udos. - your 40
> SELECT NULL As field1, null As field2, kai.field3, kai.field4,
> kai.the_geom AS newgeom
> FROM kai LEFT JOIN udo ON (udo.the_geom && kai.the_geom AND
> intersects(udo.the_geom, kai.the_geom))
> WHERE udo.the_geom IS NULL;
>
>
>
> ----If FULL JOIN were to work (which in theory it should, but doesn't
> seem to with Postgis functions (HINT HINT: would be nice if
> that worked and can be easily fixed (but sadly I think the issue is
> deeper than Postgis and Geos) - you could write the above much simpler as)
>
> INSERT INTO sometable(field1, field2,field3,field4, newgeom)
> SELECT udo.field1, udo.field2, kai.field3, kai.field4,
> COALESCE(geomunion(udo.the_geom, kai.the_geom), udo.the_geom,
> kai.the_geom) AS newgeom
> FROM udo FULL JOIN kai ON (udo.the_geom && kai.the_geom AND
> intersects(udo.the_geom, kai.the_geom))
>
> Now if you have a lot of these and the tables are very similar in
> nature and named in a predictable way then the way I usually handle it
> is to write a pgsql function that dynamically generates the SQL
> statement to execute either via a FOR loop and then executes the built
> SQL or set of SQL statements or you could do a similar thing in some
> scripted language like perl or php.
>
> I take it SQL and pgsql and all that is fairly new to you so it might
> be worthwhile (even though its a lot of typing) to do it the long cut
> and paste way if nothing more than an exercise to get a feel of how
> this all works and visualize the patterns at play.
>
> I don't get the sense that I comprehend your full problem.
> Unfortunately I don't have any experience with ArcGIS/ArcView ways of
> doing things, so I'm not quite sure if there is an equivalent way in
> PostGIS/PostgreSQL world of doing the same kind of thing and what
> exactly that thing is you are doing in ArcGIS.
>
> Union has 3 meanings in PostGIS/PostgreSQL (actually stuff your
> favorite spatial/DB here - all non-trivial spatial relational dbs
> behave more or less the same)
>
> 1) unioning of record sets (standard ANSI SQL UNION/UNION ALL) - which
> is simply a way of stringing together a bunch of selects into a
> single result set as shown above
>
> 2) unioning of 2 geometry fields like shown above with geomunion
>
> 3) Aggregate geomunion - aggregate variant of the above geomunion
> function that groups and unions a whole setof geometries together but
> requires you are grouping by some field or set of fields).
>
> I must also mention there is collect (non-aggregate and aggregate
> function) which often times is just as effective as the geomunion and
> in general much faster processor wise.
>
> Hope that helps,
>
> Regina
>
> -----Original Message-----
> From: postgis-users-bounces at postgis.refractions.net
> [mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of
> Andreas Laggner
> Sent: Tuesday, August 28, 2007 9:13 AM
> To: PostGIS Users Discussion
> Subject: Re: [postgis-users] Union of 7 datasets
>
> Obe, Regina schrieb:
> > Sounds like you would have to go with a full join type thing with
> workaround I described below (last example). So If I understand you
> correctly then something like this -
> >
> > --the first select gives you those records in both tables your 25 udokai
> > SELECT udo.field1, udo.field2, kai.field3, kai.field4,
> geomunion(udo.the_geom, kai.the_geom) AS newgeom
> > FROM udo INNER JOIN kai ON (udo.the_geom && kai.the_geom AND
> intersects(udo.the_geom, kai.the_geom))
> > UNION
> > -- the second select gives you udos that have no kais - your 15
> > SELECT udo.field1, udo.field2, NULL As field3, NULL As field4,
> udo.the_geom AS newgeom
> > FROM udo LEFT JOIN kai ON (udo.the_geom && kai.the_geom AND
> intersects(udo.the_geom, kai.the_geom))
> > WHERE kai.the_geom IS NULL
> > UNION
> > -- and the 3rd gives you kais that have no udos. - your 40
> > SELECT nul As field1, null As field2, kai.field3, kai.field4,
> kai.the_geom AS newgeom
> > FROM kai LEFT JOIN udo ON (udo.the_geom && kai.the_geom AND
> intersects(udo.the_geom, kai.the_geom))
> > WHERE udo.the_geom IS NULL
> >
> > Unioned together you should get
> > 25 + 15 + 40 = ? 80
> >
> > Hope that helps,
> > Regina
> >
> >
> UFF - that seems to be a complex question, a comprehensive operation and
> a lot to type too!!! Thanks a million - i would have spent many time to
> find out this query by myself! Ok..... how can i perform your "UNION": I
> would just insert my selects one after another in one new table - will
> that work well?
> I want to aggregate around 20 datasets this way!? Do you know a method
> to operate with more datasets (for example 7) in a effektiv manner and
> not to do this three selects 6 times?
> AND: Am i totally wrong with my aims? It seems to me that is an exotic
> think to do with PostGis, but our projects/problems require to aggregate
> datasets in almost all cases (around 90%) and that is exactly what one
> of the basic functions "Union" in ArcView already carried out ten years
> ago. So i would exspect there is a function to perform this operation
> easier!? .....i just wonder.....
>
>
> cheers Andreas
>
> > -----Original Message-----
> > From: postgis-users-bounces at postgis.refractions.net
> [mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of
> Andreas Laggner
> > Sent: Tuesday, August 28, 2007 7:28 AM
> > To: PostGIS Users Discussion
> > Subject: Re: [postgis-users] Union of 7 datasets
> >
> > Obe, Regina schrieb:
> >
> >> BASIC TRICK: If you want to get all records with no matching
> including those that match - put what you would normally put in your
> WHERE clause in the JOIN clause and use a LEFT JOIN.
> >>
> > ok - it seems to me i need some coaching......Yes, I want to get all
> > records with no matching including those that match, but i also want to
> > dissect the polygons that matches. I will try to explain again just to
> > be sure you understood my aims: I have 20 polygons in udo and 50 in kai.
> > 10 from kai are intersecting 5 from udo to 25 new polygons i call udokai
> > (because they have attributes from udo AND kai). My result should have:
> > 15 polygons with attributes only from udo, 40 polygons with attributes
> > only from kai AND 25 with attributes from udo and kai! In most cases i
> > want to do such a operation because i want to aggregate (spatial
> > correct) different datasets!
> > Which example from you fits best for this aim? I did not find any
> > information on how left join works on the postgis or postgresql
> reference...
> >
> >>
> >>
> >> Unfortunately as I have come across before if you need an either or
> (if in table 1 or table 2 - ideally you would use a FULL JOIN but for
> some reason Postgres chokes when you use postgis functions in the FULL
> JOIN clause for the cases I have tried). In that case you need a
> workaround using a set of UNIONS.
> >>
> --------------------------------------------------------------------------
> >> Simplest case - get all records in g1 one or union of g1 and g2
> that intersect
> >> NOTE: COALESCE is an ANSI SQL function that will return the first
> non-null - when you do a geomunion of a geometry and null you get null
> which is why we need COALESCE
> >>
> >> SELECT g1.field1, g1.field2, COALESCE(geomunion(g1.the_geom,
> g2.the_geom), g1.the_geom) AS newgeom
> >> FROM g1 LEFT JOIN g2 ON (g1.the_geom && g2.the_geom AND
> intersects(g1.the_geom, g2.the_geom))
> >>
> >>
> -----------------------------------------------------------------------------------------
> >> Either of case - get all geometries in g1 or g2 or union if there
> is a match - workaround for full joins not working right
> >>
> >> SELECT g1.field1, g1.field2, geomunion(g1.the_geom, g2.the_geom) AS
> newgeom
> >> FROM g1 INNER JOIN g2 ON (g1.the_geom && g2.the_geom AND
> intersects(g1.the_geom, g2.the_geom))
> >> UNION
> >> SELECT g1.field1, g1.field2, g1.the_geom AS newgeom
> >> FROM g1 LEFT JOIN g2 ON (g1.the_geom && g2.the_geom AND
> intersects(g1.the_geom, g2.the_geom))
> >> WHERE g2.the_geom IS NULL
> >> UNION
> >> SELECT g1.field1, g1.field2, g2.the_geom AS newgeom
> >> FROM g2 LEFT JOIN g1 ON (g1.the_geom && g2.the_geom AND
> intersects(g1.the_geom, g2.the_geom))
> >> WHERE g1.the_geom IS NULL
> >>
> >>
> >>
> >> Hope that helps,
> >> Regina
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> -----Original Message-----
> >> From: postgis-users-bounces at postgis.refractions.net
> [mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of
> Andreas Laggner
> >> Sent: Thursday, August 23, 2007 11:32 AM
> >> To: PostGIS Users Discussion
> >> Subject: Re: [postgis-users] Union of 7 datasets
> >>
> >> Obe, Regina schrieb:
> >>
> >>
> >>> Oh the g1 g2 .. was just for example - I don't actually call my
> tables meaningless names like that.
> >>>
> >>> You should be doing a join on something or have a where clause
> unless one of your tables has only one record. Otherwise you are
> doing what is called a CROSS JOIN (cartesian product) which gives you
> an nxm records where n is the number of records in your first table
> and m is the number in second table. This is generally a big NO NO.
> In certain rare cases you do want to do something like that, but is
> usually the exception.
> >>>
> >>>
> >>>
> >>>
> >> I think the records in my targed table must be added (more or less) and
> >> not multiplied! My Aim is a table that contains the areas of all the 7
> >> sourcetables and the information which refuges are inside and wich not.
> >> Perhaps i must use the intersection!? If i do my query with a gist like
> >> this: where t1.the_geom && t2.the_geom; than the operation is very fast
> >> (about one minute) but i only have the Polygons covered by BOTH
> >> datasets, and i want to have as well those, which are covered by one
> >> dataset only!! But my operation without the where clause runs for 4
> >> hours now - that shows me there is something wrong ;-)
> >>
> >>
> >>> Its hard for me to tell if you need a cartesian product in this
> case since I'm not quite sure what for example nature and biosphere
> represent. I would guess that is wrong and you should first figure
> out which sets of say nature records you need to geomunion with
> biosphere and then join by that field or set of fields.
> >>>
> >>> It would help a bit if you could provide some sample questions you
> expect to answer with your statistical analysis. My guess is you may
> be better off with more than one table.
> >>>
> >>>
> >>>
> >>>
> >> Sample question: give me all areas (all polygons) from germany where
> >> landuse=arable land and soils=good and precipitation>600 and any (of 7)
> >> reserves and so on.......
> >> I need the values in my table to calculate the potential yield or other
> >> things...
> >> And i want to analyse such questions with a statistical software (SAS),
> >> so it seems to me i need one table to import in SAS (or to query from
> >> SAS directly to the postgresql).
> >>
> >> Thanks for your help, i will be back in my office in
> Monday.......Andreas
> >>
> >>
> >>
> >>> Which structure is best really boils down to what questions you
> hope to answer because one approach may make one question easy and
> fast and another question slow and cumbersome.
> >>>
> >>> Hope that helps,
> >>> Regina
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> -----Original Message-----
> >>> From: postgis-users-bounces at postgis.refractions.net
> [mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of
> Andreas Laggner
> >>> Sent: Thursday, August 23, 2007 10:04 AM
> >>> To: PostGIS Users Discussion
> >>> Subject: Re: [postgis-users] Union of 7 datasets
> >>>
> >>> Obe, Regina schrieb:
> >>>
> >>>
> >>>
> >>>> Andreas,
> >>>>
> >>>> It would help to know what your table structure looks like and
> why do you want to put them all in a single geometry?
> >>>>
> >>>>
> >>>>
> >>>>
> >>> My table structures are a little bit different. I want to have
> them in a
> >>> single geometry to intersect them with other data and built a large
> >>> table to run statistics over it (production site analysis over
> germany).
> >>>
> >>>
> >>>
> >>>> I'm imaging you are you doing something like
> >>>>
> >>>> SELECT g1.somefield, geomunion(geomunion(g1.the_geom,
> g2.the_geom), g3.the_geom)
> >>>> FROM g1 INNER JOIN g2 on g1.somefield = g2.somefield INNER JOIN
> g3 on g2.somefield = g3.somefield
> >>>> GROUP BY g1.somefield
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>> That´s an interesting method with inner join..why go you call your
> >>> tables g1. g2. and so on?
> >>> That´s my method i am using right now (geomunion 1 to 3 from 6), seems
> >>> to be a pedestrian method :-(
> >>>
> >>> create table natura2000
> >>> (ffh_name character varying(80), ffh_land character
> varying(3), ffh
> >>> smallint, ffh_id smallint,
> >>> spa_name character varying(80), spa_land character
> varying(3), spa
> >>> smallint, spa_id smallint) with oids;
> >>> select
> >>>
> addgeometrycolumn('','natura2000','the_geom','31467','MULTIPOLYGON',2);
> >>> alter table natura2000 drop constraint enforce_geotype_the_geom;
> >>> insert into natura2000
> >>> select
> >>>
> t1.ffh_name,t1.ffh_land,t1.ffh,t1.ffh_id,t2.spa_name,t2.spa_land,t2.spa,t2.spa_id,
> >>> geomunion(t1.the_geom, t2.the_geom)
> >>> from ffh_rep t1, spa_rep t2;
> >>>
> >>> create table sg71
> >>> (ffh_name character varying(80), ffh_land character
> varying(3), ffh
> >>> smallint, ffh_id smallint,
> >>> spa_name character varying(80), spa_land character
> varying(3), spa
> >>> smallint, spa_id smallint,
> >>> bio_name character varying(70), bio smallint, bio_id
> smallint) with
> >>> oids;
> >>> select
> addgeometrycolumn('','sg71','the_geom','31467','MULTIPOLYGON',2);
> >>> alter table sg71 drop constraint enforce_geotype_the_geom;
> >>> insert into sg71
> >>> select t1.ffh_name, t1.ffh_land, t1.ffh, t1.ffh_id, t1.spa_name,
> >>> t1.spa_land, t1.spa, t1.spa_id,
> >>> t2.name,t2.bio,t2.bio_id,geomunion(t1.the_geom,
> t2.the_geom)
> >>> from natura2000 t1, biosphere t2;
> >>>
> >>> create table sg72
> >>> (ffh_name character varying(80), ffh_land character varying(3),
> >>> ffh smallint, ffh_id smallint,
> >>> spa_name character varying(80), spa_land character varying(3), spa
> >>> smallint, spa_id smallint,
> >>> bio_name character varying(70), bio smallint, bio_id smallint,
> >>> np_name character varying(60), np smallint, np_id smallint)
> with oids;
> >>> select
> addgeometrycolumn('','sg72','the_geom','31467','MULTIPOLYGON',2);
> >>> alter table sg72 drop constraint enforce_geotype_the_geom;
> >>> insert into sg72
> >>> select t1.ffh_name, t1.ffh_land, t1.ffh, t1.ffh_id, t1.spa_name,
> >>> t1.spa_land, t1.spa, t1.spa_id,
> >>> t1.bio_name,t1.bio,t1.bio_id,t2.np_name,t2.np,t2.np_id,
> >>> geomunion(t1.the_geom, t2.the_geom)
> >>> from sg71 t1, np t2;
> >>> AND SO ON......
> >>>
> >>>
> >>>
> >>>> or
> >>>>
> >>>> SELECT g1.somefield, geomunion(gt.the_geom)
> >>>> FROM (SELECT somefield, the_geom FROM g1 UNION SELECT somefield,
> the_geom FROM g2 ...) gt
> >>>> GROUP BY gt.somefield
> >>>>
> >>>>
> >>>> If I have 7 different tables that have pretty much the same
> structure, but for logistical or other technical reasons (such as each
> has additional attributes distinct from one another), I need to keep
> them as separate tables, then I usually use inherited tables for that.
> That way when I need to join all datasets at once, I can simply query
> the parent table and it will automatically drill down to the child
> tables. Not sure if that helps more than it confuses your situation.
> >>>>
> >>>> Then instead of the above I can simply do
> >>>>
> >>>> SELEG myparenttable.somefield, geomunion(myparenttable.the_geom)
> >>>> FROM myparenttable
> >>>> GROUP by gh.somefield
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>> ok - i have to think about your suggestions......that´s my second week
> >>> with postgis.
> >>> Can you tell me from my SQL-Statements which method will be best? So i
> >>> try to understand that one.....
> >>>
> >>> Thanks for your reply!!!
> >>>
> >>>
> >>>
> >>>> Hope that helps,
> >>>> Regina
> >>>>
> >>>> -----Original Message-----
> >>>> From: postgis-users-bounces at postgis.refractions.net
> [mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of
> Andreas Laggner
> >>>> Sent: Thursday, August 23, 2007 9:11 AM
> >>>> To: PostGis_Mailinglist
> >>>> Subject: [postgis-users] Union of 7 datasets
> >>>>
> >>>> Hi users,
> >>>>
> >>>> i want to put together 7 datasets to have all the different
> refuges in
> >>>> one table (and in one geometry). Am i doing right with 6 times
> geomunion
> >>>> (that´s much to type with all the attributes) or is there a more
> >>>> effective way?
> >>>>
> >>>> cheers Andreas
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>>
> >>
> >>
> >
> >
> >
>
>
> --
> Dipl. Geoökologe Andreas Laggner
> Institut für Ländliche Räume (LR)
> Bundesforschungsanstalt für Landwirtschaft (FAL)
>
> Institute of Rural Studies
> Federal Agricultural Research Centre (FAL)
>
> Bundesallee 50
> D-38116 Braunschweig
>
> Tel.: (+49) (0)531 596 5515
> Fax: (+49) (0)531 596 5599
> E-mail: andreas.laggner at fal.de
> Homepage: http://www.lr.fal.de/
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
> ------------------------------------------------------------------------
>
> *The substance of this message, including any attachments, may be
> confidential, legally privileged and/or exempt from disclosure
> pursuant to Massachusetts law. It is intended solely for the
> addressee. If you received this in error, please contact the sender
> and delete the material from any computer. *
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
--
Dipl. Geoökologe Andreas Laggner
Institut für Ländliche Räume (LR)
Bundesforschungsanstalt für Landwirtschaft (FAL)
Institute of Rural Studies
Federal Agricultural Research Centre (FAL)
Bundesallee 50
D-38116 Braunschweig
Tel.: (+49) (0)531 596 5515
Fax: (+49) (0)531 596 5599
E-mail: andreas.laggner at fal.de
Homepage: http://www.lr.fal.de/
More information about the postgis-users
mailing list