[postgis-users] [OFF-TOPIC] SQL question

Wed Feb 18 19:06:04 PST 2015

I have an interesting join problem that I'm not sure how to tackle in sql.

I have two tables that I need to join (well really match) like this:

create table a (
   gid serial,
   item text,
   data text
);

create table b (
   gid serial,
   item text,
   moredata text
);

gid can be used to force correct ordering of the records.
item can be one or more rows with the same value in it (for example 
there might be 10 item='foo' records)
data and more data is stuff related to the given record.

What I know is that:

1. if table a has N rows for a given item then table b will have N rows
2. that the N rows in table a and table b are in the same order by gid
3. it is not safe to assume that a.gid=b.gid will link the correct records

What I need is to match/join is:

a.item.row[1] to b.item.row[1]
a.item.row[2] to b.item.row[2]
...
a.item.row[N] to b.item.row[N]

where a.item=b.item

Any thoughts on how to solve this with SQL?

select aa.item, aa.cnt, bb.cnt
   from (select item, count(*) as cnt from a group by item) aa
   left outer join (select item, count(*) as cnt from b group by item) b
     on aa.item=bb.item
  order by aa.item;

So likewise, we should be able to do something like:

select aa.item, aa.data, bb.moredata
   from (select item, data from a order by item, gid) aa
   left outer join (select item, moredata from b order by item, gid) b
     on aa.item=bb.item and a.itemrow=b.itemrow
  order by aa.item;

This needs some way of assigning itemrow numbers that can be matched. I 
think this is an application of over() but I'm not quite sure how to 
apply it.

Thoughts would be appreciated :)

Thanks,
   -Steve