# [postgis-users] joining "nearest" data

Obe, Regina robe.dnd at cityofboston.gov
Tue Feb 13 15:01:08 PST 2007

```My general thought was to create a subselect that for every station gives you how far the closest solar is.  The reason we can't just pull the closest solar directly is that you can't include all the solar fields without grouping which would then ruin the min distance check

Step 1 - cartesian product that gives distance between each solar location from each station - this could be really slow depending on how big your tables are
SELECT station_id, distance(s.thepoint_meter, sol.thepoint_meter)  As dist
FROM station s, solar sol
(you are right I'm aliasing station s and sol for solar and I made the possibly incorrect assumption that you have a thepoint_meter in your solar table.  If you don't you'd have to do another alias join with station to get the point location.).

Step 2 - For each station tell me how far the closest solar is
SELECT station_id, min(distance(s.thepoint_meter, sol.thepoint_meter))  As mindist
FROM station s, solar sol
GROUP BY s.station_id

the mindist is now the alias for distance of closest solar.  We will call this whole subselect - solmin

Step 3 -  Which solar is that far away - this we use to pull the details of the solar that is mindist away from our station

(SELECT station_id, min(distance(s.thepoint_meter, sol.thepoint_meter))  As mindist
FROM station s, solar sol
GROUP BY s.station_id)  As solmin  INNER JOIN solar ON distance(station.thepoint_meter,
solar.thepoint_meter) = solmin.mindist

The OR station id I was hoping to save so query cycles, thinking if the station ids matched then it is that station, but I don't think there is must savings there.

Hope that explanation helps.

Regina

________________________________

From: postgis-users-bounces at postgis.refractions.net on behalf of Kirk Wythers
Sent: Tue 2/13/2007 5:23 PM
To: PostGIS Users Discussion
Subject: Re: [postgis-users] joining "nearest" data

On Feb 13, 2007, at 9:20 AM, Obe, Regina wrote:

> What't the nextval('climate_id_seq') in your view for?

nextval was actually a left over from earlier attempts to create a
workable primary key. The better solution was creating obs_id
(observation_id by combining station_id and date resulting in a
unique value since each site had only one observation date)

> Also wouldn't
> all your weather data have sites - seems like it should be sites LEFT
> OUTER JOIN weather.

You are correct.

>
> Here is my thought on how to get the nearest solar using a subselect
> that finds the mindistance between a station and a solar point.
> You may
> get more than one record if there are more solar locations equidistant
> from your station in close proximity to station.
>
>
> FROM (weather LEFT OUTER JOIN sites USING (station_id))
>       LEFT JOIN
>       (s.station_id, min(distance(s.thepoint_meter,
> sol.thepoint_meter) as mindist FROM solar sol, sites s GROUP BY
> s.station_id) solmin
>       ON solmin.station_id = sites.station_id  AND (solar.station_id =
> solmin.station_id or distance(station.thepoint_meter,
> solar.thepoint_meter) = solmin.mindist  )

I am having trouble reading my way through this. It looks like you
are using sol as an alias for solar and s for sites. Correct? Also,
how are you using GROUP BY? And can you explain solmin and mindist in
a little more detail for the weak of mind? Where did they come from?

_______________________________________________
postgis-users mailing list
postgis-users at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users

-----------------------------------------
The substance of this message, including any attachments, may be
confidential, legally privileged and/or exempt from disclosure
pursuant to Massachusetts law. It is intended