[postgis-users] Distance() vs ? - performance

Paul Ramsey pramsey at refractions.net
Tue Feb 17 17:57:22 PST 2004


I would suggest doing this with a little external scripting on top of 
the database.  SQL is *not* the optimal language for GIS processing, so 
layering a little logic on top can make a world of difference. Finding a 
minimum distance is already hard, doing for every combination is 
particularly nasty. You will have to tune some magic parameters...

Fire up your perl, or php, or java interpretter (exceptionally pseudo 
code follows):

foreach forest_poly {
	rowset = select * from roads where road_geom && box(forest_poly 
expanded by reasonable search distance) order by 
distance(road_geom,forest_poly)
         if size(rowset) > 0 then return rowset(0)
         else expand search distance and start again
}

I guarantee this process will return you the results in a much more 
reasonable amount of time. The "reasonable search distance" is the magic 
number. You want it to return "a few" roads in "most" forest poly cases, 
but with an expansion algorithm that doubles it whenever you turn up 0 
candidates things should be pretty efficient.

Paul


Tyler Mitchell wrote:

> 
>  > This is probably one of those cases where a fuller explanation of your
>  > problem would help. What are you calculating distances *for*, what are
>  > your inputs, desired outputs, general description of the problem. It is
>  > possible that a different approach to your problem could yeild a better
>  > result.
> 
> I thought I had! :)
> 
> I've got 54,000 polygons covering an area of 611,000 hectares.  These 
> polygons represent forest cover types and they are uniquely identified 
> with a field called link.  The table is called vri_polygons.
> 
> I have another table with road linework.  95,000 features spread over 
> the exact same landbase.  The table is called trim_roads.
> 
> What I want to do is calculate the shortest distance from each polygon 
> to each road.  I want a list of link's and a minium distance to roads.
> 
> Here's what I initially did but found performance and memory usage went 
> gonzo.
> 
> CREATE table vri_road_dist as
> (
> SELECT vri.link, 
> min(distance(vri.wkb_geometry,roads.wkb_geometry)::integer) AS 
> MIN_ROAD_DIST
> FROM vri_polygons vri, trim_roads roads
> GROUP BY vri.link
> );
> 
> I assume this forces a cartesian join and tries to load both 
> intersecting tables into RAM?  Both tables have GIST indexes.
> 
> Tyler
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users


-- 
       __
      /
      | Paul Ramsey
      | Refractions Research
      | Email: pramsey at refractions.net
      | Phone: (250) 885-0632
      \_




More information about the postgis-users mailing list