[postgis-tickets] [PostGIS] #2602: In some cases ST_Distance (geography) would be faster building the index each time

PostGIS trac at osgeo.org
Mon Jan 6 14:43:39 PST 2014


#2602: In some cases ST_Distance (geography) would be faster building the index
each time
---------------------+------------------------------------------------------
 Reporter:  niqueco  |       Owner:  pramsey       
     Type:  defect   |      Status:  new           
 Priority:  medium   |   Milestone:  PostGIS Future
Component:  postgis  |     Version:  trunk         
 Keywords:           |  
---------------------+------------------------------------------------------
 For some complex inputs ST_Distance is very slow, but _st_distancetree
 (which always builds the index) is fast.

 As explained to me by Paul Ramsey in IRC (edited transcript):

 <pramsey>
 For any given SQL query that only run a single calcuation, you'll always
 get the brute force calculation.

 eg, SELECT ST_Distance(a.geog, b.geog) from a, b where a.id = 1 and b.id =
 2;

 On the other hand, if you run a repeated calculation on one geography you
 should get cached behavior and circtree indexes coming into play: SELECT
 ST_Distance(a.geog, b.geog) from a, b where a.id = 1;

 So then we can circle back and ask if that logic is actually good logic.
 Perhaps we should build and use the index every time

 I think, for cases where the number of vertices is large enough the
 overhead of building the index O(n) + O(m) then using it O(log(n)) +
 O(log(m)) is going to be lower than the brute force cost O(m*n).
 If the shapes are small, the machinery of the index will start to outweigh
 the cost of just running the calculation and being done w/ it

 <niqueco> can't be, perhaps, some heuristics.... with the number of
 vertices?

 <pramsey> I think, given your experience, it makes sense, since the
 performance difference is so huge


 Testcase: (with attached sql data loaded)

 select _st_distancetree(geo1, geo2) from x;
  Total runtime: 225.428 ms

 select st_distance(geo1, geo2) from x;
  Total runtime: 244821.796 ms

-- 
Ticket URL: <http://trac.osgeo.org/postgis/ticket/2602>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.


More information about the postgis-tickets mailing list