[postgis-tickets] [PostGIS] #2602: In some cases ST_Distance (geography) would be faster building the index each time
PostGIS
trac at osgeo.org
Mon Jan 6 14:43:39 PST 2014
#2602: In some cases ST_Distance (geography) would be faster building the index
each time
---------------------+------------------------------------------------------
Reporter: niqueco | Owner: pramsey
Type: defect | Status: new
Priority: medium | Milestone: PostGIS Future
Component: postgis | Version: trunk
Keywords: |
---------------------+------------------------------------------------------
For some complex inputs ST_Distance is very slow, but _st_distancetree
(which always builds the index) is fast.
As explained to me by Paul Ramsey in IRC (edited transcript):
<pramsey>
For any given SQL query that only run a single calcuation, you'll always
get the brute force calculation.
eg, SELECT ST_Distance(a.geog, b.geog) from a, b where a.id = 1 and b.id =
2;
On the other hand, if you run a repeated calculation on one geography you
should get cached behavior and circtree indexes coming into play: SELECT
ST_Distance(a.geog, b.geog) from a, b where a.id = 1;
So then we can circle back and ask if that logic is actually good logic.
Perhaps we should build and use the index every time
I think, for cases where the number of vertices is large enough the
overhead of building the index O(n) + O(m) then using it O(log(n)) +
O(log(m)) is going to be lower than the brute force cost O(m*n).
If the shapes are small, the machinery of the index will start to outweigh
the cost of just running the calculation and being done w/ it
<niqueco> can't be, perhaps, some heuristics.... with the number of
vertices?
<pramsey> I think, given your experience, it makes sense, since the
performance difference is so huge
Testcase: (with attached sql data loaded)
select _st_distancetree(geo1, geo2) from x;
Total runtime: 225.428 ms
select st_distance(geo1, geo2) from x;
Total runtime: 244821.796 ms
--
Ticket URL: <http://trac.osgeo.org/postgis/ticket/2602>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.
More information about the postgis-tickets
mailing list