[postgis-users] ST_Distance_Sphere too slow
woodbri at swoodbridge.com
Thu Mar 24 16:09:06 PDT 2011
On 3/24/2011 6:14 PM, Julian Perelli wrote:
> I don't know if st_expand is faster than st_distance_sphere but with
> EXPLAIN I saw 2 seq scan, and with your approach I see now one seq
> scan, and one index scan, so I suppose your approach is much better,
> and I think that it couldn't be better.
> Could you explain me why that happens? why now is a index scan where
> before was a seq scan?
Spatial indexes only work on the bbox of the geometry and the && is try
if the two geometry bboxes interact with one another.
So if you hav geometry A and B then A && B compares their bboxes. But in
your case you want to find any two geometries that might be as far apart
as 300 meters. If A and B are 300 meters apart then their bboxes will
not interact and you will never be able to select them for computing the
distance. But if we expand the bbox of one of the geometries by 300
meters, then it will interact with the bbox in question and we will
check the distance. If the distance is greater then 300m we will throw
it out anyway.
So && is a fast select using indexes and bboxes.
The distance is not performed on all the pairs that are obviously too
far apart, and since this is an expense computation this is good.
Performance is based more on the false index matches that get rejected
by the more costly distance calculation. If there is not a high
percentage of these then index searchs are extremely fast.
> I made a subset table (only 10 paths!!) to test.
> with your SQL it takes 2,8 sec
> with mine it takes 5,0 sec to complete.
> with a subset of 100 paths I let it for an hour and it doesn't finished.
> the problem is that it's almost unacceptable, with 1000 paths it will
> take eternity!
> (I think it takes exponential time as the subset grows up because of
> the self join)
> maybe is there another way to do this faster... I dont know...
> I want to run this just once, to make a table with this temporary
> results for a larger query in my application, so if it takes 12 hours
> to complete this operation, is ok.
> 2011/3/24 Stephen Woodbridge<woodbri at swoodbridge.com>:
>> I would make sure there is an gist index on path and use a query like:
>> table t1
>> inner join table t2
>> on (t1.path&& st_expand(t2.path, 300/111120) and t1.id!=t2.id and
>> ST_Distance_Sphere(t1.path, t2.path)< 300);
>> the 300/111120 is to convert 300 meters into approximately degrees. If you
>> are worried about some near misses you can expand it a little more and
>> distance_sphere will filter extras out of the results.
>> -Steve W
>> On 3/24/2011 3:43 PM, Julian Perelli wrote:
>>> Hello Postgis List!
>>> I'm trying to get the pair of paths that crosses each other or are 300
>>> meters or less distant from each other, making this SQL.
>>> table t1
>>> inner join table t2
>>> on (t1.id!=t2.id and ST_Distance_Sphere(t1.path, t2.path)< 300);
>>> it was 14 hours running and it doesn't finish... I have 1200+ rows in
>>> the table, each path has between 100 and 500 points.
>>> I tried to make an index on the path column, but when I use explain on
>>> the query, it seems that pg doesn't use the index.
>>> should I increase the memory assigned to pgsql?
>>> I don't know where to begin, what to do, to make this query faster.
>>> Maybe I have an error and it just hangs up.
>>> It would be nice to know how to debug the query.. to see it running or
>>> something like that. EXPLAIN helps, but not too much.
>>> postgis-users mailing list
>>> postgis-users at postgis.refractions.net
>> postgis-users mailing list
>> postgis-users at postgis.refractions.net
> postgis-users mailing list
> postgis-users at postgis.refractions.net
More information about the postgis-users