[postgis-users] speeding up KNN query

Ilia ilia at lobsanov.com
Mon Jul 22 13:48:26 PDT 2013


I had change the query since yesterday. See below.
It is now taking 115ms, which is quite slow.
I'm now ordering by ST_Distance_Sphere instead of ST_Distance.

WITH knn_query AS
 (
 SELECT
 *
 FROM places
 WHERE "places"."place_category_id" = 10
 ORDER by geom <#> 'SRID=4326;POINT(-79.3937393 43.6526704)' LIMIT 50
 )
 SELECT *,
 ST_Distance_Sphere(geom, 'SRID=4326;POINT(-79.3937393 43.6526704)') as
distance
 FROM knn_query
 ORDER BY distance ASC
 LIMIT 5

Here's the explain analyze:

 Limit  (cost=169.13..169.14 rows=5 width=1704) (actual
time=115.621..115.622 rows=5 loops=1)
   CTE knn_query
     ->  Limit  (cost=0.00..154.68 rows=50 width=3406) (actual
time=0.179..1.006 rows=50 loops=1)
           ->  Index Scan using index_places_on_geom on places
 (cost=0.00..3498.78 rows=1131 width=3406) (actual time=0.178..0.995
rows=50 loops=1)
                 Order By: (geom <#>
'0101000020E61000009829520633D953C09B8823B48AD34540'::geometry)
                 Filter: (place_category_id = 10)
                 Rows Removed by Filter: 136
   ->  Sort  (cost=14.46..14.58 rows=50 width=1704) (actual
time=115.620..115.620 rows=5 loops=1)
         Sort Key: (_st_distance(geography(knn_query.geom),
'0101000020E61000009829520633D953C09B8823B48AD34540'::geography, 0::double
precision, false))
         Sort Method: top-N heapsort  Memory: 29kB
         ->  CTE Scan on knn_query  (cost=0.00..13.62 rows=50 width=1704)
(actual time=67.064..115.540 rows=50 loops=1)
 Total runtime: 115.698 ms



On Mon, Jul 22, 2013 at 11:01 AM, David Rowley <dgrowleyml at gmail.com> wrote:

> On Tue, Jul 23, 2013 at 2:17 AM, Ilia <ilia at lobsanov.com> wrote:
>
>> Thanks David for your feedback. I got it down to 4ms or so. Below is the
>> explain analyze. It's 2 subqueries now. Any ideas for simplification of the
>> query?
>>
>> explain analyze WITH knn_query AS
>> (
>>  SELECT
>>  *
>>  FROM places
>>  WHERE "places"."place_category_id" = 10
>>  ORDER by geom <#> 'SRID=4326;POINT(-79.3937393 43.6526704)' LIMIT 50
>>  )
>> SELECT *,
>> ST_Distance_Sphere(geom, 'SRID=4326;POINT(-79.3937393 43.6526704)') as
>> distance
>> FROM (
>> SELECT * FROM knn_query
>> ORDER BY ST_Distance(geom, 'SRID=4326;POINT(-79.3937393 43.6526704)') ASC
>> LIMIT 5
>> ) places;
>>
>
> I thought maybe:
>
> SELECT *,ST_Distance_Sphere(geom, 'SRID=4326;POINT(-79.3937393
> 43.6526704)') as distance
> FROM (SELECT * FROM places
>
>       WHERE "places"."place_category_id" = 10
>       ORDER by geom <#> 'SRID=4326;POINT(-79.3937393 43.6526704)'
>       LIMIT 50) AS places
>
> ORDER BY ST_Distance(geom, 'SRID=4326;POINT(-79.3937393 43.6526704)') ASC
> LIMIT 5;
>
>
> I don't think you really need the extra nest level in the sub query for
> the one you wrote, but I assume you added it because you tried to do ORDER
> BY distance? and got an error?
> The good news is that ST_Distance is an immutable function, so PostgreSQL
> likely won't execute it twice for the same inputs on the same query.
> I also noticed you are using both ST_Distance_Sphere and ST_Distance(), is
> this meant? I'm quite new to postgis, but I think the only difference is
> ST_Distance_Sphere is faster is it's not working with a complex spheroid
> shape. So if you're doing this for performance then it's not a good idea as
> it means 2 functions need executed instead of 1 (at least with my query).
>
> SELECT *,ST_Distance(geom, 'SRID=4326;POINT(-79.3937393 43.6526704)') as
> distance
> FROM (SELECT * FROM places
>
>       WHERE "places"."place_category_id" = 10
>       ORDER by geom <#> 'SRID=4326;POINT(-79.3937393 43.6526704)'
>        LIMIT 50) AS places
>
> ORDER BY ST_Distance(geom, 'SRID=4326;POINT(-79.3937393 43.6526704)') ASC
> LIMIT 5;
>
> Will likely be slightly faster, though I highly doubt you'll notice much
> difference on 5 rows. But obviously it will give slightly different results.
>
> I take it you're happy enough with the 4ms from 200ms?
>
> Regards
>
> David
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20130722/5f250be8/attachment.html>


More information about the postgis-users mailing list