[postgis-users] speeding up KNN query

David Rowley dgrowleyml at gmail.com
Mon Jul 22 08:01:01 PDT 2013


On Tue, Jul 23, 2013 at 2:17 AM, Ilia <ilia at lobsanov.com> wrote:

> Thanks David for your feedback. I got it down to 4ms or so. Below is the
> explain analyze. It's 2 subqueries now. Any ideas for simplification of the
> query?
>
> explain analyze WITH knn_query AS
> (
>  SELECT
>  *
>  FROM places
>  WHERE "places"."place_category_id" = 10
>  ORDER by geom <#> 'SRID=4326;POINT(-79.3937393 43.6526704)' LIMIT 50
>  )
> SELECT *,
> ST_Distance_Sphere(geom, 'SRID=4326;POINT(-79.3937393 43.6526704)') as
> distance
> FROM (
> SELECT * FROM knn_query
> ORDER BY ST_Distance(geom, 'SRID=4326;POINT(-79.3937393 43.6526704)') ASC
> LIMIT 5
> ) places;
>

I thought maybe:

SELECT *,ST_Distance_Sphere(geom, 'SRID=4326;POINT(-79.3937393
43.6526704)') as distance
FROM (SELECT * FROM places
      WHERE "places"."place_category_id" = 10
      ORDER by geom <#> 'SRID=4326;POINT(-79.3937393 43.6526704)'
      LIMIT 50) AS places
ORDER BY ST_Distance(geom, 'SRID=4326;POINT(-79.3937393 43.6526704)') ASC
LIMIT 5;


I don't think you really need the extra nest level in the sub query for the
one you wrote, but I assume you added it because you tried to do ORDER BY
distance? and got an error?
The good news is that ST_Distance is an immutable function, so PostgreSQL
likely won't execute it twice for the same inputs on the same query.
I also noticed you are using both ST_Distance_Sphere and ST_Distance(), is
this meant? I'm quite new to postgis, but I think the only difference is
ST_Distance_Sphere is faster is it's not working with a complex spheroid
shape. So if you're doing this for performance then it's not a good idea as
it means 2 functions need executed instead of 1 (at least with my query).

SELECT *,ST_Distance(geom, 'SRID=4326;POINT(-79.3937393 43.6526704)') as
distance
FROM (SELECT * FROM places
      WHERE "places"."place_category_id" = 10
      ORDER by geom <#> 'SRID=4326;POINT(-79.3937393 43.6526704)'
      LIMIT 50) AS places
ORDER BY ST_Distance(geom, 'SRID=4326;POINT(-79.3937393 43.6526704)') ASC
LIMIT 5;

Will likely be slightly faster, though I highly doubt you'll notice much
difference on 5 rows. But obviously it will give slightly different results.

I take it you're happy enough with the 4ms from 200ms?

Regards

David
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20130723/41666e44/attachment.html>


More information about the postgis-users mailing list