[postgis-users] speeding up KNN query
David Rowley
dgrowleyml at gmail.com
Mon Jul 22 08:01:01 PDT 2013
On Tue, Jul 23, 2013 at 2:17 AM, Ilia <ilia at lobsanov.com> wrote:
> Thanks David for your feedback. I got it down to 4ms or so. Below is the
> explain analyze. It's 2 subqueries now. Any ideas for simplification of the
> query?
>
> explain analyze WITH knn_query AS
> (
> SELECT
> *
> FROM places
> WHERE "places"."place_category_id" = 10
> ORDER by geom <#> 'SRID=4326;POINT(-79.3937393 43.6526704)' LIMIT 50
> )
> SELECT *,
> ST_Distance_Sphere(geom, 'SRID=4326;POINT(-79.3937393 43.6526704)') as
> distance
> FROM (
> SELECT * FROM knn_query
> ORDER BY ST_Distance(geom, 'SRID=4326;POINT(-79.3937393 43.6526704)') ASC
> LIMIT 5
> ) places;
>
I thought maybe:
SELECT *,ST_Distance_Sphere(geom, 'SRID=4326;POINT(-79.3937393
43.6526704)') as distance
FROM (SELECT * FROM places
WHERE "places"."place_category_id" = 10
ORDER by geom <#> 'SRID=4326;POINT(-79.3937393 43.6526704)'
LIMIT 50) AS places
ORDER BY ST_Distance(geom, 'SRID=4326;POINT(-79.3937393 43.6526704)') ASC
LIMIT 5;
I don't think you really need the extra nest level in the sub query for the
one you wrote, but I assume you added it because you tried to do ORDER BY
distance? and got an error?
The good news is that ST_Distance is an immutable function, so PostgreSQL
likely won't execute it twice for the same inputs on the same query.
I also noticed you are using both ST_Distance_Sphere and ST_Distance(), is
this meant? I'm quite new to postgis, but I think the only difference is
ST_Distance_Sphere is faster is it's not working with a complex spheroid
shape. So if you're doing this for performance then it's not a good idea as
it means 2 functions need executed instead of 1 (at least with my query).
SELECT *,ST_Distance(geom, 'SRID=4326;POINT(-79.3937393 43.6526704)') as
distance
FROM (SELECT * FROM places
WHERE "places"."place_category_id" = 10
ORDER by geom <#> 'SRID=4326;POINT(-79.3937393 43.6526704)'
LIMIT 50) AS places
ORDER BY ST_Distance(geom, 'SRID=4326;POINT(-79.3937393 43.6526704)') ASC
LIMIT 5;
Will likely be slightly faster, though I highly doubt you'll notice much
difference on 5 rows. But obviously it will give slightly different results.
I take it you're happy enough with the 4ms from 200ms?
Regards
David
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20130723/41666e44/attachment.html>
More information about the postgis-users
mailing list