[postgis-users] Increase KNN efficiency
Regina Obe
lr at pcorp.us
Sat Jun 2 18:59:50 PDT 2018
Side note if you are using PostGIS >= 2.2 + PostgreSQL 9.5+, then no need to compute distance twice because <-> would give you true distance. So you can shorten a bit (and should improve speed slightly) with
SELECT
p1.id <http://p1.id> ,
t.id <http://t.id> AS id2,
t.dist
FROM parc p1
CROSS JOIN LATERAL
(SELECT
p2.id <http://p2.id> ,
p1.geom <-> p2.geom AS dist
FROM parc p2
WHERE p1.id <http://p1.id> <> p2.id <http://p2.id>
ORDER BY dist
LIMIT 1) AS t
ORDER BY p1.id <http://p1.id> , t.dist;
From: postgis-users [mailto:postgis-users-bounces at lists.osgeo.org] On Behalf Of Vispo Leblanc
Sent: Thursday, May 31, 2018 12:05 PM
To: postgis-users at lists.osgeo.org
Subject: Re: [postgis-users] Increase KNN efficiency
Thanks Nicolas, this is way better, I guess my previous query was calculating all distance between all elements.
Thanks again!
On Thu, May 31, 2018 at 2:47 AM Nicolas Ribot <nicolas.ribot at gmail.com <mailto:nicolas.ribot at gmail.com> > wrote:
Hi,
Put the KNN operator in a LATERAL subquery, using LIMIT x to limit NN to x elements:
SELECT
p1.id <http://p1.id> ,
t.id <http://t.id> AS id2,
t.dist
FROM parc p1
CROSS JOIN LATERAL
(SELECT
p2.id <http://p2.id> ,
st_distance(p1.geom, p2.geom) AS dist
FROM parc p2
WHERE p1.id <http://p1.id> <> p2.id <http://p2.id>
ORDER BY p1.geom <-> p2.geom
LIMIT 1) AS t
ORDER BY p1.id <http://p1.id> , t.dist;
Nicolas
On 30 May 2018 at 23:07, Vispo Leblanc <vispoleblanc at gmail.com <mailto:vispoleblanc at gmail.com> > wrote:
Hello PostGIS users,
I am looking for a way to increase the speed of my query with KNN.
I have two tables with millions of points in each table. I would like to find the nearest neighbor (only one) of my points in table A with my points in table B.
I wrote the following query:
WITH index_query as (
SELECT
t1.id <http://t1.id> ,
t1.geom AS geom,
t2.geom AS geom2,
t1.geom <-> t2.geom AS distance
FROM research_scrapper.table1 t1,
research_scrapper.table2 t2
ORDER BY t1.geom <-> t2.geom
)
SELECT tbl.cas_id, tbl.geom
FROM index_query tbl
INNER JOIN
(
SELECT geom2,
MIN(distance) as distance
FROM index_query
GROUP BY geom2
)tbl1
ON tbl1.geom_generated=tbl.geom_generated
AND tbl1.distance = tbl.distance
This is taking forever, would you have any suggestions on how to increase the efficiency?
Thanks for your help,
_______________________________________________
postgis-users mailing list
postgis-users at lists.osgeo.org <mailto:postgis-users at lists.osgeo.org>
https://lists.osgeo.org/mailman/listinfo/postgis-users
_______________________________________________
postgis-users mailing list
postgis-users at lists.osgeo.org <mailto:postgis-users at lists.osgeo.org>
https://lists.osgeo.org/mailman/listinfo/postgis-users
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20180602/165811f7/attachment.html>
More information about the postgis-users
mailing list