[postgis-users] Increase KNN efficiency

Nicolas Ribot nicolas.ribot at gmail.com
Sun Jun 3 01:53:33 PDT 2018


Hi Regina,

Thanks for the note !

Nicolas

On 3 June 2018 at 03:59, Regina Obe <lr at pcorp.us> wrote:

> Side note if you are using PostGIS >= 2.2 + PostgreSQL 9.5+, then no need
> to compute distance twice because <-> would give you true distance.  So you
> can shorten a bit (and should improve speed slightly) with
>
>
>
> SELECT
>
>   p1.id,
>
>   t.id AS id2,
>
>   t.dist
>
> FROM parc p1
>
>   CROSS JOIN LATERAL
>
>              (SELECT
>
>                 p2.id,
>
>                p1.geom <-> p2.geom AS dist
>
>               FROM parc p2
>
>               WHERE p1.id <> p2.id
>
>               ORDER BY dist
>
>               LIMIT 1) AS t
>
> ORDER BY p1.id, t.dist;
>
>
>
>
>
> *From:* postgis-users [mailto:postgis-users-bounces at lists.osgeo.org] *On
> Behalf Of *Vispo Leblanc
> *Sent:* Thursday, May 31, 2018 12:05 PM
> *To:* postgis-users at lists.osgeo.org
> *Subject:* Re: [postgis-users] Increase KNN efficiency
>
>
>
> Thanks Nicolas, this is *way better*, I guess my previous query was
> calculating all distance between all elements.
>
>
>
> Thanks again!
>
>
>
> On Thu, May 31, 2018 at 2:47 AM Nicolas Ribot <nicolas.ribot at gmail.com>
> wrote:
>
> Hi,
>
>
>
> Put the KNN operator in a LATERAL subquery, using LIMIT x to limit NN to x
> elements:
>
>
>
> SELECT
>
>   p1.id,
>
>   t.id AS id2,
>
>   t.dist
>
> FROM parc p1
>
>   CROSS JOIN LATERAL
>
>              (SELECT
>
>                 p2.id,
>
>                 st_distance(p1.geom, p2.geom) AS dist
>
>               FROM parc p2
>
>               WHERE p1.id <> p2.id
>
>               ORDER BY p1.geom <-> p2.geom
>
>               LIMIT 1) AS t
>
> ORDER BY p1.id, t.dist;
>
>
>
> Nicolas
>
>
>
> On 30 May 2018 at 23:07, Vispo Leblanc <vispoleblanc at gmail.com> wrote:
>
> Hello PostGIS users,
>
>
>
> I am looking for a way to increase the speed of my query with KNN.
>
>
>
> I have two tables with millions of points in each table. I would like to
> find the nearest neighbor (only one) of my points in table A with my points
> in table B.
>
>
>
> I wrote the following query:
>
>
>
> WITH index_query as (
>   SELECT
>     t1.id,
>     t1.geom AS geom,
>     t2.geom AS geom2,
>     t1.geom <-> t2.geom AS distance
>   FROM research_scrapper.table1 t1,
>       research_scrapper.table2 t2
>   ORDER BY t1.geom <-> t2.geom
> )
> SELECT tbl.cas_id, tbl.geom
> FROM index_query tbl
> INNER JOIN
>     (
>         SELECT geom2,
>                MIN(distance) as distance
>         FROM index_query
>         GROUP BY geom2
>     )tbl1
>     ON tbl1.geom_generated=tbl.geom_generated
>     AND tbl1.distance = tbl.distance
>
>
>
> This is taking forever, would you have any suggestions on how to increase
> the efficiency?
>
>
>
> Thanks for your help,
>
>
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-users
>
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-users
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-users
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20180603/f1a3e693/attachment.html>


More information about the postgis-users mailing list