<html><head><meta http-equiv="Content-Type" content="text/html charset=windows-1252"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">Hi List<br><br>I’ve been playing with different approaches to KNN problem, given the spatial density of features often varies dramatically across urban and rural settings. I’ve written a recursive CTE function, that uses ST_DWithin recursively, expanding the distance threshold each time (but only if necessary). The function can be found here at github:<br><br><a href="https://raw.githubusercontent.com/dimensionaledge/cf_public/master/points/DE_KNN.sql">https://raw.githubusercontent.com/dimensionaledge/cf_public/master/points/DE_KNN.sql</a><br><br>I’ve checked out the Boston GIS pgis_fn_nn example and write-up, but am increasingly erring toward the use of recursive CTE functions as my knowledge grows around how they can be applied to dynamic problems.<br><br>For the KNN recursive function, I’ve compared performance for 3 different use cases, each returning 10500 records.<br><br>Use Case:<br>1) 3500 origin points, 1.8 million Linestring features, K=3…. Query time = 20 seconds.<br>2) 3500 origin points, 3500 Point features, K=3…. Query time = 212 seconds or 3.5 minutes.<br>3) 3500 origin points, 3500 Polygon features (the same points as in (2) except buffered by 1 meter), K=3…. Query time = 7 seconds.<br><br>What surprised me is the results of (3) versus (2). From 212 seconds down to 7 seconds.<br><br>There were geometry indexes on all use cases, but the only difference with (3) is that I buffered the feature set in (2) by 1 meter, in effect creating a Polygon representation of the points.<br><br>I sense this has something to do with the bounding box comparison used by ST_DWithin, as mentioned in the PostGIS manual. I’ve also studied Paul’s tip here:<br><a href="http://postgis.net/2013/08/26/tip_ST_DWithin">http://postgis.net/2013/08/26/tip_ST_DWithin</a><br><br>I’m keen to get the insights of others, and learn ways of further improving my code patterns … ? How have others dealt with KNN problems in PostGIS when working with BIG point feature sets?<br><br>Kind regards<br><br>Mark<br><br>1) Query Form<br>WITH results AS (SELECT ogc_fid, DE_knn(wkb_geometry, 'prep.osm_new_roads_poa11_cleaned_full', 'wkb_geometry', 'lid', 0.0001, 10,3) as knn FROM jtw.nsw_tz_centroids)<br>SELECT ogc_fid, (knn).id as lid, (knn).distance, row_number() over (PARTITION BY ogc_fid ORDER BY 3 ASC) FROM results;<br><br>2) Query Form<br>WITH results AS (SELECT ogc_fid, DE_knn(wkb_geometry, 'jtw.nsw_tz_centroids', 'wkb_geometry', 'ogc_fid', 0.0001,20,8) as knn FROM jtw.nsw_tz_centroids)<br>SELECT ogc_fid, (knn).id as lid, (knn).distance, row_number() over (PARTITION BY ogc_fid ORDER BY 3 ASC) FROM results;<br><br>3) Query Form<br>WITH results AS (SELECT ogc_fid, DE_knn(wkb_geometry, 'jtw.nsw_tz_centroids_buffers', 'wkb_geometry', 'ogc_fid', 0.0001,20,3) as knn FROM jtw.nsw_tz_centroids)<br>SELECT ogc_fid, (knn).id as lid, (knn).distance, row_number() over (PARTITION BY ogc_fid ORDER BY 3 ASC) FROM results;</body></html>