[postgis-users] https://postgis.net/docs/geometry_distance_knn.html and index usage

Jim Klassen klassen.js at gmail.com
Wed Jun 22 08:27:59 PDT 2022


The second note on the documentation page you referenced: "Index only kicks in if one of the geometries is a constant (not in a subquery/cte). e.g. 'SRID=3005;POINT(1011102 450541)'::geometry instead of a.geom".  Neither value passed to "<->" in your query is a constant.

On 6/22/22 09:22, Lars Aksel Opsahl wrote:
> Hi
>
> I have two simple tables.
>
> Table"pg_temp_30.g1temp"
> Column|Type| Collation | Nullable|Default
> --------+------------------------+-----------+----------+------------------------------------
> geo| geometry(Polygon,4258) | ||
> id | integer| | notnull| nextval('g1temp_id_seq'::regclass)
> Indexes:
> "g1temp_pkey"PRIMARYKEY, btree(id)
> "g1temp_geo_idx"gist(geo)
>
>
> Table"pg_temp_30.g2temp"
> Column|Type| Collation | Nullable|Default
> --------+------------------------+-----------+----------+------------------------------------
> geo| geometry(Polygon,4258) | ||
> id | integer| | notnull| nextval('g2temp_id_seq'::regclass)
> Indexes:
> "g2temp_pkey"PRIMARYKEY, btree(id)
> "g2temp_geo_idx"gist(geo)
>
>
> And I need to find the polygon in g2temp that has the longest distance to any polygon in d1temp.
> g1temp has 542 rows and table g2temp has 63483 rows. All polygons are simple rectangles.
>
> I run analyze
> analyzeg1temp;
> analyzeg2temp;
>
>
> And then run this query
>
> EXPLAINANALYZE
> WITHindex_query AS(
> SELECTg2.geo, ST_Distance(g1.geo, g2.geo,true) ASd, 1 asrunnning_jos
> FROM
> g1temp ASg1,
> g2temp ASg2
> ORDERBYg1.geo<-> g2.geodesc
> LIMIT1
> )
> SELECT*
> FROMindex_query;
>
> And I get this
>
> QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------------------
> Subquery Scan on index_query(cost=43613127.36..43613153.64 rows=1 width=132) (actual time=69029.773..69029.775 rows=1 loops=1)
> ->Limit(cost=43613127.36..43613153.63 rows=1 width=140) (actual time=69029.771..69029.772 rows=1 loops=1)
> ->Result(cost=43613127.36..947419646.12 rows=34407786 width=140) (actual time=69029.769..69029.770 rows=1 loops=1)
> ->Sort(cost=43613127.36..43699146.83 rows=34407786 width=292) (actual time=69019.575..69019.576 rows=1 loops=1)
> Sort Key: ((g1.geo <-> g2.geo)) DESC
> Sort Method: top-N heapsortMemory: 29kB
> ->Nested Loop(cost=0.00..43441088.43 rows=34407786 width=292) (actual time=0.184..58446.116 rows=34407786 loops=1)
> ->Seq Scan on g2temp g2(cost=0.00..1245.33 rows=63483 width=120) (actual time=0.020..31.744 rows=63483 loops=1)
> ->Materialize(cost=0.00..14.63 rows=542 width=160) (actual time=0.000..0.030 rows=542 loops=63483)
> ->Seq Scan on g1temp g1(cost=0.00..11.92 rows=542 width=160) (actual time=0.014..0.202 rows=542 loops=1)
> Planning Time: 0.320 ms
> Execution Time: 69029.872 ms
> (12 rows)​
>
> I do understand that this may take time, but I wonder why I see this "Seq scan" both on g1 and g2 ?
>
> From what I read on https://postgis.net/docs/geometry_distance_knn.html it should use indexes  in some way.
>
> Have also tested with casting geography when creating table g1 and g2 and that did not make any difference.
>
> Tested also with big workmem (1000 MB ) and the result was the same.
>
> I am running on POSTGIS="3.3.0dev 3.1.0alpha2-1532-gc8eedf3ae" [EXTENSION] PGSQL="120" GEOS="3.9.0-CAPI-1.16.2" PROJ="7.2.1" LIBXML="2.9.10" LIBJSON="0.13.1" LIBPROTOBUF="1.3.3" WAGYU="0.5.0 (Internal)" TOPOLOGY
> (1 row)
>
> Thanks.
>
> Lars
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-users
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20220622/eff04fa2/attachment.htm>


More information about the postgis-users mailing list