[postgis-devel] ST_DWithin Performance

Zhihong Zhang zhihong at gmail.com
Sat Jan 6 09:06:00 PST 2018


I always run vacuum before benchmarking so statistics should be up to date.

Here are plans for two different queries I use. The one with _ST_DWithin is always faster.

Straight ST_DWithin:

explain analyze select count(*) from assets where ST_DWithin(geometry, 'SRID=4326;POINT (59 32)'::geometry, 10.664129861033185)

Aggregate  (cost=18204.79..18204.80 rows=1 width=8) (actual time=45.987..45.987 rows=1 loops=1)
  ->  Bitmap Heap Scan on assets  (cost=708.70..18201.50 rows=1316 width=0) (actual time=3.298..42.899 rows=16437 loops=1)
        Recheck Cond: (geometry && '0103000020E610000001000000050000006048EFCAFD2A4840C190DE95FB5535406048EFCAFD2A4840A0B7103502554540D05B881A816A5140A0B7103502554540D05B881A81
6A5140C190DE95FB5535406048EFCAFD2A4840C190DE95FB553540'::geometry)
        Filter: (('0101000020E61000000000000000804D400000000000004040'::geometry && st_expand(geometry, '10.6641298610332'::double precision)) AND _st_dwithin(geometry, '010100
0020E61000000000000000804D400000000000004040'::geometry, '10.6641298610332'::double precision))
        Rows Removed by Filter: 3146
        Heap Blocks: exact=5484
        ->  Bitmap Index Scan on assets_geometry_idx  (cost=0.00..708.38 rows=19746 width=0) (actual time=2.640..2.640 rows=19583 loops=1)
              Index Cond: (geometry && '0103000020E610000001000000050000006048EFCAFD2A4840C190DE95FB5535406048EFCAFD2A4840A0B7103502554540D05B881A816A5140A0B7103502554540D05B88
1A816A5140C190DE95FB5535406048EFCAFD2A4840C190DE95FB553540'::geometry)
Planning time: 1.572 ms
Execution time: 46.203 ms

_ST_DWithin with one check:

explain analyze select count(*) from assets where geometry && ST_Expand('SRID=4326;POINT (59 32)'::geometry, 10.664129861033185)  AND _ST_DWithin(geometry, 'SRID=4326;POINT (59 32)'::geometry, 10.6641298610331);

Aggregate  (cost=18120.54..18120.55 rows=1 width=8) (actual time=41.745..41.746 rows=1 loops=1)
  ->  Bitmap Heap Scan on assets  (cost=710.02..18104.09 rows=6582 width=0) (actual time=8.279..36.987 rows=16437 loops=1)
        Recheck Cond: (geometry && '0103000020E610000001000000050000006048EFCAFD2A4840C190DE95FB5535406048EFCAFD2A4840A0B7103502554540D05B881A816A5140A0B7103502554540D05B881A81
6A5140C190DE95FB5535406048EFCAFD2A4840C190DE95FB553540'::geometry)
        Filter: _st_dwithin(geometry, '0101000020E61000000000000000804D400000000000004040'::geometry, '10.6641298610331'::double precision)
        Rows Removed by Filter: 3146
        Heap Blocks: exact=5484
        ->  Bitmap Index Scan on assets_geometry_idx  (cost=0.00..708.38 rows=19746 width=0) (actual time=6.104..6.104 rows=19583 loops=1)
              Index Cond: (geometry && '0103000020E610000001000000050000006048EFCAFD2A4840C190DE95FB5535406048EFCAFD2A4840A0B7103502554540D05B881A816A5140A0B7103502554540D05B88
1A816A5140C190DE95FB5535406048EFCAFD2A4840C190DE95FB553540'::geometry)
Planning time: 0.633 ms
Execution time: 41.833 ms

This is a tiny test table with about 100k rows. As you can see, it takes long time to get a count. Our geometry columns always contain mixed types (points, line-string, polygon, multi-polygon). Wonder if that makes it slow.

Thanks for your help!

Zhihong


> On Jan 6, 2018, at 11:13 AM, Darafei Komяpa Praliaskouski <me at komzpa.net> wrote:
> 
> Hello,
> 
> Would you please post the query plan please?
> It feels like "all geometry columns are indexed" may switch plan to Bitmap And, and that indeed may be slower. 
> Is statistics up-to-date?
> 
> сб, 6 янв. 2018 г. в 19:10, Zhihong Zhang <zhihong at gmail.com <mailto:zhihong at gmail.com>>:
> Thanks for the explanation. I understand the reason for the 2nd check now.
> 
> All our geometry columns are indexed so the 2nd check is redundant, which causes the performance degradation.
> 
> Zhihong
> 
>> On Jan 6, 2018, at 11:00 AM, Nicklas Avén <nicklas.aven at jordogskog.no <mailto:nicklas.aven at jordogskog.no>> wrote:
>> 
>> Hello
>> 
>> If you get a difference as large as 20% I guess you are not using spatial indexes.
>> 
>> The reaason for that implementation is that it gives the query planner the possibility to choose if it shall iterate table1 and use the index of table 2 to find overlaps, or if it shall iterate table 2 instead and use the index of table 1 to find overlap.
>> 
>> The index cannot be used on an expanded geometry (snce the bbox in the index is for the original geometry) so without this double check the planner wouldn't have the choice.
>> 
>> That would be very bad if only 1 of the tables have an index. It is also a bad thing if the table that gets the sequential scan is the one with millions of rows and the index is used on the table with just a few rows.
>> 
>> /Nicklas
>> 
>> 
>> On Sat, 2018-01-06 at 10:14 -0500, Zhihong Zhang wrote:
>>> ST_DWithin is implemented like this,
>>> 
>>> CREATE OR REPLACE FUNCTION ST_DWithin(geom1 geometry, geom2 geometry, float8 RETURNS boolean
>>> AS 'SELECT $1 OPERATOR(@extschema at .&&) @extschema at .ST_Expand($2,$3) AND $2 OPERATOR(@extschema at .&&) @extschema at .ST_Expand($1,$3) AND @extschema at ._ST_DWithin($1, $2, $3)'
>>> LANGUAGE 'sql' IMMUTABLE _PARALLEL;
>>> 
>>> What’s the purpose of the second bounding box check? 
>>> 
>>> I did many tests. It’s always faster (about 20%) with one check. So we ended up using our own version of ST_DWithin.
>>> 
>>> Thanks!
>>> 
>>> Zhihong
>>> 
>>> 
>>> 
>>> 
>>> 
>>> _______________________________________________
>>> postgis-devel mailing list
>>> postgis-devel at lists.osgeo.org <mailto:postgis-devel at lists.osgeo.org>
>>> https://lists.osgeo.org/mailman/listinfo/postgis-devel <https://lists.osgeo.org/mailman/listinfo/postgis-devel>_______________________________________________
>> postgis-devel mailing list
>> postgis-devel at lists.osgeo.org <mailto:postgis-devel at lists.osgeo.org>
>> https://lists.osgeo.org/mailman/listinfo/postgis-devel <https://lists.osgeo.org/mailman/listinfo/postgis-devel>
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at lists.osgeo.org <mailto:postgis-devel at lists.osgeo.org>
> https://lists.osgeo.org/mailman/listinfo/postgis-devel <https://lists.osgeo.org/mailman/listinfo/postgis-devel>_______________________________________________
> postgis-devel mailing list
> postgis-devel at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-devel

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-devel/attachments/20180106/6da761c1/attachment-0001.html>


More information about the postgis-devel mailing list