[postgis-users] ST_DWithin slow if query is joined to another table, but fast if ST_DWithin cast to an integer

Paul Ramsey pramsey at cleverelephant.ca
Wed Nov 8 10:59:27 PST 2023



> On Nov 8, 2023, at 10:10 AM, Regina Obe via postgis-users <postgis-users at lists.osgeo.org> wrote:
> 
> First of all is that really your join clause?  You seem to be missing an ON
>  
> from building b
> join customer c

That error neatly explains the performance issue, since it results in a complete cross join of all the data I think? and then the filters get applied, but by then the damage is done…

P

>  
>  
> Also please output
>  
> EXPLAIN 
>  
> And the 
>  
> EXPLAIN ANALYZE of each query
>  
> From: postgis-users <postgis-users-bounces at lists.osgeo.org> On Behalf Of Cameron McCloud via postgis-users
> Sent: Wednesday, November 8, 2023 5:38 AM
> To: postgis-users at lists.osgeo.org
> Cc: Cameron McCloud <cameron.mccloud at gmail.com>
> Subject: Re: [postgis-users] ST_DWithin slow if query is joined to another table, but fast if ST_DWithin cast to an integer
>  
> Test code is here: https://github.com/cameronmccloud/postgis_slow_st_dwithin/blob/main/test.sql
>  
> On Wed, Nov 8, 2023 at 9:25 AM Cameron McCloud <cameron.mccloud at gmail.com <mailto:cameron.mccloud at gmail.com>> wrote:
>> Hi,
>>  
>> Postgis Version: 3.3.4
>> Postgres Version: 14.9
>>  
>> We have 2 tables, "building" with lat/long and "customer" with a FK to building. There's a 1:1 relationship between the two.
>>  
>> The test tables we're using have 16K rows each. Our production data has a lot more, but we could reproduce this on a smaller dataset.
>>  
>> We found some odd behaviour when using ST_DWITHIN in a WHERE clause but only when the "building" table is joined to the "customer" table.
>>  
>> We also found that converting the result of ST_DWITHIN to an integer (1/0 for true/false) and using the integer in the WHERE is fast.
>>  
>> -- This query works as expected and takes 60ms returning 6K rows.
>> select b.*
>> from building b
>> where
>>   st_dwithin(st_makepoint(b.longitude, b.latitude)::geography, st_makepoint(-96.7804060, 33.2471770)::geography, 50000);
>>          
>> -- This query is orders of magnitude slower - 3000ms, even though joining the two tables without the WHERE takes 30ms
>> select b.*
>> from building b
>> join customer c
>> where
>>   st_dwithin(st_makepoint(b.longitude, b.latitude)::geography, st_makepoint(-96.7804060, 33.2471770)::geography, 50000);
>>  
>> -- This query converts the result of ST_DWITHIN to an integer. It's fast and takes 80ms
>> select b.*
>> from building b
>> join customer c
>> where
>>   case st_dwithin(st_makepoint(b.longitude, b.latitude)::geography, st_makepoint(-96.7804060, 33.2471770)::geography, 50000)
>>     when true then 1
>>     else 0
>>   end = 1;
>>  
>> We have no idea why this is the case, but we're curious. In our production scenario using the "case...when" brought a query down from 6 minutes to 6 seconds. We'd love to know why this might be and if there are other scenarios like this that could increase the performance of our Postgis queries.
>>  
>> Thanks,
>>  
>> Cam.
>>  
>>  
>>  
>>  
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-users

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20231108/3b2a794d/attachment.htm>


More information about the postgis-users mailing list