[postgis-users] Finding records withing XX distance from a geometry in the same table

Alexander W. Rolek a.rolek at gmail.com
Thu Jul 10 09:54:25 PDT 2014


This looks great. I will give it a run tonight and let you know the
results. I really appreciate the suggestions.


On Thu, Jul 10, 2014 at 9:36 AM, Rémi Cura <remi.cura at gmail.com> wrote:

> Oups, sent it to early.
> If you trust the planner you can simply try :
>
> WITH my_input_polygon AS (
>  SELECT geom
> FROM sdgis.parcels4326 AS parcels
>  WHERE apn = 3500600300
> )
> SELECT mip.*
> FROM my_input_polygon AS mip
> INNER JOIN sdgis.parcels4326 AS p ON
> (
> (ST_DWITHIN(mip.geom,p.geom,2*100)=TRUE)
>  AND (ST_DWITHIN(mip.geom::geography,p.geom::geography,
> 100)=TRUE)
> )
>
> Cheers,
> Rémi-C
>
>
> 2014-07-10 18:33 GMT+02:00 Rémi Cura <remi.cura at gmail.com>:
>
> A single statement
>> , the 2 step is only metaphorical (or can be enforced if you use CTE).
>> Somehting liek this (untested)
>>
>> WITH my_input_polygon AS (
>>  SELECT geom
>> FROM sdgis.parcels4326 AS parcels
>>  WHERE apn = 3500600300
>> )
>> ,filtering_with_geometry AS (
>> SELECT p.*
>> FROM my_input_polygon AS mip
>> INNER JOIN sdgis.parcels4326 AS p ON
>> (ST_DWITHIN(mip.geom,p.geom,2*100)=TRUE)
>> )
>> SELECT fw.*
>> FROM filtering_with_geometry AS fw
>> INNER JOIN sdgis.parcels4326 AS p ON
>> (ST_DWITHIN(mip.geom::geography,p.geom::geography,
>> 100)=TRUE)
>>
>> Cheers,
>> Rémi-C
>>
>>
>> 2014-07-10 18:01 GMT+02:00 Alexander W. Rolek <a.rolek at gmail.com>:
>>
>> The 2 step query is an interesting idea. Would I need to do that in 2
>>> queries, or can that be accomplished in a single statement?
>>>
>>>
>>>
>>> On Thu, Jul 10, 2014 at 8:58 AM, Rémi Cura <remi.cura at gmail.com> wrote:
>>>
>>>> Hey,
>>>> I'm glad you found out.
>>>> I never tried but you could probably try to index on the result of the
>>>> cast :
>>>> CREATE INDEX ON ... USING GIST ((CAST geom AS GEOGRAPHY))
>>>> .
>>>>
>>>> However if your distance are not that big (compaired to earth curve),
>>>> stick to geometry.
>>>> If your distances are big, go geography.
>>>> You could also simply do it in 2 steps, a first step with geometry and
>>>> maybe 2 times the distance to be sure on all the table, a second step more
>>>> precise with geography on only the result of the previous filter.
>>>>
>>>> Cheers,
>>>> Rémi-C
>>>>
>>>>
>>>>
>>>> 2014-07-10 17:47 GMT+02:00 Alexander W. Rolek <a.rolek at gmail.com>:
>>>>
>>>> Remi -
>>>>>
>>>>> Thanks again for the quick response. I dug into this a bit more last
>>>>> night, and apparently the issue was the distance I was using with the
>>>>> ST_DWithin call. When I had a projection under 2230, the unit of measure
>>>>> was feet. With 4326 it appears to be degrees. I kept the distance of 100 on
>>>>> both tests, so the 2230 project returned quick, but a distance of 100 on a
>>>>> 4326 projection is very large so it was returning my whole table. When I
>>>>> changed the distance to something small (0.0009) my query is now fast. ;-)
>>>>>
>>>>> My next challenge will be figuring out the best way to convert between
>>>>> degrees & meters. I understand the geography type is better for queries
>>>>> like this, but I'm already using the geometry type for other queries. Is it
>>>>> common to have both geometry and geography for a record in a table? I know
>>>>> I can cast using ::geography but the queries are dramatically slower.
>>>>>
>>>>> Alex
>>>>>
>>>>>
>>>>> On Thu, Jul 10, 2014 at 1:11 AM, Rémi Cura <remi.cura at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hey,
>>>>>> If I take into account your last mail,
>>>>>> you probably have forget to transform your data.
>>>>>>
>>>>>> If you have a table in srid 4326
>>>>>> , you can't use a transform in your querry if you want it to uses
>>>>>> indexes.
>>>>>> OR you have to buil and index like this :
>>>>>> CREATE INDEX ON sdgis.parcels4326 USING GIST (ST_Transform(geom,
>>>>>> 4326));
>>>>>>
>>>>>> So can you confirm thatl your geom have the correct srid in the
>>>>>> correct table?
>>>>>>
>>>>>> SELECT DISTINCT ST_SRID(geom)
>>>>>> FROM sdgis.parcels ;
>>>>>>
>>>>>> and
>>>>>> SELECT DISTINCT ST_SRID(geom)
>>>>>> FROM sdgis.parcels4326 ;
>>>>>>
>>>>>> You uses pgadmin , so instead of running your querry
>>>>>> , run it with "explain analyse" (Select the text of your querry, then
>>>>>> press shift+F7).
>>>>>> You should see a graphic explaining what it happening. You should
>>>>>> look for sequential scans.
>>>>>>
>>>>>> You can also try the same querry without CTE :
>>>>>>
>>>>>> SELECT p.*
>>>>>> FROM (
>>>>>>  SELECT geom
>>>>>>  FROM sdgis.parcels4326 AS parcels
>>>>>>  WHERE apn = '3500600300
>>>>>> ) AS mip
>>>>>> INNER JOIN sdgis.parcels4326 AS p ON
>>>>>> (ST_DWITHIN(mip.geom,p.geom,100)=TRUE);
>>>>>>
>>>>>>
>>>>>> Of course vacuum analyze both table before running the querry.
>>>>>> It would be difficult to help you more without detailed information
>>>>>> (tables declarations, querry used, result of explain analyse), because
>>>>>> slowness can come from several reasons .
>>>>>>
>>>>>> Here is a link of requirements for people having slow querry on
>>>>>> postgres mailing list (
>>>>>> https://wiki.postgresql.org/wiki/Slow_Query_Questions).
>>>>>> I'm not guru so I couldn't use all this informations but that give
>>>>>> you an idea of how many reasons of slow query they can be
>>>>>>
>>>>>> Cheers,
>>>>>> Rémi-C
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> ""
>>>>>> I tried the query that Remi suggested (thank you!) and it works and
>>>>>> doesn't work. I have the exact same table in two different projections
>>>>>> (2230, 4326). When I run the ST_Dwithin query against the 2230 table, I get
>>>>>> results in under 100 ms. When I run the same query against the 4326
>>>>>> projection it still takes around 270 seconds! I have confirmed that I have
>>>>>> the gist index on the 4326 table. Any ideas why the 4326 projection would
>>>>>> be dramatically slower?
>>>>>>
>>>>>> ""
>>>>>>
>>>>>>
>>>>>> 2014-07-09 18:38 GMT+02:00 Alexander W. Rolek <a.rolek at gmail.com>:
>>>>>>
>>>>>> Remi -
>>>>>>>
>>>>>>> Thanks for the quick response. Sorry to respond to you directly, but
>>>>>>> my message settings are set on digest so I can't respond to the thread yet.
>>>>>>>
>>>>>>> I went through your steps, and have built out my query, but it's
>>>>>>> still taking around 270 seconds to run the query, and it's returning ever
>>>>>>> record in my table. Here's my query:
>>>>>>>
>>>>>>> WITH my_input_polygon AS (
>>>>>>> SELECT geom
>>>>>>> FROM sdgis.parcels4326 AS parcels
>>>>>>>  WHERE apn = '3500600300'
>>>>>>> )
>>>>>>> SELECT p.*
>>>>>>> FROM my_input_polygon AS mip
>>>>>>> INNER JOIN sdgis.parcels4326 AS p ON
>>>>>>> (ST_DWITHIN(mip.geom,p.geom,100)=TRUE);
>>>>>>>
>>>>>>> I also have indexes on the apn and the geom columns (see attached
>>>>>>> screen shots)
>>>>>>>
>>>>>>> Any ideas what I'm missing here?
>>>>>>>
>>>>>>> Thanks again for the help
>>>>>>>
>>>>>>> --
>>>>>>> Alexander W. Rolek
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Alexander W. Rolek
>>>>> 303-829-9989
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Alexander W. Rolek
>>> 303-829-9989
>>>
>>
>>
>


-- 
Alexander W. Rolek
303-829-9989
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20140710/2d7f654d/attachment.html>


More information about the postgis-users mailing list