[postgis-users] speeding up simple pt-in-poly lookups

Paul Ramsey pramsey at opengeo.org
Fri Dec 23 11:06:36 PST 2011


It is, and it's counter-intuitive given our expected performance from
prepared geometries. This could be a side-effect of our required cache
coherence test that sits in front of the actual prepared test.

P.

On Thu, Dec 22, 2011 at 8:55 PM, Martin Davis <mtnclimb at telus.net> wrote:
> Interesting that using ST_Simplify makes such a big difference.
>
> Would it be possible for you to post the sizes of the target geometries
> before and after simplification/buffering?  This might help in determining a
> better solution for the future.
>
>
> On 12/21/2011 12:02 PM, Puneet Kishor wrote:
>>
>> Thanks everyone. Learned a lot. For now, I have decided to not go down the
>> chop-and-reconstruct path. The following query does the job in 1.8 secs
>> which is a pretty nice improvement over the previous 200+ seconds.
>>
>> SELECT Count(*)
>> FROM collections c
>> WHERE ST_Intersects(
>>        c.the_geom, (
>>                SELECT ST_Buffer(ST_Simplify(n.the_geom, 0.75), 0.75)
>>                FROM base.continents n
>>                WHERE n.abbrev = 'NA'
>>        )
>> );
>>
>>
>> On Dec 21, 2011, at 11:38 AM, pcreso at pcreso.com wrote:
>>
>>> Puneet,
>>>
>>> Chopping polygons is pretty simple, with a grid&  st_intersection(), but
>>> you can certainly generalise polygons to reduce the number of vertices&
>>>  size of objects to de-toast... beware however that if you do this then you
>>> are actually moving the polygon boundary,&  therefore a point very near a
>>> boundary may be inside the original country polygon but outside the
>>> generalised/simplified one.
>>>
>>>
>>> You can address this by simplifying a buffer of the polygons, with the
>>> buffer very slightly larger than the simplify distance, so that every
>>> simplified version fully contains the original, but you will also have to
>>> check against the original polygons to confirm the point is genuinely inside
>>> the original.
>>>
>>> As an alternative approach, you might also try selecting points where the
>>> distance from a polygon is zero, as the ST_distance uses stabbing line
>>> algorithm, and may be faster. The distance will be non-zero only for points
>>> outside the polygon.
>>>
>>> Cheers,
>>>
>>>   Brent Wood
>>>
>>> On 21/12/2011, at 12:06 PM, Mr. Puneet Kishor wrote:
>>>
>>>> On Dec 20, 2011, at 9:48 PM, Martin Davis wrote:
>>>>
>>>>> For more detail check out this thread on the same issue:
>>>>>
>>>>>
>>>>> http://postgis.refractions.net/pipermail/postgis-users/2011-November/031345.html
>>>>
>>>>
>>>> Thanks. Chopping up my coverage into hundreds of small regions is the
>>>> last avenue I want to try. Going by the text of that email, it seems that
>>>> "few, large, regions with many vertices (may be) the problem." I will try
>>>> generalizing my continents so that I have "few, large regions with *very
>>>> few* vertices" and see if that speeds up the SELECTs.
>>>>
>>>>
>>>>> On 12/20/2011 5:28 PM, Puneet Kishor wrote:
>>>>>>
>>>>>> On Dec 20, 2011, at 7:21 PM, Paul Ramsey wrote:
>>>>>>
>>>>>>> Chop up the continents into smaller pieces.
>>>>>>>
>>>>>> hmmm... I am not sure I understand the above. And then what? UNION
>>>>>> each smaller piece query?
>>>>>>
>>>>>>
>>>>>>> On Tue, Dec 20, 2011 at 4:35 PM, Puneet Kishor<punk.kish at gmail.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> This is probably a really basic question... my ST_Within or
>>>>>>>> ST_Intersects selecting points in a continent are way too slow (both take
>>>>>>>> upward of 200 secs).
>>>>>>>>
>>>>>>>>        SELECT Count(c_id)
>>>>>>>>        FROM c, continents n
>>>>>>>>        WHERE ST_Intersects(c.the_geom, n.the_geom) AND
>>>>>>>>                n.continent = 'North America';
>>>>>>>>
>>>>>>>>
>>>>>>>> Both tables have gist indexes on the geometries. The above query has
>>>>>>>> the following plan
>>>>>>>>
>>>>>>>> "Aggregate  (cost=9.66..9.67 rows=1 width=4)"
>>>>>>>> "  ->    Nested Loop  (cost=0.00..9.66 rows=1 width=4)"
>>>>>>>> "        Join Filter: _st_intersects(c.the_geom, n.the_geom)"
>>>>>>>> "        ->    Seq Scan on continents n  (cost=0.00..1.10 rows=1
>>>>>>>> width=32)"
>>>>>>>> "              Filter: ((continent)::text = 'North America'::text)"
>>>>>>>> "        ->    Index Scan using pbdb__collections_the_geom on
>>>>>>>> collections c  (cost=0.00..8.30 rows=1 width=104)"
>>>>>>>> "              Index Cond: (c.the_geom&&   n.the_geom)"
>>>>>>>>
>>>>>>>> The table c has approx 120K rows, and the continents table has 8
>>>>>>>> rows.Suggestions on how I can improve this? Yes, the computer is otherwise
>>>>>>>> very swift and modern.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Puneet Kishor
>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>
>>
>> -----
>> No virus found in this message.
>> Checked by AVG - www.avg.com
>> Version: 2012.0.1890 / Virus Database: 2109/4694 - Release Date: 12/21/11
>>
>>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users



More information about the postgis-users mailing list