[postgis-users] Large geometry issue

Mark Leslie mark at refractions.net
Fri Jul 6 13:15:57 PDT 2007


Greg,
I've noticed that you're using a Geos release candidate.  I've tried
this query against PostGIS 1.2.1 with the official Geos 3.0.0 and was
given the 241 results for my trouble.  Are you able to upgrade the Geos
on that machine and give it a try?
Mark

Gregory Williamson wrote:
> I think the zipped file I first sent should have all of that data ... this was a test against the original full tables but I have reproduced it against the smaller sample, which has all of the small tiles with the bounding box of the large geometry.
> 
> Using the intersection in JUMP seems to come up with a number very similar to using the distance(x,y) = 0 test; a rough count by hand seems to believe the 241 count.
> 
> I can resend the data if need be; let me know if there is any other info I can provide ... maybe we have a busted build ? Both servers are postgres 8.2.4 on slightly different flavors of linux.
> 
> Thanks!
> 
> Greg W.
> 
> -----Original Message-----
> From: postgis-users-bounces at postgis.refractions.net on behalf of Paul Ramsey
> Sent: Thu 7/5/2007 10:29 PM
> To: PostGIS Users Discussion
> Cc: Mark Leslie
> Subject: Re: [postgis-users] Large geometry issue
>  
> That is indeed concerning, can you send us the big shape and a few of  
> the small shapes that 2.2.3 is turning up as containments and 3.0.0  
> is not?
> 
> P
> 
> On 5-Jul-07, at 6:11 PM, Gregory Williamson wrote:
> 
>> Having has a little time to look at this in detail I can see why  
>> this is ugly ... a very long sinuous shape vs a ton of small  
>> rectangles, so the bounding box does get a lot of candidates which  
>> then have to be winnowed.
>>
>> I do find myself wondering at this result:
>>
>>  POSTGIS="1.2.1" GEOS="3.0.0rc4-CAPI-1.3.3" USE_STATS
>> select count(*) from wsc_candidates w, oli_req x WHERE w.geometry  
>> && x.bbox AND contains(x.oli_req_geom,w.geometry) AND x.oli_req_id  
>> = 114672;
>>  count
>> -------
>>      0
>> (1 row)
>>
>> Time: 4347.776 ms
>>
>> versus
>> POSTGIS="1.2.1" GEOS="2.2.3-CAPI-1.1.1" PROJ="Rel. 4.4.9, 29 Oct  
>> 2004" USE_STATS
>> # select count(*) from wsc_candidates w, oli_req x WHERE w.geometry  
>> && x.bbox AND contains(x.oli_req_geom,w.geometry) AND x.oli_req_id  
>> = 114672;
>>  count
>> -------
>>    241
>> (1 row)
>>
>> Time: 15956.425 ms
>>
>> Not the time so much -- the box is a bit older -- but the disparity  
>> in rows. 241 seems more likely from eyeballing the data in JUMP ...  
>> or have I overlooked something obvious ?
>>
>> Thanks,
>>
>> Greg W.
>>
>>
>> -----Original Message-----
>> From: postgis-users-bounces at postgis.refractions.net on behalf of  
>> Martin Davis
>> Sent: Thu 7/5/2007 5:32 PM
>> To: PostGIS Users Discussion
>> Subject: Re: [postgis-users] Large geometry issue
>>
>> No problem.  Actually it's me being blind - I just saw the attachment
>> with the test data.
>>
>> 2530 vertices isn't a very big polygon.  So this isn't a problem with
>> lots of holes or shells, it simply reveals that we're not  
>> extracting the
>> maximum possible performance from a querying involving comparing many
>> small geoms against a single large one.  Stay tuned - hopefully we'll
>> have a better story for this shortly.
>>
>> Martin
>>
>> Gregory Williamson wrote:
>>> Martin,
>>>
>>> Sorry for the lack of coherence -- 2530 vertices in this test  
>> polygon
>>> ... it should be in the original posting I sent, an attached zip  
>> file
>>> with two tables; this is the single entry in the oli_req table. The
>>> other table has some 13000 rows of polygons which are either
>>> completely within or overlapping with the large poly.
>>>
>>> If you can't get at that let me know and I'll send you a copy ...
>>>
>>> Greg W.
>>>
>>>
>>> -----Original Message-----
>>> From: postgis-users-bounces at postgis.refractions.net on behalf of
>>> Martin Davis
>>> Sent: Thu 7/5/2007 5:09 PM
>>> To: PostGIS Users Discussion
>>> Subject: Re: [postgis-users] Large geometry issue
>>>
>>> Greg,
>>>
>>> Can you elaborate on your statement "2530 in a sinlge polygon"?   
>> Do you
>>> mean holes or subPolygons in a MultiPolygon?
>>>
>>> Either way, currently JTS is not optimized for working with very  
>> large
>>> numbers of holes and sub-polygons.  As Paul says, we are currently
>>> working on improving this.  Hopefully we can at least match the
>>> performance of the Other DMBS(es) for this situation!
>>>
>>> Would it be possible for me to obtain this geometry for testing?
>>>
>>> Martin
>>>
>>>
>>>
>>> Gregory Williamson wrote:
>>>> Dear peoples,
>>>>
>>>> I have a problem with a query that uses an absurdly large geometry
>>>> (2530 in a single polygon). This is srid -1 (part of a large  
>> test of
>>>> postgres vs some other database product). Everything has been  
>> vacuumed
>>>> and analyzed.
>>>>
>>>> The initial search to find candidates in a target table is  
>> quite fast:
>>>> catest=# select count(*) from wtm_sub_cell w, order_line_item x  
>> WHERE
>>>> x.bbox && w.geometry AND x.id_as_int = 114672;
>>>>  count
>>>> -------
>>>>  13168
>>>> (1 row)
>>>>
>>>> Time: 9.472 ms
>>>>
>>>> Trying to get the list narrowed to geometries that are completely
>>>> contained by the requested shape is slow:
>>>> catest=# select count(*) from wtm_sub_cell w, order_line_item x  
>> WHERE
>>>> x.bbox && w.geometry AND distance(x.geometry,w.geometry) = 0 and
>>>> x.id_as_int = 114672;
>>>>  count
>>>> -------
>>>>   1112
>>>> (1 row)
>>>>
>>>> Time: 69277.780 ms
>>>>
>>>> So I have two questions:
>>>>   a) anything better to use than "distance(x,y) = 0) ? I tried
>>>> st_within -- it is about the same speed but returns no polys,  
>> which is
>>>> strange to me, but I also haven't looked at these in detail  
>> yet. For
>>>> example:
>>>> catest=# select count(*) from wtm_sub_cell w, order_line_item x  
>> WHERE
>>>> x.bbox && w.geometry AND st_within(x.geometry,w.geometry) and
>>>> x.id_as_int = 114672;
>>>>  count
>>>> -------
>>>>      0
>>>> (1 row)
>>>>
>>>> Time: 1173.185 ms
>>>> (same results with st_within(w.geometry,x.geometry):
>>>> catest=# select count(*) from wtm_sub_cell w, order_line_item x  
>> WHERE
>>>> x.bbox && w.geometry AND st_within(w.geometry,x.geometry) and
>>>> x.id_as_int = 114672;
>>>>  count
>>>> -------
>>>>      0
>>>> (1 row)
>>>>
>>>>
>>>>   b) anything I can do to speed things up ? I have tried  
>> boosting work
>>>> mem to 16 megs (from 1) and it made no apparent difference.
>>>>
>>>>
>>>>
>>>> I have a self contained test case that shows the same behavior  
>> -- the
>>>> one large poly and all the candidates in another table.  
>> Apologies for
>>>> the size; hopefully it's not been mangled in the transfers.
>>>>
>>>> Explain analyze of the sample (the sequential is sensible since  
>> there
>>>> is only one row in the table):
>>>> catest=# explain analyze select count(*) from wsc_candidates w,
>>>> oli_req x WHERE w.geometry && x.bbox AND
>>>> distance(w.geometry,x.oli_req_geom) > 0 AND x.oli_req_id = 114672;
>>>>
>>>> QUERY PLAN
>>>>
>>>  
>> ---------------------------------------------------------------------- 
>> ---------------------------------------------------------------------- 
>> -----------
>>>>  Aggregate  (cost=20.28..20.29 rows=1 width=0) (actual
>>>> time=77232.858..77232.859 rows=1 loops=1)
>>>>    ->  Nested Loop  (cost=0.00..9.30 rows=4389 width=0) (actual
>>>> time=6.389..77221.506 rows=12056 loops=1)
>>>>          Join Filter: (distance(w.geometry, x.oli_req_geom) >
>>>> 0::double precision)
>>>>          ->  Seq Scan on oli_req x  (cost=0.00..1.01 rows=1
>>>> width=40602) (actual time=0.007..0.009 rows=1 loops=1)
>>>>                Filter: (oli_req_id = 114672)
>>>>          ->  Index Scan using wsc_c_spatial_ndx on  
>> wsc_candidates w
>>>> (cost=0.00..8.27 rows=1 width=109) (actual time=0.022..25.991
>>>> rows=13168 loops=1)
>>>>                Index Cond: (w.geometry && x.bbox)
>>>>                Filter: (w.geometry && x.bbox)
>>>>  Total runtime: 77232.901 ms
>>>> (9 rows)
>>>>
>>>> Time: 77233.773 ms
>>>>
>>>>
>>>> And for the real thing:
>>>> catest=# explain analyze select count(*) from wtm_sub_cell w,
>>>> order_line_item x WHERE w.geometry && x.bbox AND
>>>> distance(w.geometry,x.geometry) = 0 AND x.id_as_int = 114672;
>>>  
>>>                                                                   
>> QUERY
>>>> PLAN
>>>>
>>>  
>> ---------------------------------------------------------------------- 
>> ----------------------------------------------------------------------
>>>>  Aggregate  (cost=141.83..141.84 rows=1 width=0) (actual
>>>> time=77457.587..77457.588 rows=1 loops=1)
>>>>    ->  Nested Loop  (cost=5.99..141.83 rows=1 width=0) (actual
>>>> time=15.682..77456.541 rows=1112 loops=1)
>>>>          Join Filter: (distance(w.geometry, x.geometry) =  
>> 0::double
>>>> precision)
>>>>          ->  Index Scan using oli_id_ndx on order_line_item x
>>>> (cost=0.00..8.30 rows=1 width=383) (actual time=0.012..0.018  
>> rows=1
>>>> loops=1)
>>>>                Index Cond: (id_as_int = 114672)
>>>>          ->  Bitmap Heap Scan on wtm_sub_cell w   
>> (cost=5.99..132.97
>>>> rows=32 width=109) (actual time=2.988..21.796 rows=13168 loops=1)
>>>>                Filter: (w.geometry && x.bbox)
>>>>                ->  Bitmap Index Scan on wsc_geom_idx1
>>>> (cost=0.00..5.98 rows=32 width=0) (actual time=2.828..2.828  
>> rows=13168
>>>> loops=1)
>>>>                      Index Cond: (w.geometry && x.bbox)
>>>>  Total runtime: 77457.633 ms
>>>> (10 rows)
>>>>
>>>> Time: 77458.458 ms
>>>>
>>>>
>>>> The tables involved by size:
>>>> catest=# select count(*) from wsc_candidates;
>>>>  count
>>>> -------
>>>>  13168
>>>> (1 row)
>>>>
>>>> Time: 2.586 ms
>>>> catest=# select count(*) from oli_req;
>>>>  count
>>>> -------
>>>>      1
>>>> (1 row)
>>>>
>>>> Time: 0.193 ms
>>>> catest=# select count(*) from wtm_sub_cell;
>>>>   count
>>>> ---------
>>>>  6399928
>>>> (1 row)
>>>>
>>>> Time: 1776.508 ms
>>>> catest=# select count(*) from order_line_item;
>>>>  count
>>>> --------
>>>>  395921
>>>> (1 row)
>>>>
>>>> Time: 176.083 ms
>>>>
>>>>
>>>> Many thanks for your time and bandwidth!
>>>>
>>>> Greg Williamson
>>>> Senior DBA
>>>> GlobeXplorer LLC, a DigitalGlobe company
>>>>
>>>> Confidentiality Notice: This e-mail message, including any
>>>> attachments, is for the sole use of the intended recipient(s)  
>> and may
>>>> contain confidential and privileged information and must be  
>> protected
>>>> in accordance with those provisions. Any unauthorized review, use,
>>>> disclosure or distribution is prohibited. If you are not the  
>> intended
>>>> recipient, please contact the sender by reply e-mail and  
>> destroy all
>>>> copies of the original message.
>>>>
>>>> (My corporate masters made me say this.)
>>>>
>>>>  
>> ---------------------------------------------------------------------- 
>> --
>>>> _______________________________________________
>>>> postgis-users mailing list
>>>> postgis-users at postgis.refractions.net
>>>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>>>
>>> --
>>> Martin Davis
>>> Senior Technical Architect
>>> Refractions Research, Inc.
>>> (250) 383-3022
>>>
>>> _______________________________________________
>>> postgis-users mailing list
>>> postgis-users at postgis.refractions.net
>>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>>
>>>  
>> ---------------------------------------------------------------------- 
>> --
>>> _______________________________________________
>>> postgis-users mailing list
>>> postgis-users at postgis.refractions.net
>>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>>
>> --
>> Martin Davis
>> Senior Technical Architect
>> Refractions Research, Inc.
>> (250) 383-3022
>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>
>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-users
> 
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
> 
> 
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users




More information about the postgis-users mailing list