[postgis-users] Large geometry issue

Gregory Williamson Gregory.Williamson at digitalglobe.com
Thu Jul 5 21:53:43 PDT 2007


These are some of the ids that the one comparison found:

gex_runtime=# select w.id_as_int from wsc_candidates w, oli_req x WHERE w.geometry && x.bbox AND contains(x.oli_req_geom,w.geometry) AND x.oli_req_id = 114672 order by 1;
 id_as_int
-----------
   5905594
   5905626
   5905704
   5905705
   5905706
   5905719
   5905725
   5905835
   5905837
   5905838
   5906142
   5906143
   5906146
   5906153
   5906159
   5906159
   5906175
   5906176
<...>
  6139126
   6139417
   6139418
   6139426
   6139427
(241 rows)

GSW



-----Original Message-----
From: postgis-users-bounces at postgis.refractions.net on behalf of Paul Ramsey
Sent: Thu 7/5/2007 10:29 PM
To: PostGIS Users Discussion
Cc: Mark Leslie
Subject: Re: [postgis-users] Large geometry issue
 
That is indeed concerning, can you send us the big shape and a few of  
the small shapes that 2.2.3 is turning up as containments and 3.0.0  
is not?

P

On 5-Jul-07, at 6:11 PM, Gregory Williamson wrote:

> Having has a little time to look at this in detail I can see why  
> this is ugly ... a very long sinuous shape vs a ton of small  
> rectangles, so the bounding box does get a lot of candidates which  
> then have to be winnowed.
>
> I do find myself wondering at this result:
>
>  POSTGIS="1.2.1" GEOS="3.0.0rc4-CAPI-1.3.3" USE_STATS
> select count(*) from wsc_candidates w, oli_req x WHERE w.geometry  
> && x.bbox AND contains(x.oli_req_geom,w.geometry) AND x.oli_req_id  
> = 114672;
>  count
> -------
>      0
> (1 row)
>
> Time: 4347.776 ms
>
> versus
> POSTGIS="1.2.1" GEOS="2.2.3-CAPI-1.1.1" PROJ="Rel. 4.4.9, 29 Oct  
> 2004" USE_STATS
> # select count(*) from wsc_candidates w, oli_req x WHERE w.geometry  
> && x.bbox AND contains(x.oli_req_geom,w.geometry) AND x.oli_req_id  
> = 114672;
>  count
> -------
>    241
> (1 row)
>
> Time: 15956.425 ms
>
> Not the time so much -- the box is a bit older -- but the disparity  
> in rows. 241 seems more likely from eyeballing the data in JUMP ...  
> or have I overlooked something obvious ?
>
> Thanks,
>
> Greg W.
>
>
> -----Original Message-----
> From: postgis-users-bounces at postgis.refractions.net on behalf of  
> Martin Davis
> Sent: Thu 7/5/2007 5:32 PM
> To: PostGIS Users Discussion
> Subject: Re: [postgis-users] Large geometry issue
>
> No problem.  Actually it's me being blind - I just saw the attachment
> with the test data.
>
> 2530 vertices isn't a very big polygon.  So this isn't a problem with
> lots of holes or shells, it simply reveals that we're not  
> extracting the
> maximum possible performance from a querying involving comparing many
> small geoms against a single large one.  Stay tuned - hopefully we'll
> have a better story for this shortly.
>
> Martin
>
> Gregory Williamson wrote:
> >
> > Martin,
> >
> > Sorry for the lack of coherence -- 2530 vertices in this test  
> polygon
> > ... it should be in the original posting I sent, an attached zip  
> file
> > with two tables; this is the single entry in the oli_req table. The
> > other table has some 13000 rows of polygons which are either
> > completely within or overlapping with the large poly.
> >
> > If you can't get at that let me know and I'll send you a copy ...
> >
> > Greg W.
> >
> >
> > -----Original Message-----
> > From: postgis-users-bounces at postgis.refractions.net on behalf of
> > Martin Davis
> > Sent: Thu 7/5/2007 5:09 PM
> > To: PostGIS Users Discussion
> > Subject: Re: [postgis-users] Large geometry issue
> >
> > Greg,
> >
> > Can you elaborate on your statement "2530 in a sinlge polygon"?   
> Do you
> > mean holes or subPolygons in a MultiPolygon?
> >
> > Either way, currently JTS is not optimized for working with very  
> large
> > numbers of holes and sub-polygons.  As Paul says, we are currently
> > working on improving this.  Hopefully we can at least match the
> > performance of the Other DMBS(es) for this situation!
> >
> > Would it be possible for me to obtain this geometry for testing?
> >
> > Martin
> >
> >
> >
> > Gregory Williamson wrote:
> > >
> > > Dear peoples,
> > >
> > > I have a problem with a query that uses an absurdly large geometry
> > > (2530 in a single polygon). This is srid -1 (part of a large  
> test of
> > > postgres vs some other database product). Everything has been  
> vacuumed
> > > and analyzed.
> > >
> > > The initial search to find candidates in a target table is  
> quite fast:
> > > catest=# select count(*) from wtm_sub_cell w, order_line_item x  
> WHERE
> > > x.bbox && w.geometry AND x.id_as_int = 114672;
> > >  count
> > > -------
> > >  13168
> > > (1 row)
> > >
> > > Time: 9.472 ms
> > >
> > > Trying to get the list narrowed to geometries that are completely
> > > contained by the requested shape is slow:
> > > catest=# select count(*) from wtm_sub_cell w, order_line_item x  
> WHERE
> > > x.bbox && w.geometry AND distance(x.geometry,w.geometry) = 0 and
> > > x.id_as_int = 114672;
> > >  count
> > > -------
> > >   1112
> > > (1 row)
> > >
> > > Time: 69277.780 ms
> > >
> > > So I have two questions:
> > >   a) anything better to use than "distance(x,y) = 0) ? I tried
> > > st_within -- it is about the same speed but returns no polys,  
> which is
> > > strange to me, but I also haven't looked at these in detail  
> yet. For
> > > example:
> > > catest=# select count(*) from wtm_sub_cell w, order_line_item x  
> WHERE
> > > x.bbox && w.geometry AND st_within(x.geometry,w.geometry) and
> > > x.id_as_int = 114672;
> > >  count
> > > -------
> > >      0
> > > (1 row)
> > >
> > > Time: 1173.185 ms
> > > (same results with st_within(w.geometry,x.geometry):
> > > catest=# select count(*) from wtm_sub_cell w, order_line_item x  
> WHERE
> > > x.bbox && w.geometry AND st_within(w.geometry,x.geometry) and
> > > x.id_as_int = 114672;
> > >  count
> > > -------
> > >      0
> > > (1 row)
> > >
> > >
> > >   b) anything I can do to speed things up ? I have tried  
> boosting work
> > > mem to 16 megs (from 1) and it made no apparent difference.
> > >
> > >
> > >
> > > I have a self contained test case that shows the same behavior  
> -- the
> > > one large poly and all the candidates in another table.  
> Apologies for
> > > the size; hopefully it's not been mangled in the transfers.
> > >
> > > Explain analyze of the sample (the sequential is sensible since  
> there
> > > is only one row in the table):
> > > catest=# explain analyze select count(*) from wsc_candidates w,
> > > oli_req x WHERE w.geometry && x.bbox AND
> > > distance(w.geometry,x.oli_req_geom) > 0 AND x.oli_req_id = 114672;
> > >
> > > QUERY PLAN
> > >
> >  
> ---------------------------------------------------------------------- 
> ---------------------------------------------------------------------- 
> -----------
> > >  Aggregate  (cost=20.28..20.29 rows=1 width=0) (actual
> > > time=77232.858..77232.859 rows=1 loops=1)
> > >    ->  Nested Loop  (cost=0.00..9.30 rows=4389 width=0) (actual
> > > time=6.389..77221.506 rows=12056 loops=1)
> > >          Join Filter: (distance(w.geometry, x.oli_req_geom) >
> > > 0::double precision)
> > >          ->  Seq Scan on oli_req x  (cost=0.00..1.01 rows=1
> > > width=40602) (actual time=0.007..0.009 rows=1 loops=1)
> > >                Filter: (oli_req_id = 114672)
> > >          ->  Index Scan using wsc_c_spatial_ndx on  
> wsc_candidates w
> > > (cost=0.00..8.27 rows=1 width=109) (actual time=0.022..25.991
> > > rows=13168 loops=1)
> > >                Index Cond: (w.geometry && x.bbox)
> > >                Filter: (w.geometry && x.bbox)
> > >  Total runtime: 77232.901 ms
> > > (9 rows)
> > >
> > > Time: 77233.773 ms
> > >
> > >
> > > And for the real thing:
> > > catest=# explain analyze select count(*) from wtm_sub_cell w,
> > > order_line_item x WHERE w.geometry && x.bbox AND
> > > distance(w.geometry,x.geometry) = 0 AND x.id_as_int = 114672;
> >  
> >                                                                   
> QUERY
> > > PLAN
> > >
> >  
> ---------------------------------------------------------------------- 
> ----------------------------------------------------------------------
> > >  Aggregate  (cost=141.83..141.84 rows=1 width=0) (actual
> > > time=77457.587..77457.588 rows=1 loops=1)
> > >    ->  Nested Loop  (cost=5.99..141.83 rows=1 width=0) (actual
> > > time=15.682..77456.541 rows=1112 loops=1)
> > >          Join Filter: (distance(w.geometry, x.geometry) =  
> 0::double
> > > precision)
> > >          ->  Index Scan using oli_id_ndx on order_line_item x
> > > (cost=0.00..8.30 rows=1 width=383) (actual time=0.012..0.018  
> rows=1
> > > loops=1)
> > >                Index Cond: (id_as_int = 114672)
> > >          ->  Bitmap Heap Scan on wtm_sub_cell w   
> (cost=5.99..132.97
> > > rows=32 width=109) (actual time=2.988..21.796 rows=13168 loops=1)
> > >                Filter: (w.geometry && x.bbox)
> > >                ->  Bitmap Index Scan on wsc_geom_idx1
> > > (cost=0.00..5.98 rows=32 width=0) (actual time=2.828..2.828  
> rows=13168
> > > loops=1)
> > >                      Index Cond: (w.geometry && x.bbox)
> > >  Total runtime: 77457.633 ms
> > > (10 rows)
> > >
> > > Time: 77458.458 ms
> > >
> > >
> > > The tables involved by size:
> > > catest=# select count(*) from wsc_candidates;
> > >  count
> > > -------
> > >  13168
> > > (1 row)
> > >
> > > Time: 2.586 ms
> > > catest=# select count(*) from oli_req;
> > >  count
> > > -------
> > >      1
> > > (1 row)
> > >
> > > Time: 0.193 ms
> > > catest=# select count(*) from wtm_sub_cell;
> > >   count
> > > ---------
> > >  6399928
> > > (1 row)
> > >
> > > Time: 1776.508 ms
> > > catest=# select count(*) from order_line_item;
> > >  count
> > > --------
> > >  395921
> > > (1 row)
> > >
> > > Time: 176.083 ms
> > >
> > >
> > > Many thanks for your time and bandwidth!
> > >
> > > Greg Williamson
> > > Senior DBA
> > > GlobeXplorer LLC, a DigitalGlobe company
> > >
> > > Confidentiality Notice: This e-mail message, including any
> > > attachments, is for the sole use of the intended recipient(s)  
> and may
> > > contain confidential and privileged information and must be  
> protected
> > > in accordance with those provisions. Any unauthorized review, use,
> > > disclosure or distribution is prohibited. If you are not the  
> intended
> > > recipient, please contact the sender by reply e-mail and  
> destroy all
> > > copies of the original message.
> > >
> > > (My corporate masters made me say this.)
> > >
> > >  
> ---------------------------------------------------------------------- 
> --
> > >
> > > _______________________________________________
> > > postgis-users mailing list
> > > postgis-users at postgis.refractions.net
> > > http://postgis.refractions.net/mailman/listinfo/postgis-users
> > >
> >
> > --
> > Martin Davis
> > Senior Technical Architect
> > Refractions Research, Inc.
> > (250) 383-3022
> >
> > _______________________________________________
> > postgis-users mailing list
> > postgis-users at postgis.refractions.net
> > http://postgis.refractions.net/mailman/listinfo/postgis-users
> >
> >  
> ---------------------------------------------------------------------- 
> --
> >
> > _______________________________________________
> > postgis-users mailing list
> > postgis-users at postgis.refractions.net
> > http://postgis.refractions.net/mailman/listinfo/postgis-users
> >
>
> --
> Martin Davis
> Senior Technical Architect
> Refractions Research, Inc.
> (250) 383-3022
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users

_______________________________________________
postgis-users mailing list
postgis-users at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20070705/71042444/attachment.html>


More information about the postgis-users mailing list