[postgis-users] Does anyone have multipolygons with large numbersof polygons

Fri Apr 18 11:20:17 PDT 2008

Obe, Regina wrote:
> Not sure the answer to your questions since I don't know much about 
> the internals of PostgreSQL.  I always assumed by ref for large objects. 
>  
> We could try to force it to calculate the numpoints call only once and 
> see if it makes a difference in timings
>  
> Something of the form
>  
> SELECT pointn(t.the_geom, g.n) As poly
> FROM (SELECT the_geom, numpoints(the_geom) As np FROM kneufeld.tmp) t 
> CROSS JOIN pgis_gs g
> WHERE g.n <= np;
>  

EXPLAIN ANALYZE
SELECT pointn(t.the_geom, g.n) As poly
FROM (SELECT the_geom, numpoints(the_geom) As np FROM kneufeld.tmp) t
  CROSS JOIN pgis_gs g
WHERE g.n <= np;
                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..491.02 rows=10000 width=339665) (actual 
time=7.498..78410.522 rows=21227 loops=1)
   ->  Seq Scan on tmp  (cost=0.00..1.01 rows=1 width=339661) (actual 
time=0.009..0.016 rows=1 loops=1)
   ->  Index Scan using pgis_gs_pkey on pgis_gs g  (cost=0.00..315.01 
rows=10000 width=4) (actual time=0.048..222.032 rows=21227 loops=1)
         Index Cond: (g.n <= numpoints(tmp.the_geom))
 Total runtime: 78518.345 ms
 Total runtime: 79292.140 ms
Time: 78779.260 ms

Nope... didn't work. The query planner came up with the same query (for 
the 20000 point linestring).
-- This is our first attempt
EXPLAIN ANALYZE
SELECT pointn(t.the_geom, g.n) As poly
FROM kneufeld.tmp t
  CROSS JOIN pgis_gs g
WHERE g.n <= numpoints(the_geom);
                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..491.02 rows=10000 width=339665) (actual 
time=7.472..78697.168 rows=21227 loops=1)
   ->  Seq Scan on tmp t  (cost=0.00..1.01 rows=1 width=339661) (actual 
time=0.011..0.016 rows=1 loops=1)
   ->  Index Scan using pgis_gs_pkey on pgis_gs g  (cost=0.00..315.01 
rows=10000 width=4) (actual time=0.047..211.865 rows=21227 loops=1)
         Index Cond: (g.n <= numpoints(t.the_geom))
 Total runtime: 78805.605 ms

And this also didn't work...

CREATE TABLE kneufeld.with_count AS
  SELECT the_geom, numpoints(the_geom) As np FROM kneufeld.tmp;

EXPLAIN ANALYZE
SELECT pointn(t.the_geom, g.n) As poly
FROM kneufeld.with_count t                                          
  CROSS JOIN pgis_gs g
WHERE g.n <= np;

                                                                QUERY 
PLAN                                                               
------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..466.01 rows=10000 width=339665) (actual 
time=3.764..78382.753 rows=21227 loops=1)
   ->  Seq Scan on with_count t  (cost=0.00..1.01 rows=1 width=339665) 
(actual time=0.008..0.013 rows=1 loops=1)
   ->  Index Scan using pgis_gs_pkey on pgis_gs g  (cost=0.00..315.00 
rows=10000 width=4) (actual time=0.048..215.288 rows=21227 loops=1)
         Index Cond: (g.n <= t.np)
 Total runtime: 78487.864 ms
 Total runtime: 78396.653 ms

Any other thoughts?
>  
> Also why are the timings for
> -- A linestring with 10532 points 
>  
> exactly the same in both cases or was that a bad cut and paste job :)
:) Good catch.. yes bad cut and paste.  I ran again and got these values.
 Total runtime: 18541.358 ms
 Total runtime: 18636.428 ms
 Total runtime: 18630.422 ms
>  
> Thanks,
> Regina
>  
Cheers,
Kevin