[postgis-users] A bit of advice would be greatly appreciated

Paul Ramsey pramsey at refractions.net
Wed Apr 5 19:13:55 PDT 2006


Jim,

You have a slightly older GEOS, and you are running an intersection 
(), so it is possible you have found an infinite loop bug of some  
sort.  I would suggest an upgrade to the 2.2 GEOS series as a first  
easy step.  Your WHERE clause seems fine.  I am surprised you are  
running transform() on one of your geometries prior to area()  
calculation, but not on the results of your intersection(). It would  
seem that if you have to do it for b.the_geom you would have to do it  
for intersection(a.the_geom,b.the_geom).  That is just an aside,  
probably not your problem.

Oh, reading your explain, I am not seeing an index scan in there, so  
possibly the join selectivity is join considered high enough...  
upgrading your postgis to the 1.1 series adds some smarter join  
selectivity code, which might help your plan.

Whenever I read queries like this I think about adding raster  
processing... :)

P

On 5-Apr-06, at 2:30 PM, James G Wilkinson wrote:

>
>> You probably need to post the results of
>>
>> select postgis_full_version();
>>
>> and the explain on the select you posted below.
> Whoops, I should have realized that this would be helpful.  Of  
> note, I realize
> that I a behind on versions here.  Thanks for taking a look at this.
>
> Regards,
>
> Jim
>
> beta4=# select postgis_full_version();
>                                          postgis_full_version
> ---------------------------------------------------------------------- 
> --------------------------------
>  POSTGIS="0.9.1" GEOS="2.1.1" PROJ="Rel. 4.4.9, 29 Oct 2004"  
> USE_STATS DBPROC="0.0.1" RELPROC="0.0.1"
> (1 row)
>
> beta4=# EXPLAIN SELECT a.icell, a.jcell, SUM(AREA(INTERSECTION 
> (a.the_geom,b.the_geom)) /
>         b.area * area_fraction * AREA(TRANSFORM(b.the_geom,32768)))  
> as area
>         from regular_grid a, plant_area b
>         WHERE b.the_geom && a.the_geom and
>         DISTANCE(a.the_geom,b.the_geom) = 0
>         GROUP BY a.icell, a.jcell
>         ORDER BY a.icell, a.jcell;
>                                             QUERY PLAN
> ---------------------------------------------------------------------- 
> ---------------------------------------------------------------
> GroupAggregate  (cost=7834473491.51..7835087937.70 rows=16352  
> width=543)
> ->  Sort  (cost=7834473491.51..7834627021.30 rows=61411915 width=543)
>     Sort Key: a.icell, a.jcell
>     ->  Nested Loop  (cost=1344.52..7697000461.50 rows=61411915  
> width=543)
>         Join Filter: (("outer".the_geom && "inner".the_geom) AND  
> (distance("inner".the_geom, "outer".the_geom) = 0::double precision))
>         ->  Seq Scan on plant_area b  (cost=0.00..379664.42  
> rows=7511242 width=287)
>         ->  Materialize  (cost=1344.52..2083.04 rows=16352 width=256)
>             ->  Seq Scan on regular_grid a  (cost=0.00..769.52  
> rows=16352 width=256)
> (8 rows)
>
>>
>>
>> James G Wilkinson wrote:
>>> All,
>>>
>>> I have been monitoring this forum for some time, and it has been
>>> very helpful in my efforts.  I have run into a problem that I simply
>>> cannot seem to get a handle on.  I have two spatial data sets:
>>> regular_grid (gid integer, area double, icell integer, jcell  
>>> integer,
>>> the_geom geometry) and plant_area (gid integer, area double,
>>> area_fraction double, the_geom geometry).  regular_grid represents
>>> a 36 km by 36 km set of grid cells (polygons) that covers the
>>> continental United States and is about 10 MB in size.  plant_area is
>>> a polygon coverage that covers North America which contains
>>> the area covered by a single plant species and is about 6 GB in
>>> size (each polygon in plant_area is roughly on the order of
>>> 4,000,000 square meters).  Both coverages are in LAT/LON
>>> coordinates.  I am trying to perform the following:
>>>
>>> CREATE TABLE intersect_results AS
>>>   SELECT a.icell, a.jcell, SUM(AREA(INTERSECTION 
>>> (a.the_geom,b.the_geom)) /
>>>                            b.area * area_fraction *
>>>                            AREA(TRANSFORM(b.the_geom,32768)))) as  
>>> area
>>>   FROM regular_grid a, plant_area b
>>>   WHERE b.the_geom && a.the_geom AND
>>>                  DISTANCE(a.the_geom,b.the_geom) = 0
>>>   GROUP BY a.icell, a.jcell
>>>   ORDER BY a.icell, a.jcell;
>>>
>>> The TRANSFORM to 32768 gives me the area in square meters
>>> in a Lambert Conformal Conic projection that I routinely use in my
>>> work.  Both tables are indexed using GIST on the geometries.  I have
>>> run VACUUM ANALYZE.  I aborted the query on my Opteron
>>> machine after 24 hours of CPU time (it was the only thing running).
>>>
>>> I have not spent anytime trying to diagnose the query plan results
>>> as I am almost completely ignorant of how to use them -- and maybe
>>> this is the time to fix that gap in my knowledge.  Regardless, I  
>>> am holding
>>> out some hope that some expert on this list can provide some  
>>> advice on
>>> how better to render this query.
>>>
>>> Regards,
>>> terrakit
>>> _______________________________________________
>>> postgis-users mailing list
>>> postgis-users at postgis.refractions.net
>>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>>
>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>
>>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users




More information about the postgis-users mailing list