[postgis-users] Increase query performance

Paragon Corporation lr at pcorp.us
Tue Mar 8 13:02:22 PST 2011


Andreas,
 
Try ST_SimplifyPreserveTolerance.  The ST_Simplify often simplifies to
nothing or close.
 
Like I mentioned in the other post, it could be some huge geometries causing
your problems.  You don't necessarily want to simplify everything.
 
We usually do a conditional simplify like
 
CASE WHEN ST_NPoints(geom) > 3000 THEN ST_SimplyfyPreserveTopology(geom,0.1)
ELSE geom END
 
 
It probably wouldn't hurt to do a max check on your tables or a count to see
how many have more than n number of points.
 
SELECT MAX(ST_NPoints(geom)) As biggest, COUNT(CASE WHEN ST_NPoints(geom) >
3000 THEN 1 ELSE NULL END) as cnt_big_geoms
FROM yourtable
 
To get a sense of the largest geometry you are dealing with.
 
Hope that helps,
Regina
http://www.postgis.us

  _____  

From: Andreas Forø Tollefsen [mailto:andreasft at gmail.com] 
Sent: Tuesday, March 08, 2011 12:02 PM
To: PostGIS Users Discussion
Cc: Paragon Corporation
Subject: Re: [postgis-users] Increase query performance


After a suggestion from pgsql_performance i tried with ST_Simplify to speed
things up. 
However this gives me a:
NOTICE: ptarray_simplify returned a <2 pts array

Then server connection terminates.


Like this:  

SELECT ST_Intersection(priogrid_land.cell,
ST_Simplify(cshapeswdate.geom,0.1)) AS geom, 
priogrid_land.gid AS divider, gwcode, gwsyear, gweyear, startdate, enddate,
capname, caplong, caplat, col, row, xcoord, ycoord 
FROM priogrid_land, cshapeswdate WHERE ST_Intersects(priogrid_land.cell,
ST_Simplify(cshapeswdate.geom,0.1)) AND cshapeswdate.gwsyear <=1946 AND
cshapeswdate.gweyear >=1946 AND cshapeswdate.startdate <= '1946/1/1';


2011/3/8 Andreas Forø Tollefsen <andreasft at gmail.com>


Results from the EXPLAIN ANALYZE: 

"Nested Loop  (cost=0.00..1189.72 rows=3941 width=87790) (actual
time=7.091..2524830.264 rows=54145 loops=1)"
"  Join Filter: _st_intersects(priogrid_land.cell, cshapeswdate.geom)"
"  ->  Seq Scan on cshapeswdate  (cost=0.00..16.23 rows=22 width=87304)
(actual time=0.011..0.542 rows=72 loops=1)"
"        Filter: ((gwsyear <= 1946::numeric) AND (gweyear >= 1946::numeric)
AND (startdate <= '1946-01-01'::date))"
"  ->  Index Scan using idx_priogrid_land_cell on priogrid_land
(cost=0.00..8.29 rows=1 width=486) (actual time=3.026..30.152 rows=1338
loops=72)"
"        Index Cond: (priogrid_land.cell && cshapeswdate.geom)"
"Total runtime: 2524889.630 ms"


2011/3/8 Andreas Forø Tollefsen <andreasft at gmail.com> 


Hi, 
This query takes about 41 minutes per year. Doing this for every year from
1946 to 2008 takes a lot of time.
The grid consists of 64818 cell polygons, while the country table has about
210 polygons.

I will add indexes on year and date and try again. An do an explain analyze.

Cheers,
Andreas


2011/3/8 Paragon Corporation <lr at pcorp.us>


Andreas,
I don't see anything glaringly wrong with your query, but hard to tell
without seeing an explain plan or what you mean by takes a lot of time.  Is
a lot of time minutes, hours, days?
How many records are we talking about here?  What's the max number of points
you have in any geometry.  Often times its just one huge mega geometry with
a 100,000 points or more slowing everything down.
 
Do you have indexes on your year fields and date columns?
 
Hope that helps,
Regina 
http://www.postgis.us
 

  _____  

From: postgis-users-bounces at postgis.refractions.net
[mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of Andreas
Forø Tollefsen
Sent: Tuesday, March 08, 2011 4:15 AM
To: PostGIS Users Discussion
Subject: [postgis-users] Increase query performance


Hi all, 

Another question on postgis query performance. I did some discussion on the
pgsql performance list on how i could increase the performance on my server.
I did manage to increase the transactions per second, but came to the
conclusion that this did not help the ST_Intersection query which i was
trying to speed up.

Any suggestions on how to speed up this query? Basically i want to create an
intersection between a vector grid and country shapefiles. Then calculate
the area of all the polygons in the intersected table, and finally selecting
the country code for each cell which represents the largest area within each
cell. I does work as it is, but i would like to increase the speed. The
query which takes a lot of time is the ST_Intersection.

"PostgreSQL 8.4.7 on i686-pc-linux-gnu, compiled by GCC gcc-4.4.real
(Ubuntu/Linaro 4.4.4-14ubuntu5) 4.4.5, 32-bit"

"POSTGIS="2.0.0SVN" GEOS="3.2.2-CAPI-1.6.2" PROJ="Rel. 4.7.1, 23 September
2009" LIBXML="2.7.7" USE_STATS"


DROP TABLE IF EXISTS cshapesgrid1946;

SELECT ST_Intersection(priogrid_land.cell, cshapeswdate.geom) AS geom,
priogrid_land.gid AS divider, gwcode, gwsyear, gweyear, startdate, enddate,
capname, caplong, caplat, col, row, xcoord, ycoord INTO cshapesgrid1946 FROM
priogrid_land, cshapeswdate WHERE ST_Intersects(priogrid_land.cell,
cshapeswdate.geom) AND cshapeswdate.gwsyear <=1946 AND cshapeswdate.gweyear
>=1946 AND cshapeswdate.startdate <= '1946/1/1';

ALTER TABLE cshapesgrid1946 ADD COLUMN area float;

UPDATE cshapesgrid1946 SET area = ST_Area(cshapesgrid1946.geom);

DROP TABLE IF EXISTS pg1946;

SELECT * INTO pg1946 FROM cshapesgrid1946 a WHERE a.area IN (SELECT
MAX(area) FROM cshapesgrid1946 b GROUP BY divider);

CREATE INDEX idx_pg1946 ON pg1946 USING gist(geom);


Best,
Andreas

_______________________________________________
postgis-users mailing list
postgis-users at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users






-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20110308/7e583353/attachment.html>


More information about the postgis-users mailing list