<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META content="text/html; charset=iso-8859-1" http-equiv=Content-Type>
<META name=GENERATOR content="MSHTML 8.00.7601.17537"></HEAD>
<BODY>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=679035720-08032011>Andreas,</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=679035720-08032011></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=679035720-08032011>Try ST_SimplifyPreserveTolerance. The ST_Simplify
often simplifies to nothing or close.</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=679035720-08032011></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=679035720-08032011>Like I mentioned in the other post, it could be some
huge geometries causing your problems. You don't necessarily want to
simplify everything.</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=679035720-08032011></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=679035720-08032011>We usually do a conditional simplify
like</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=679035720-08032011></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=679035720-08032011>CASE WHEN ST_NPoints(geom) > 3000 THEN
ST_SimplyfyPreserveTopology(geom,0.1) ELSE geom END</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=679035720-08032011></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=679035720-08032011></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=679035720-08032011>It probably wouldn't hurt to do a max check on your
tables or a count to see how many have more than n number of
points.</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=679035720-08032011></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=679035720-08032011>SELECT MAX(ST_NPoints(geom)) As biggest, COUNT(CASE
WHEN ST_NPoints(geom) > 3000 THEN 1 ELSE NULL END) as
cnt_big_geoms</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=679035720-08032011>FROM yourtable</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=679035720-08032011></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=679035720-08032011>To get a sense of the largest geometry you are dealing
with.</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=679035720-08032011></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=679035720-08032011>Hope that helps,</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=679035720-08032011>Regina</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Arial><SPAN
class=679035720-08032011><A
href="http://www.postgis.us">http://www.postgis.us</A></SPAN></FONT></DIV><BR>
<DIV dir=ltr lang=en-us class=OutlookMessageHeader align=left>
<HR tabIndex=-1>
<FONT size=2 face=Tahoma><B>From:</B> Andreas Forø Tollefsen
[mailto:andreasft@gmail.com] <BR><B>Sent:</B> Tuesday, March 08, 2011 12:02
PM<BR><B>To:</B> PostGIS Users Discussion<BR><B>Cc:</B> Paragon
Corporation<BR><B>Subject:</B> Re: [postgis-users] Increase query
performance<BR></FONT><BR></DIV>
<DIV></DIV>After a suggestion from pgsql_performance i tried with ST_Simplify to
speed things up.
<DIV>However this gives me a:</DIV>
<DIV>NOTICE: ptarray_simplify returned a <2 pts array</DIV>
<DIV><BR></DIV>
<DIV>Then server connection terminates.<BR>
<DIV><BR></DIV>
<DIV>Like this:
<DIV><BR></DIV>
<DIV>SELECT ST_Intersection(priogrid_land.cell,
ST_Simplify(cshapeswdate.geom,0.1)) AS geom, </DIV>
<DIV>priogrid_land.gid AS divider, gwcode, gwsyear, gweyear, startdate, enddate,
capname, caplong, caplat, col, row, xcoord, ycoord </DIV>
<DIV>FROM priogrid_land, cshapeswdate WHERE ST_Intersects(priogrid_land.cell,
ST_Simplify(cshapeswdate.geom,0.1)) AND cshapeswdate.gwsyear <=1946 AND
cshapeswdate.gweyear >=1946 AND cshapeswdate.startdate <=
'1946/1/1';</DIV>
<DIV><BR></DIV><BR>
<DIV class=gmail_quote>2011/3/8 Andreas Forø Tollefsen <SPAN dir=ltr><<A
href="mailto:andreasft@gmail.com">andreasft@gmail.com</A>></SPAN><BR>
<BLOCKQUOTE
style="BORDER-LEFT: #ccc 1px solid; MARGIN: 0px 0px 0px 0.8ex; PADDING-LEFT: 1ex"
class=gmail_quote>Results from the EXPLAIN ANALYZE:
<DIV><BR></DIV>
<DIV>
<DIV>"Nested Loop (cost=0.00..1189.72 rows=3941 width=87790) (actual
time=7.091..2524830.264 rows=54145 loops=1)"</DIV>
<DIV>" Join Filter: _st_intersects(priogrid_land.cell,
cshapeswdate.geom)"</DIV>
<DIV>" -> Seq Scan on cshapeswdate (cost=0.00..16.23
rows=22 width=87304) (actual time=0.011..0.542 rows=72 loops=1)"</DIV>
<DIV>" Filter: ((gwsyear <= 1946::numeric) AND
(gweyear >= 1946::numeric) AND (startdate <= '1946-01-01'::date))"</DIV>
<DIV>" -> Index Scan using idx_priogrid_land_cell on
priogrid_land (cost=0.00..8.29 rows=1 width=486) (actual
time=3.026..30.152 rows=1338 loops=72)"</DIV>
<DIV>" Index Cond: (priogrid_land.cell &&
cshapeswdate.geom)"</DIV>
<DIV>"Total runtime: 2524889.630 ms"</DIV>
<DIV><BR></DIV><BR>
<DIV class=gmail_quote>2011/3/8 Andreas Forø Tollefsen <SPAN dir=ltr><<A
href="mailto:andreasft@gmail.com"
target=_blank>andreasft@gmail.com</A>></SPAN>
<DIV>
<DIV></DIV>
<DIV class=h5><BR>
<BLOCKQUOTE
style="BORDER-LEFT: #ccc 1px solid; MARGIN: 0px 0px 0px 0.8ex; PADDING-LEFT: 1ex"
class=gmail_quote>Hi,
<DIV>This query takes about 41 minutes per year. Doing this for every year
from 1946 to 2008 takes a lot of time.</DIV>
<DIV>The grid consists of 64818 cell polygons, while the country table has
about 210 polygons.</DIV>
<DIV><BR></DIV>
<DIV>I will add indexes on year and date and try again. An do an explain
analyze.</DIV>
<DIV><BR></DIV>
<DIV>Cheers,</DIV>
<DIV>Andreas<BR><BR>
<DIV class=gmail_quote>2011/3/8 Paragon Corporation <SPAN dir=ltr><<A
href="mailto:lr@pcorp.us" target=_blank>lr@pcorp.us</A>></SPAN><BR>
<BLOCKQUOTE
style="BORDER-LEFT: #ccc 1px solid; MARGIN: 0px 0px 0px 0.8ex; PADDING-LEFT: 1ex"
class=gmail_quote>
<DIV>
<DIV></DIV>
<DIV>
<DIV>
<DIV dir=ltr align=left><SPAN><FONT color=#0000ff size=2
face=Arial>Andreas,</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN><FONT color=#0000ff size=2 face=Arial>I
don't see anything glaringly wrong with your query, but hard to tell
without seeing an explain plan or what you mean by takes a lot of
time. Is a lot of time minutes, hours, days?</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN><FONT color=#0000ff size=2 face=Arial>How
many records are we talking about here? What's the max number of
points you have in any geometry. Often times its just one huge mega
geometry with a 100,000 points or more slowing everything
down.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN><FONT color=#0000ff size=2
face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN><FONT color=#0000ff size=2 face=Arial>Do you
have indexes on your year fields and date columns?</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN><FONT color=#0000ff size=2
face=Arial></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN><FONT color=#0000ff size=2 face=Arial>Hope
that helps,</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN><FONT color=#0000ff size=2 face=Arial>Regina
</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN><FONT color=#0000ff size=2 face=Arial><A
href="http://www.postgis.us"
target=_blank>http://www.postgis.us</A></FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN><FONT color=#0000ff size=2
face=Arial></FONT></SPAN> </DIV><BR>
<DIV dir=ltr lang=en-us align=left>
<HR>
<FONT size=2 face=Tahoma><B>From:</B> <A
href="mailto:postgis-users-bounces@postgis.refractions.net"
target=_blank>postgis-users-bounces@postgis.refractions.net</A> [mailto:<A
href="mailto:postgis-users-bounces@postgis.refractions.net"
target=_blank>postgis-users-bounces@postgis.refractions.net</A>] <B>On
Behalf Of </B>Andreas Forø Tollefsen<BR><B>Sent:</B> Tuesday, March 08,
2011 4:15 AM<BR><B>To:</B> PostGIS Users Discussion<BR><B>Subject:</B>
[postgis-users] Increase query performance<BR></FONT><BR></DIV>
<DIV>
<DIV></DIV>
<DIV>
<DIV></DIV>Hi all,
<DIV><BR></DIV>
<DIV>Another question on postgis query performance. I did some discussion
on the pgsql performance list on how i could increase the performance on
my server.</DIV>
<DIV>I did manage to increase the transactions per second, but came to the
conclusion that this did not help the ST_Intersection query which i was
trying to speed up.</DIV>
<DIV><BR></DIV>
<DIV>Any suggestions on how to speed up this query? Basically i want to
create an intersection between a vector grid and country shapefiles. Then
calculate the area of all the polygons in the intersected table, and
finally selecting the country code for each cell which represents the
largest area within each cell. I does work as it is, but i would like to
increase the speed. The query which takes a lot of time is the
ST_Intersection.</DIV>
<DIV><BR></DIV>
<DIV>"PostgreSQL 8.4.7 on i686-pc-linux-gnu, compiled by GCC gcc-4.4.real
(Ubuntu/Linaro 4.4.4-14ubuntu5) 4.4.5, 32-bit"</DIV>
<DIV><BR></DIV>
<DIV>"POSTGIS="2.0.0SVN" GEOS="3.2.2-CAPI-1.6.2" PROJ="Rel. 4.7.1, 23
September 2009" LIBXML="2.7.7" USE_STATS"</DIV>
<DIV><BR></DIV>
<DIV><BR></DIV>
<DIV>
<DIV>DROP TABLE IF EXISTS cshapesgrid1946;</DIV>
<DIV><BR></DIV>
<DIV>SELECT ST_Intersection(priogrid_land.cell, cshapeswdate.geom) AS
geom, priogrid_land.gid AS divider, gwcode, gwsyear, gweyear, startdate,
enddate, capname, caplong, caplat, col, row, xcoord, ycoord INTO
cshapesgrid1946 FROM priogrid_land, cshapeswdate WHERE
ST_Intersects(priogrid_land.cell, cshapeswdate.geom) AND
cshapeswdate.gwsyear <=1946 AND cshapeswdate.gweyear >=1946 AND
cshapeswdate.startdate <= '1946/1/1';</DIV>
<DIV><BR></DIV>
<DIV>ALTER TABLE cshapesgrid1946 ADD COLUMN area float;</DIV>
<DIV><BR></DIV>
<DIV>UPDATE cshapesgrid1946 SET area =
ST_Area(cshapesgrid1946.geom);</DIV>
<DIV><BR></DIV>
<DIV>DROP TABLE IF EXISTS pg1946;</DIV>
<DIV><BR></DIV>
<DIV>SELECT * INTO pg1946 FROM cshapesgrid1946 a WHERE a.area IN (SELECT
MAX(area) FROM cshapesgrid1946 b GROUP BY divider);</DIV>
<DIV><BR></DIV>
<DIV>CREATE INDEX idx_pg1946 ON pg1946 USING gist(geom);</DIV></DIV>
<DIV><BR></DIV>
<DIV><BR></DIV>
<DIV>Best,</DIV>
<DIV>Andreas</DIV></DIV></DIV></DIV><BR></DIV></DIV>_______________________________________________<BR>postgis-users
mailing list<BR><A href="mailto:postgis-users@postgis.refractions.net"
target=_blank>postgis-users@postgis.refractions.net</A><BR><A
href="http://postgis.refractions.net/mailman/listinfo/postgis-users"
target=_blank>http://postgis.refractions.net/mailman/listinfo/postgis-users</A><BR><BR></BLOCKQUOTE></DIV><BR></DIV></BLOCKQUOTE></DIV></DIV></DIV><BR></DIV></BLOCKQUOTE></DIV><BR></DIV></DIV></BODY></HTML>