<html><head><style type='text/css'>p { margin: 0; }</style></head><body><div style='font-family: Arial; font-size: 10pt; color: #000000'>Well, I don't *think* the lines layer is large, but it's hard for me to say. Here are some stats from the intersecting lines on the original query:<div><br></div><div>Number of lines: 28863</div><div><span style="font-size: 10pt; ">Sum of vertices/points = 296079 </span></div><div><span style="font-size: 10pt; ">Avg number of points per feature = 10.258</span></div><div><span style="font-size: 10pt; "><br></span></div><div><span style="font-size: 10pt; ">Is that considered large or small? </span></div><div><br></div><div>I'd be happy to pass on the streams layer if you're interested in taking a look at it. I've got a 40 MB zipped up shape file that I can send to you or place on a FTP site.<br><div><br></div><div>Shira</div><div><br><hr id="zwchr"><div style="color:#000;font-weight:normal;font-style:normal;text-decoration:none;font-family:Helvetica,Arial,sans-serif;font-size:12pt;"><b>From: </b>"Paul Ramsey" <pramsey@opengeo.org><br><b>To: </b>"PostGIS Users Discussion" <postgis-users@postgis.refractions.net><br><b>Sent: </b>Tuesday, September 18, 2012 11:25:41 AM<br><b>Subject: </b>Re: [postgis-users] How to speed up a spatial intersection where the intersecting polygon is drawn on-the-fly?<br><br>Well, I still feel that 3.5 seconds is too long for pulling and<br>intersecting 20000 features, *assuming* those features are not large<br>(not too many vertices) so if you confirm that the streams are not<br>massive objects I'm out of guesses. I'd be interested to stick your<br>data and query into a profiler and see if there's a spanner in the<br>gears somewhere.<br><br>P.<br><br>On Tue, Sep 18, 2012 at 10:01 AM, Shira Bezalel <shira@sfei.org> wrote:<br>> Hi Paul,<br>><br>> Thanks for the response. It's helpful to have the confirmation that this is<br>> probably about as fast as we're going to get for now.<br>><br>> The shared_buffers parameter is set to 3 GB (total machine RAM = 12 GB) so I<br>> think we're good on that front.<br>><br>> Yes, it's somewhat faster the second time I run the query. Observing the<br>> cache hit ratio, it definitely looks like it is having to pull records off<br>> physical disk the first time. But can we do anything about this? Isn't the<br>> first read always going to be from disk? And the second time from cache? Is<br>> there a way to bring the data into cache ahead of time?<br>><br>> Shira<br>><br>><br>><br>> ________________________________<br>> From: "Paul Ramsey" <pramsey@opengeo.org><br>> To: "PostGIS Users Discussion" <postgis-users@postgis.refractions.net><br>> Sent: Tuesday, September 18, 2012 5:47:36 AM<br>> Subject: Re: [postgis-users] How to speed up a spatial intersection where<br>> the intersecting polygon is drawn on-the-fly?<br>><br>><br>> There are few things left to tweak here...<br>> your streams are (presumably) relatively small features?<br>> your input polygon is very small and simple, so that's not the problem<br>> This is one of the few cases where I actually wonder about PostgreSQL<br>> tuning parameters: is your shared_buffered increased to a reasonable<br>> amount relative to your physical memory (50% of physical memory up to<br>> about 3GB)? If you're having to pull those records off physical disk,<br>> that could slow things down. Does the query run faster the second time<br>> you run it? (see how much caching effects change things)<br>> To the extent that your drawn polygon is non-square, chopping it into<br>> smaller objects will reduce the number of objects that are<br>> fetched-but-not-used. Otherwise, chopping it won't yield any big<br>> improvements, since it's already so simple.<br>> Sorry, not seeing much to tweak,<br>> P.<br>><br>> On Mon, Sep 17, 2012 at 5:57 PM, Shira Bezalel <shira@sfei.org> wrote:<br>>><br>>> I'm trying to determine if the response time we're seeing on a query is<br>>> reasonable or if there's anything we can do to speed it up. Looking for<br>>> advice and/or a reality check.<br>>><br>>> In general, we benefit from dicing our large polygon layers to speed up<br>>> intersection queries (a big thanks to Paul Ramsey for that pointer), but<br>>> the<br>>> catch with this particular query is that it is issued from a web-based GIS<br>>> application where a user draws a (potentially very large) polygon<br>>> on-the-fly<br>>> and then total length calculations are made for intersecting line features<br>>> within this dynamically drawn polygon. I've even tried dicing dynamically,<br>>> but it seems to only add more overall time.<br>>><br>>> Here is the SQL for one particular polygon I drew while testing:<br>>><br>>> SELECT s.legcode, s.strahler, sum ( st_length ( st_intersection (<br>>> the_geom,<br>>> st_transform ( ST_GeomFromText ( 'POLYGON((-13648936.363952<br>>> 4623692.0844833,-13611023.597928 4607181.686376,-13648324.867726<br>>> 4592505.7769473,-13590844.222464 4581498.8448758,-13665446.76206<br>>> 4572326.4014828,-13674007.709226 4607181.686376,-13653828.333762<br>>> 4591282.7844949,-13648936.363952 4623692.0844833))', 900913 ) , 3310 ) ) )<br>>> )<br>>> * 0.00062137119AS miles<br>>> FROM baari_streams s<br>>> WHERE ST_Intersects ( the_geom, st_transform ( ST_GeomFromText (<br>>> 'POLYGON((-13648936.363952 4623692.0844833,-13611023.597928<br>>> 4607181.686376,-13648324.867726 4592505.7769473,-13590844.222464<br>>> 4581498.8448758,-13665446.76206 4572326.4014828,-13674007.709226<br>>> 4607181.686376,-13653828.333762 4591282.7844949,-13648936.363952<br>>> 4623692.0844833))', 900913 ) , 3310 ))<br>>> GROUP BY s.legcode, s.strahler<br>>> ORDER BY s.legcode, s.strahler;<br>>><br>>><br>>><br>>> The explain analyze output:<br>>><br>>> http://explain.depesz.com/s/PNZ<br>>><br>>> The line table has 254833 records. It has a spatial index and the<br>>> optimizer<br>>> is using it. The index is clustered. And I have run vacuum analyze on it.<br>>><br>>> So it takes about 3.5 seconds to return 27 rows (an aggregation of 28863<br>>> rows). Is this about as good as can be expected? Ideally, we'd love to see<br>>> this return in about 1 second or less. Is that unreasonable?<br>>><br>>> Version info:<br>>> PostgreSQL 9.1.2 on x86_64-pc-linux-gnu, compiled by gcc-4.6.real<br>>> (Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1, 64-bit<br>>> POSTGIS="1.5.3" GEOS="3.2.2-CAPI-1.6.2" PROJ="Rel. 4.7.1, 23 September<br>>> 2009"<br>>> LIBXML="2.7.8" USE_STATS<br>>><br>>> Thanks for any insight you can provide.<br>>><br>>> Shira<br>>><br>>> _______________________________________________<br>>> postgis-users mailing list<br>>> postgis-users@postgis.refractions.net<br>>> http://postgis.refractions.net/mailman/listinfo/postgis-users<br>>><br>> _______________________________________________<br>> postgis-users mailing list<br>> postgis-users@postgis.refractions.net<br>> http://postgis.refractions.net/mailman/listinfo/postgis-users<br>><br>><br>> _______________________________________________<br>> postgis-users mailing list<br>> postgis-users@postgis.refractions.net<br>> http://postgis.refractions.net/mailman/listinfo/postgis-users<br>><br>_______________________________________________<br>postgis-users mailing list<br>postgis-users@postgis.refractions.net<br>http://postgis.refractions.net/mailman/listinfo/postgis-users<br></div><br></div></div></div></body></html>