[postgis-users] How to speed up a spatial intersection where the intersecting polygon is drawn on-the-fly?
Paul Ramsey
pramsey at opengeo.org
Tue Sep 18 11:25:41 PDT 2012
Well, I still feel that 3.5 seconds is too long for pulling and
intersecting 20000 features, *assuming* those features are not large
(not too many vertices) so if you confirm that the streams are not
massive objects I'm out of guesses. I'd be interested to stick your
data and query into a profiler and see if there's a spanner in the
gears somewhere.
P.
On Tue, Sep 18, 2012 at 10:01 AM, Shira Bezalel <shira at sfei.org> wrote:
> Hi Paul,
>
> Thanks for the response. It's helpful to have the confirmation that this is
> probably about as fast as we're going to get for now.
>
> The shared_buffers parameter is set to 3 GB (total machine RAM = 12 GB) so I
> think we're good on that front.
>
> Yes, it's somewhat faster the second time I run the query. Observing the
> cache hit ratio, it definitely looks like it is having to pull records off
> physical disk the first time. But can we do anything about this? Isn't the
> first read always going to be from disk? And the second time from cache? Is
> there a way to bring the data into cache ahead of time?
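> One way to warm the cache ahead of time (a hypothetical sketch; note the pg_prewarm extension only exists in PostgreSQL 9.4+, so on 9.1 a throwaway full scan of the table is the usual workaround) might be:
>
> ```sql
> -- Pull the table's pages into the buffer cache with a cheap full scan.
> -- On 9.1 count(*) forces a sequential scan that reads every heap page.
> SELECT count(*) FROM baari_streams;
>
> -- On PostgreSQL 9.4+ the pg_prewarm extension does this directly:
> -- CREATE EXTENSION pg_prewarm;
> -- SELECT pg_prewarm('baari_streams');
> ```
>
> This only helps if shared_buffers (and the OS cache) are large enough to hold the table between the warm-up and the real query.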
>
> Shira
>
>
>
> ________________________________
> From: "Paul Ramsey" <pramsey at opengeo.org>
> To: "PostGIS Users Discussion" <postgis-users at postgis.refractions.net>
> Sent: Tuesday, September 18, 2012 5:47:36 AM
> Subject: Re: [postgis-users] How to speed up a spatial intersection where
> the intersecting polygon is drawn on-the-fly?
>
>
> There are a few things left to tweak here...
> your streams are (presumably) relatively small features?
> your input polygon is very small and simple, so that's not the problem
> This is one of the few cases where I actually wonder about PostgreSQL
> tuning parameters: is your shared_buffers increased to a reasonable
> amount relative to your physical memory (50% of physical memory, up to
> about 3GB)? If you're having to pull those records off physical disk,
> that could slow things down. Does the query run faster the second time
> you run it? (That shows how much caching changes things.)
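> To check the setting and the table's buffer-cache hit ratio, something like this (a sketch using the standard pg_statio statistics view; the table name is from the query below):
>
> ```sql
> SHOW shared_buffers;
>
> -- Rough buffer-cache hit ratio for the streams table:
> -- blocks found in shared_buffers vs. blocks read from disk/OS cache.
> SELECT heap_blks_read, heap_blks_hit,
>        round(heap_blks_hit::numeric /
>              nullif(heap_blks_hit + heap_blks_read, 0), 3) AS hit_ratio
> FROM pg_statio_user_tables
> WHERE relname = 'baari_streams';
> ```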
> To the extent that your drawn polygon is non-square, chopping it into
> smaller objects will reduce the number of objects that are
> fetched-but-not-used. Otherwise, chopping it won't yield any big
> improvements, since it's already so simple.
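> One way to chop the drawn polygon into pieces (a hypothetical sketch; ST_Subdivide only arrived in PostGIS 2.2, so on 1.5 you can clip against a coarse grid built from generate_series and ST_MakeEnvelope; :poly stands in for the drawn polygon, already transformed to SRID 3310):
>
> ```sql
> -- Clip :poly against a 4x4 grid over its bounding box,
> -- yielding up to 16 smaller pieces to intersect one at a time.
> WITH bounds AS (
>   SELECT ST_XMin(:poly) AS x0, ST_YMin(:poly) AS y0,
>          (ST_XMax(:poly) - ST_XMin(:poly)) / 4 AS dx,
>          (ST_YMax(:poly) - ST_YMin(:poly)) / 4 AS dy
> )
> SELECT ST_Intersection(:poly,
>          ST_MakeEnvelope(x0 + i*dx, y0 + j*dy,
>                          x0 + (i+1)*dx, y0 + (j+1)*dy, 3310)) AS piece
> FROM bounds,
>      generate_series(0, 3) AS i,
>      generate_series(0, 3) AS j
> WHERE ST_Intersects(:poly,
>         ST_MakeEnvelope(x0 + i*dx, y0 + j*dy,
>                         x0 + (i+1)*dx, y0 + (j+1)*dy, 3310));
> ```
>
> As noted above, this mostly pays off for non-square shapes, where whole grid cells fall outside the polygon and their features are never fetched.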
> Sorry, not seeing much to tweak,
> P.
>
> On Mon, Sep 17, 2012 at 5:57 PM, Shira Bezalel <shira at sfei.org> wrote:
>>
>> I'm trying to determine if the response time we're seeing on a query is
>> reasonable or if there's anything we can do to speed it up. Looking for
>> advice and/or a reality check.
>>
>> In general, we benefit from dicing our large polygon layers to speed up
>> intersection queries (a big thanks to Paul Ramsey for that pointer), but
>> the
>> catch with this particular query is that it is issued from a web-based GIS
>> application where a user draws a (potentially very large) polygon
>> on-the-fly
>> and then total length calculations are made for intersecting line features
>> within this dynamically drawn polygon. I've even tried dicing dynamically,
>> but it seems to only add more overall time.
>>
>> Here is the SQL for one particular polygon I drew while testing:
>>
>> SELECT s.legcode, s.strahler, sum ( st_length ( st_intersection (
>> the_geom,
>> st_transform ( ST_GeomFromText ( 'POLYGON((-13648936.363952
>> 4623692.0844833,-13611023.597928 4607181.686376,-13648324.867726
>> 4592505.7769473,-13590844.222464 4581498.8448758,-13665446.76206
>> 4572326.4014828,-13674007.709226 4607181.686376,-13653828.333762
>> 4591282.7844949,-13648936.363952 4623692.0844833))', 900913 ) , 3310 ) ) )
>> )
>> * 0.00062137119 AS miles
>> FROM baari_streams s
>> WHERE ST_Intersects ( the_geom, st_transform ( ST_GeomFromText (
>> 'POLYGON((-13648936.363952 4623692.0844833,-13611023.597928
>> 4607181.686376,-13648324.867726 4592505.7769473,-13590844.222464
>> 4581498.8448758,-13665446.76206 4572326.4014828,-13674007.709226
>> 4607181.686376,-13653828.333762 4591282.7844949,-13648936.363952
>> 4623692.0844833))', 900913 ) , 3310 ))
>> GROUP BY s.legcode, s.strahler
>> ORDER BY s.legcode, s.strahler;
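>> A variant that builds the transformed polygon once instead of repeating the WKT literal twice (a sketch; the plan should be essentially the same, it just keeps the two references in sync — the '...' stands for the coordinate list above):
>>
>> ```sql
>> WITH poly AS (
>>   SELECT st_transform(
>>            ST_GeomFromText('POLYGON((...))', 900913), 3310) AS geom
>> )
>> SELECT s.legcode, s.strahler,
>>        sum(st_length(st_intersection(s.the_geom, poly.geom)))
>>          * 0.00062137119 AS miles
>> FROM baari_streams s, poly
>> WHERE ST_Intersects(s.the_geom, poly.geom)
>> GROUP BY s.legcode, s.strahler
>> ORDER BY s.legcode, s.strahler;
>> ```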
>>
>>
>>
>> The explain analyze output:
>>
>> http://explain.depesz.com/s/PNZ
>>
>> The line table has 254833 records. It has a spatial index and the
>> optimizer
>> is using it. The index is clustered. And I have run vacuum analyze on it.
>>
>> So it takes about 3.5 seconds to return 27 rows (an aggregation of 28863
>> rows). Is this about as good as can be expected? Ideally, we'd love to see
>> this return in about 1 second or less. Is that unreasonable?
>>
>> Version info:
>> PostgreSQL 9.1.2 on x86_64-pc-linux-gnu, compiled by gcc-4.6.real
>> (Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1, 64-bit
>> POSTGIS="1.5.3" GEOS="3.2.2-CAPI-1.6.2" PROJ="Rel. 4.7.1, 23 September
>> 2009"
>> LIBXML="2.7.8" USE_STATS
>>
>> Thanks for any insight you can provide.
>>
>> Shira
>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>