[postgis-users] query tuning

Paragon Corporation lr at pcorp.us
Thu Jul 31 15:27:59 PDT 2008


Paul is right that clustering does an exclusive lock on the table.  Its my
understanding though that a partially clustered table is better than a
completely random ordered table.  The reason being that when the scanner is
picking up matches it does it by page and the more matches it can find on
each page, the fewer pages it has to load up.

Also as I recall, I think each successive cluster is less taxing if you set
your fill factors to lower than 100 as it will try to use the free space to
move things in and since the table is partially clustered already, there is
less work that needs to be done.  (I could be very wrong on the last
assumption though).  There are also talks about revising the clustering
strategy to be more in line with SQL Server (SQL Server maintains cluster on
updates).  Although I suppose we won't see this until probably 8.5 (possibly
8.4).

Regina 

-----Original Message-----
From: postgis-users-bounces at postgis.refractions.net
[mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of Paul
Ramsey
Sent: Thursday, July 31, 2008 5:13 PM
To: PostGIS Users Discussion
Subject: Re: [postgis-users] query tuning

Correct, clustering locks up the table while in process, so it's a pain for
changing data, also the new data doesn't end up clustered, just tacked onto
the end.

BTW, your shared buffers of 100MB for a 16GB box is stingy. Give it 2-4GB,
be generous.

Not sure why you're having performance issues though... seems like simple
stuff.

P

On Thu, Jul 31, 2008 at 1:59 PM, marc2112 <postgis at mbreslow.net> wrote:
> It's a pretty diesel box - 2 quad core xeons, 16 GB RAM, dedicated 
> disks for data and txnlog.
>
> It's my understanding that a clustered index only helps when the data 
> isn't really changing.  I read that it clusters (basically sorts the 
> data per the
> index) when you create the index but subsequent inserts/updates still 
> get written on new pages.  Am I off there?
>
> On Thu, Jul 31, 2008 at 4:46 PM, Paragon Corporation <lr at pcorp.us> wrote:
>>
>> It would probably help if you clustered on the gist index.
>>
>> Do
>>
>> ALTER TABLE users CLUSTER ON users_locbbox; CLUSTER users;
>>
>> Also what kind of processors and on board ram do you have?
>>
>> Hope that helps,
>> Regina
>> ________________________________
>> From: postgis-users-bounces at postgis.refractions.net
>> [mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of 
>> marc2112
>> Sent: Thursday, July 31, 2008 3:31 PM
>> To: postgis-users at postgis.refractions.net
>> Subject: [postgis-users] query tuning
>>
>> Hi Folks,
>>
>> I've got a slow running query.  When I break it down to just the 
>> spatial constraint on one table, I find that the query takes 8s.  I 
>> need my broader query to take < 1s so have a long way to go on just the
spatial part.
>> Hopefully you guys will have some ideas for me...
>>
>> Here's what I've got:
>>
>> Query:
>> SELECT u.username,u.locbbox, y(u.locpoint) AS latitude, x(u.locpoint) 
>> AS longitude FROM users u WHERE u.locbbox && SetSrid( 
>> 'BOX(-73.795166015625
>> 9.42738628387451,-69.49951171875 11.813588142395)'::box2d, 4326 )
>>
>> Explain Analyze:
>> "Bitmap Heap Scan on users u  (cost=741.34..43666.77 rows=15789 
>> width=486) (actual time=29.730..7321.358 rows=10687 loops=1)"
>> "  Filter: (locbbox &&
>>
'0103000020E6100000010000000500000000000000E47252C000000060D2DA224000000000E
47252C0000000A08EA0274000000000F85F51C0000000A08EA0274000000000F85F51C000000
060D2DA224000000000E47252C000000060D2DA2240'::geometry)"
>> "  ->  Bitmap Index Scan on users_locbbox  (cost=0.00..737.39 
>> rows=15789
>> width=0) (actual time=26.981..26.981 rows=10703 loops=1)"
>> "        Index Cond: (locbbox &&
>>
'0103000020E6100000010000000500000000000000E47252C000000060D2DA224000000000E
47252C0000000A08EA0274000000000F85F51C0000000A08EA0274000000000F85F51C000000
060D2DA224000000000E47252C000000060D2DA2240'::geometry)"
>> "Total runtime: 7328.359 ms"
>>
>>
>> Index:
>> CREATE INDEX users_locbbox
>>   ON users
>>   USING gist
>>   (locbbox);
>>
>> Some Stats (let me know what else would be helpful):
>> Rows in users table: 1.85mm
>> Table Size: ~4GB
>> users_locbbox index size: 190MB
>>
>> Tuning done:
>> shared_buffers = 100MB
>> work_mem = 100MB
>> max_fsm_pages = 153600
>> random_page_cost = 3.0
>> cpu_tuple_cost = 0.1
>> effective_cache_size = 8GB
>> default_statistics_target = 100
>>
>> Tables in the query are vacuum/analyzed daily.
>>
>>
>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
>
_______________________________________________
postgis-users mailing list
postgis-users at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users





More information about the postgis-users mailing list