[postgis-users] query optimization

Paul Ramsey pramsey at refractions.net
Tue Dec 14 16:08:37 PST 2004


Dylan Keon wrote:

> Paul,
> 
> Thanks for the quick (and thorough) reply.  I forgot to mention that the 
> postgresql data files are already being stored on a RAID 5 array of fast 
> SCSI disks.  My guess was it couldn't get much better, but I thought 
> maybe one of you postgis wizards had a silver bullet :-)
> 
> How about optimizing postgresql.conf?  Perhaps tweak sort_mem to 
> allocate a bit more memory to the process?  I haven't changed much in 
> postgresql.conf since installing.

Upping shared buffers and effective cache size is, as well as lowering 
the random page cost, are all good things when you have decent hardware. 
It will not help you though, you are just plain hitting your hardware. 
Compare running the same query twice. The first time should be slower, 
as the OS caches portions of your pages of interest for the second time. 
Tuning your configuration will help you later when you do more complex 
queries, as the planner will have better information about the 
capabilities of your hardware.

Paul

> Thanks,
> Dylan
> 
> 
> On 12/14/2004 03:35 PM, Paul Ramsey wrote:
> 
>> Dylan,
>>
>> Well, you could write query two using the extent() function instead, 
>> which would prevent you from doing four accessors on the geometry for 
>> each row. But that will probably not save all that much time.
>>
>> In query one, the estimator thinks you will get 983445 rows and you 
>> actually got 983496. That's a great estimate! And because you are 
>> pulling so much of the table (almost a million rows) it sequence scans 
>> it. You pull that much data, it will take time to get it.
>>
>> In query two, you literally are sequence scanning the whole table. You 
>> have to traverse 13M rows, that takes time.
>>
>> At this point you have run out of software fixes, your only recourse 
>> is to better hardware. Faster disk access will speed up your queries. 
>> Put your database on a multi-spindle RAID array (RAID 10 or RAID 5, 
>> with as many drives as possible, spinning as fast as possible). You 
>> are scanning the whole table, so you are going to be disk bound.
>>
>> Paul
>>
>> Dylan Keon wrote:
>>
>>> Hi list,
>>>
>>> I'm trying to optimize queries on a PostGIS-enabled database.  
>>> PostGIS 0.8.2 and PostgreSQL 7.4.5 are on a Dell 2650 dual 2.4GHz 
>>> Xeon with 2GB RAM, running RHEL.
>>>
>>> Most of the tables storing geometries contain less than 1 million 
>>> rows, and queries on those tables are quite fast.  But one table 
>>> contains over 13 million rows, and queries take longer than I would 
>>> like.  I've run SELECT update_geometry_stats() and VACUUM ANALYZE, 
>>> and built a GiST index on the geometry column.  I have not built any 
>>> other indexes since this is a simple table that will mostly be 
>>> subjected to bounding box queries.
>>>
>>> Any suggestions on how I can further optimize?  Or do I have it as 
>>> good as it's gonna get?  The first query below is typical of most of 
>>> the queries that will be done.
>>>
>>> Thanks!
>>> --Dylan
>>>
>>>
>>> tsunami-# explain analyze
>>> tsunami-# select depth from owase_50m where geom && 'BOX3D(136.2 
>>> 33.8,136.8 34.2)'::box3d;
>>>  Seq Scan on owase_50m  (cost=0.00..478944.89 rows=983445 width=4) 
>>> (actual time=11813.053..56433.021 rows=983496 loops=1)
>>>    Filter: (geom && 'SRID=-1;BOX3D(136.2 33.8 0,136.8 34.2 
>>> 0)'::geometry)
>>>  Total runtime: 57148.437 ms
>>>
>>>
>>> tsunami=# explain analyze
>>> tsunami-# select min(X(geom)) as minx, min(Y(geom)) as miny, 
>>> max(X(geom)) as maxx, max(Y(geom)) as maxy from owase_50m;
>>>  Aggregate  (cost=579406.43..579406.44 rows=1 width=140) (actual 
>>> time=107140.546..107140.546 rows=1 loops=1)
>>>    ->  Seq Scan on owase_50m  (cost=0.00..445457.71 rows=13394871 
>>> width=140) (actual time=4.876..62661.358 rows=13394871 loops=1)
>>>  Total runtime: 107178.251 ms
>>>
>>>
>>> tsunami=# \d owase_50m
>>>  Column |   Type   | Modifiers
>>> --------+----------+-----------
>>>  id     | integer  |
>>>  x      | integer  |
>>>  y      | integer  |
>>>  depth  | integer  |
>>>  geom   | geometry |
>>> Indexes:
>>>     "owase_50m_index" gist (geom)
>>> Check constraints:
>>>     "$2" CHECK (geometrytype(geom) = 'POINT'::text OR geom IS NULL)
>>>     "$1" CHECK (srid(geom) = -1)
> 
> 
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users




More information about the postgis-users mailing list