[postgis-users] query optimization
Dylan Keon
keon at nacse.org
Tue Dec 14 15:57:35 PST 2004
Paul,
Thanks for the quick (and thorough) reply. I forgot to mention that the
postgresql data files are already being stored on a RAID 5 array of fast
SCSI disks. My guess was it couldn't get much better, but I thought
maybe one of you postgis wizards had a silver bullet :-)
How about optimizing postgresql.conf? Perhaps tweak sort_mem to
allocate a bit more memory to the process? I haven't changed much in
postgresql.conf since installing.
Thanks,
Dylan
On 12/14/2004 03:35 PM, Paul Ramsey wrote:
> Dylan,
>
> Well, you could write query two using the extent() function instead,
> which would prevent you from doing four accessors on the geometry for
> each row. But that will probably not save all that much time.
>
> In query one, the estimator thinks you will get 983445 rows and you
> actually got 983496. That's a great estimate! And because you are
> pulling so much of the table (almost a million rows) it sequence scans
> it. You pull that much data, it will take time to get it.
>
> In query two, you literally are sequence scanning the whole table. You
> have to traverse 13M rows, that takes time.
>
> At this point you have run out of software fixes, your only recourse is
> to better hardware. Faster disk access will speed up your queries. Put
> your database on a multi-spindle RAID array (RAID 10 or RAID 5, with as
> many drives as possible, spinning as fast as possible). You are scanning
> the whole table, so you are going to be disk bound.
>
> Paul
>
> Dylan Keon wrote:
>
>> Hi list,
>>
>> I'm trying to optimize queries on a PostGIS-enabled database. PostGIS
>> 0.8.2 and PostgreSQL 7.4.5 are on a Dell 2650 dual 2.4GHz Xeon with
>> 2GB RAM, running RHEL.
>>
>> Most of the tables storing geometries contain less than 1 million
>> rows, and queries on those tables are quite fast. But one table
>> contains over 13 million rows, and queries take longer than I would
>> like. I've run SELECT update_geometry_stats() and VACUUM ANALYZE, and
>> built a GiST index on the geometry column. I have not built any other
>> indexes since this is a simple table that will mostly be subjected to
>> bounding box queries.
>>
>> Any suggestions on how I can further optimize? Or do I have it as
>> good as it's gonna get? The first query below is typical of most of
>> the queries that will be done.
>>
>> Thanks!
>> --Dylan
>>
>>
>> tsunami-# explain analyze
>> tsunami-# select depth from owase_50m where geom && 'BOX3D(136.2
>> 33.8,136.8 34.2)'::box3d;
>> Seq Scan on owase_50m (cost=0.00..478944.89 rows=983445 width=4)
>> (actual time=11813.053..56433.021 rows=983496 loops=1)
>> Filter: (geom && 'SRID=-1;BOX3D(136.2 33.8 0,136.8 34.2 0)'::geometry)
>> Total runtime: 57148.437 ms
>>
>>
>> tsunami=# explain analyze
>> tsunami-# select min(X(geom)) as minx, min(Y(geom)) as miny,
>> max(X(geom)) as maxx, max(Y(geom)) as maxy from owase_50m;
>> Aggregate (cost=579406.43..579406.44 rows=1 width=140) (actual
>> time=107140.546..107140.546 rows=1 loops=1)
>> -> Seq Scan on owase_50m (cost=0.00..445457.71 rows=13394871
>> width=140) (actual time=4.876..62661.358 rows=13394871 loops=1)
>> Total runtime: 107178.251 ms
>>
>>
>> tsunami=# \d owase_50m
>> Column | Type | Modifiers
>> --------+----------+-----------
>> id | integer |
>> x | integer |
>> y | integer |
>> depth | integer |
>> geom | geometry |
>> Indexes:
>> "owase_50m_index" gist (geom)
>> Check constraints:
>> "$2" CHECK (geometrytype(geom) = 'POINT'::text OR geom IS NULL)
>> "$1" CHECK (srid(geom) = -1)
More information about the postgis-users
mailing list