[postgis-users] please help me optimize my mapserver query

Wed Dec 15 12:07:11 PST 2004

By focusing on the well_log_curve table as my main problem I was able to solve the issue. Basically this is what I did to kick start the index on the source column in well_log_curve.

I dropped all indexes on well_log_curve, vacuum analyze the entire db, recreated the indexes on well_log_curve, vacuum analyze well_log_curve and then voila...like magic the index starts working again.

idc=# explain analyze select * from well_log_curve2 where source = 'SHD477';
                                                                       QUERY PLAN                                             
--------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using well_log_curve2_source_idx on well_log_curve2  (cost=0.00..14639.63 rows=3761 width=34) (actual time=34.87..479.00 rows=1058 loops=1)
   Index Cond: (source = 'SHD477'::character varying)
 Total runtime: 479.78 msec
(3 rows)

Thanks to all those with suggestions on my problem.

Zamil

-----Original Message-----
From: postgis-users-bounces at postgis.refractions.net
[mailto:postgis-users-bounces at postgis.refractions.net]On Behalf Of Zamil
Murji
Sent: Wednesday, December 15, 2004 9:41 AM
To: PostGIS Users Discussion
Cc: MAPSERVER-USERS at LISTS.UMN.EDU
Subject: RE: [postgis-users] please help me optimize my mapserver query

Just another side note.

\d well_log_curve2
       Table "public.well_log_curve2"
 Column |         Type          | Modifiers
--------+-----------------------+-----------
 uwi    | character varying(20) |
 source | character varying(12) |
Indexes: well_log_curve2_source_idx btree (source),
         well_log_curve2_uwi_idx btree (uwi)

Notice that there is a index built on source column, however, if i run:
select uwi from well_log_curve2 where source = 'TIG600'

the query plan looks like so:
                                                         QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 Seq Scan on well_log_curve2  (cost=0.00..117418.86 rows=234294 width=19) (actual time=1907.28..13906.98 rows=1058 loops=1)
   Filter: (source = 'SHD477'::character varying)
 Total runtime: 13907.88 msec
(3 rows)

It doesn't even use the index. Could this be the reason the other query is acting slow; because pg seems to ignore the source index.  I have tried vacuum analyzing the table and I've tried dropping and recreating the index. Both of these steps don't work. 

Please help...
Zamil

-----Original Message-----
From: postgis-users-bounces at postgis.refractions.net
[mailto:postgis-users-bounces at postgis.refractions.net]On Behalf Of Zamil
Murji
Sent: Wednesday, December 15, 2004 9:31 AM
To: PostGIS Users Discussion
Cc: MAPSERVER-USERS at LISTS.UMN.EDU
Subject: [postgis-users] please help me optimize my mapserver query

Hello list,

I have a mapserver layer that is extremely slow. After yesterday's data update, this layer is taking forever to draw.

Here is the query that is generated:
SELECT uwi::text,asbinary(force_collection(force_2d(the_geom)),'NDR'),oid::text from (select  main_well2.the_geom as the_geom, main_well2.oid as oid, main_well2.uwi as uwi,  well_log_curve2.source as source from main_well2 inner join well_log_curve2 on main_well2.uwi =  well_log_curve2.uwi) as new_table WHERE (source = 'TIG600') and (the_geom && setSRID(  'BOX3D(-142.633293514961 38.5465516731216,-51.4598436170248  76.5484665317199)'::BOX3D,find_srid('','main_well2','the_geom') ))

The key part to this query is the FILTER tag (source = 'TIG600'). well_log_curve2 has approximately 5.6 million rows. main_well2 has approximately 543000 rows. Before the update, the query plan use to filter the well_log_curve table by source before joining to main_well2 and then apply the gist index. Now it is filtering by source at the very end. This is significant because if you filter by source first, then you will only receive back around 100 rows.

Here is the plan now:
                                                               QUERY PLAN                                                     
----------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..844.36 rows=1 width=74)
   ->  Index Scan using main_well2_the_geom_idx on main_well2  (cost=0.00..12.90 rows=3 width=55)
         Index Cond: (the_geom && 'SRID=-1;BOX3D(-142.633293514961 38.5465516731216 0,-51.4598436170248 76.5484665317199 0)'::geometry)
   ->  Index Scan using well_log_curve2_uwi_idx on well_log_curve2  (cost=0.00..306.22 rows=2 width=19)
         Index Cond: ("outer".uwi = well_log_curve2.uwi)
         Filter: (source = 'TIG600'::character varying)
(6 rows)

Also, there is no point in using the gist index first because EVERYTHING fits inside that box.
Here are the relevant parts to my .map file:

   DATA "the_geom from (select main_well2.the_geom as the_geom, main_well2.oid as oid, main_well2.uwi as uwi, well_log_curve2.source as source from main_well2 inner join well_log_curve2 on main_well2.uwi = well_log_curve2.uwi) as new_table using unique oid"
   FILTER "source = '%source%'"

I have tried the following:
SELECT update_geometry_stats('main_well2','the_geom');
VACUUM ANALYZE;

after this, the query still returns the same query plan.
How can I force it to filter by source before joining with main_well2. I am up for changing the DATA statement and/or removing the FILTER tag and incorporating it into the DATA statement somehow.

Thanks in advance,
Zamil
_______________________________________________
postgis-users mailing list
postgis-users at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users
_______________________________________________
postgis-users mailing list
postgis-users at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users