<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD><TITLE>[postgis-users] Looking for more performance on select</TITLE>
<META http-equiv=Content-Type content="text/html; charset=us-ascii">
<META content="MSHTML 6.00.2800.1578" name=GENERATOR></HEAD>
<BODY>
<DIV id=idOWAReplyText2005 dir=ltr>
<DIV dir=ltr><FONT face=Arial><FONT size=2>I just noticed something else in
your <SPAN class=725383020-26122006>explain</SPAN>. Its doing a seq
scan on your <FONT face="Times New Roman">streets_db_edges. I would have
expected that to be an index scan.</FONT></FONT></FONT></DIV>
<DIV dir=ltr><FONT size=2></FONT> </DIV>
<DIV dir=ltr><FONT size=2>Are the data types of your id and edge_id of the same
type<SPAN class=725383020-26122006>?</SPAN> If they aren't then it would do a
seq scan instead of indexed scan.<SPAN class=725383020-26122006> I know at
least in earlier versions if you had an int8 in one table and an int4 in
another, it would not do an index scan. This might have changed in the 8.2
version.</SPAN></FONT></DIV></DIV>
<DIV dir=ltr><BR>
<HR tabIndex=-1>
<FONT face=Tahoma size=2><B>From:</B> Obe, Regina<BR><B>Sent:</B> Tue 12/26/2006
2:44 PM<BR><B>To:</B> ericfrancois@hotmail.com; PostGIS Users
Discussion<BR><B>Subject:</B> RE: [postgis-users] Looking for more performance
on select<BR></FONT><BR></DIV>
<DIV>
<DIV id=idOWAReplyText72144 dir=ltr>
<DIV dir=ltr><FONT face=Arial size=2>I would change your index to a clustered
index on your id field in streets_db_edges</FONT></DIV>
<DIV dir=ltr><FONT face=Arial size=2></FONT> </DIV>
<DIV dir=ltr><FONT face=Arial size=2>For your other table streets_db_big - I'm
guessing a clustered index on your the_geom field would give you better
performance than a cluster on the edge_id field. Having clustered indexes
has improved my speeds a lot.</FONT></DIV>
<DIV dir=ltr><FONT face=Arial size=2></FONT> </DIV></DIV>
<DIV dir=ltr><BR>
<HR tabIndex=-1>
<FONT face=Tahoma size=2><B>From:</B>
postgis-users-bounces@postgis.refractions.net on behalf of Eric
FRANCOIS<BR><B>Sent:</B> Tue 12/26/2006 10:12 AM<BR><B>To:</B>
postgis-users@postgis.refractions.net<BR><B>Subject:</B> [postgis-users] Looking
for more performance on select<BR></FONT><BR></DIV>
<DIV>
<P><FONT size=2>Hello,<BR><BR>I'm looking for help to improve postgres
performance query.<BR><BR>I've 2 tables:<BR>streets_db_egdes with 6 millions
record, field 'id' is the primary of the<BR>table<BR>streets_db_big with 800000
records with an index on a field named edge_id<BR>and on a geometry field name
the_geom.<BR><BR>vaccum analyse is run on the database twice a day.<BR><BR>The
following query is runned in 15 seconds:<BR>SELECT gid as id FROM
streets_db_big as m<BR>where setsrid('BOX3D(1.65924 43.38176,3.16274
50.73713)'::BOX3D,4326) &&<BR>m.the_geom<BR><BR><BR><BR><BR>The query
with a join between the 2 table runs in near 60s:<BR>SELECT gid as id,
source::integer, target::integer, cost::double precision<BR>as
cost,reverse_cost::double precision as reverse_cost FROM streets_db_edges<BR>as
s,streets_db_big as m<BR>where m.edge_id=s.id and setsrid('BOX3D(1.65924
43.38176,3.16274<BR>50.73713)'::BOX3D,4326) && m.the_geom<BR><BR>the
explain:<BR><BR>Hash Join (cost=85806.40..329366.66 rows=174676 width=28)
(actual<BR>time=22946.953..56012.747 rows=164889 loops=1)<BR> Hash Cond:
(s.id = m.edge_id)<BR> -> Seq Scan on streets_db_edges s
(cost=0.00..153395.40 rows=5894540<BR>width=28) (actual time=16.912..17790.444
rows=5894540 loops=1)<BR> -> Hash (cost=85369.71..85369.71
rows=174676 width=8) (actual<BR>time=22898.034..22898.034 rows=164889
loops=1)<BR> -> Bitmap Heap
Scan on streets_db_big m (cost=5530.26..85369.71<BR>rows=174676 width=8)
(actual time=5446.865..22464.712 rows=164889
loops=1)<BR>
Filter:<BR>('0103000020E61000000100000005000000CE70033E3F8CFA3FE275FD82DDB04540CE70033E<BR>3F8CFA3F43739D465A5E494021020EA14A4D094043739D465A5E494021020EA14A4D0940E275<BR>FD82DDB04540CE70033E3F8CFA3FE275FD82DDB04540'::geometry
&&
the_geom)<BR>
-> Bitmap Index Scan on
streets_db_big_the_geom_idx<BR>(cost=0.00..5530.26 rows=174676 width=0) (actual
time=5437.166..5437.166<BR>rows=164889
loops=1)<BR>
Index
Cond:<BR>('0103000020E61000000100000005000000CE70033E3F8CFA3FE275FD82DDB04540CE70033E<BR>3F8CFA3F43739D465A5E494021020EA14A4D094043739D465A5E494021020EA14A4D0940E275<BR>FD82DDB04540CE70033E3F8CFA3FE275FD82DDB04540'::geometry
&& the_geom)<BR>Total runtime: 56367.929 ms<BR><BR><BR>Is there a way to
increase my performances in this type of select queries?<BR><BR><BR>Thanks for
help<BR><BR>ERIC<BR><BR>My conf:<BR><BR>Toshiba Tecra A4 with 1 Go Ram running
PostgreSQl 8.2<BR>Database on external disk, pg_xlog on local
disk<BR><BR>shared_buffers =
228MB<BR>work_mem=50MB<BR>maintenance_work_mem=100MB<BR>effective_cache_size =
500MB<BR><BR>_______________________________________________<BR>postgis-users
mailing list<BR>postgis-users@postgis.refractions.net<BR><A
href="http://postgis.refractions.net/mailman/listinfo/postgis-users">http://postgis.refractions.net/mailman/listinfo/postgis-users</A><BR></FONT></P></DIV></DIV></BODY></HTML>
<HTML><BODY><P><hr size=1></P><br>
<P><STRONG><br>
The substance of this message, including any attachments, may be<br>
confidential, legally<br>
privileged and/or exempt from disclosure pursuant to Massachusetts<br>
law. It is intended<br>
solely for the addressee. If you received this in error, please<br>
contact the sender and<br>
delete the material from any computer.<br>
</STRONG></P></BODY></HTML>