[postgis-users] question on gist performance

Stefan Zweig stefanzweig1881 at web.de
Fri Sep 7 05:17:32 PDT 2007


Gregory (and others),

some questions occurred after reading your posts: 

1. Is it true that each index match is revalidated by looking at the corresponding data row?
2. Even it is true, both queries (based on the_geom or bbox, respecively) should then perform the *same* revalidation on the *same* (TOASTed) row data, resulting in identical query times?
3. Could anybody clarify where and when index and row data is stored in the file system? How do I know what files contains what indices and row data?
4. The bounding box cache should be inherent in the index, so hasBBOX(the_geom) should not play any role? (Btw, the result of hasBBOX(the_geom) is always true in our case, I even tried SELECT addbbox(the_geom) from mytable - without effect).

Thank you.


> -----Ursprüngliche Nachricht-----
> Von: PostGIS Users Discussion <postgis-users at postgis.refractions.net>
> Gesendet: 07.09.07 13:27:40
> An:  "PostGIS Users Discussion" <postgis-users at postgis.refractions.net>
> Betreff: RE: [postgis-users] question on gist performance


> 
> I may be wrong, but I think that once PostgreSQL gets a match from the index, it has to go to disk to "make sure" that the data is valid (i.e. visible given the rules of the MVCC environment); indexes don't store info on concurrency of the data as does Oracle's (as I understand it -- retreaded Informix guy myself). Perhaps that accounts for some of the timing difference ?
> 
> GSW
> -----Original Message-----
> From: postgis-users-bounces at postgis.refractions.net on behalf of Stefan Zweig
> Sent: Fri 9/7/2007 5:21 AM
> To: PostGIS Users Discussion
> Subject: RE: [postgis-users] question on gist performance
> 
> Gregory,
> 
> Sorry, but I don't understand your point. We added the GIST index to speed up the bounding box based queries. Thus, scanning the raw geometry data is not necessary since the GIST index already contains all information needed to execute the query, doesn't it?
> 
> > -----Ursprüngliche Nachricht-----
> > Von: PostGIS Users Discussion <postgis-users at postgis.refractions.net>
> > Gesendet: 07.09.07 12:21:26
> > An:  "PostGIS Users Discussion" <postgis-users at postgis.refractions.net>
> > Betreff: RE: [postgis-users] question on gist performance
> 
> 
> >
> > To my untutored eye it looks like the slower one is using the geometry ("index cond: (the_geom &&" vs "index cond: (bbox &&") ... a bounding box has 4 count 'em 4 points (not sure if all are even stored) while your geometry has way more than that. So the comparison of the simple is *way* faster. As it should be.
> >
> > QED ?
> >
> > Greg Williamson
> > Senior DBA
> > GlobeXplorer LLC, a DigitalGlobe company
> >
> > Confidentiality Notice: This e-mail message, including any attachments, is for the sole use of the intended recipient(s) and may contain confidential and privileged information and must be protected in accordance with those provisions. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message.
> >
> > (My corporate masters made me say this.)
> >
> >
> >
> > -----Original Message-----
> > From: postgis-users-bounces at postgis.refractions.net on behalf of Stefan Zweig
> > Sent: Fri 9/7/2007 3:36 AM
> > To: PostGIS Users Discussion
> > Subject: Re: [postgis-users] question on gist performance
> >
> > hi frank,
> >
> > this what analyze says:
> >
> > "Index Scan using mytable_the_geom_gist on mytable  (cost=0.00..4.95 rows=1 width=13)"
> > "  Index Cond: (the_geom && '0103000020E610000001000000050000000000004077852B40000000C0E69549400000004077852B40000000002FEB4940000000C0AE762E40000000002FEB4940000000C0AE762E40000000C0E69549400000004077852B40000000C0E6954940'::geometry)"
> > "  Filter: (the_geom && '0103000020E610000001000000050000000000004077852B40000000C0E69549400000004077852B40000000002FEB4940000000C0AE762E40000000002FEB4940000000C0AE762E40000000C0E69549400000004077852B40000000C0E6954940'::geometry)"
> >
> > "Index Scan using mytable_bbox_gist on mytable  (cost=0.00..4.95 rows=1 width=13)"
> > "  Index Cond: (bbox && '0103000020E610000001000000050000000000004077852B40000000C0E69549400000004077852B40000000002FEB4940000000C0AE762E40000000002FEB4940000000C0AE762E40000000C0E69549400000004077852B40000000C0E6954940'::geometry)"
> > "  Filter: (bbox && '0103000020E610000001000000050000000000004077852B40000000C0E69549400000004077852B40000000002FEB4940000000C0AE762E40000000002FEB4940000000C0AE762E40000000C0E69549400000004077852B40000000C0E6954940'::geometry)"
> >
> > still strange.. both querys are using the same index, but there is a dramatically difference in performance. could it be that the first query on the_geom column needs to parse the geometry(s) to gather bounding box information from them (which i actually would not have thought so)?
> >
> > kind regards, stefan
> >
> >
> > > -----Ursprüngliche Nachricht-----
> > > Von: PostGIS Users Discussion <postgis-users at postgis.refractions.net>
> > > Gesendet: 07.09.07 10:44:16
> > > An: PostGIS Users Discussion <postgis-users at postgis.refractions.net>
> > > Betreff: Re: [postgis-users] question on gist performance
> >
> >
> > >
> > > Stefan,
> > >
> > > * Stefan Zweig <stefanzweig1881 at web.de> [070907 10:07]:
> > > > thanks for your quick reply. i have tried your advice and rebuilded
> > > > all my gist indexes on the table (via drop index, create index) and
> > > > run a vacuum full afterwards. but that did not change the fact, that
> > > > the query on the_geom is much slower than the same one on the_geom_1.
> > > >
> > > [...]
> > > >
> > > >   '...WHERE the_geom && expr;' takes about 250ms,
> > > >   '...WHERE the_geom_envelope && expr;' takes about 20ms.
> > > >
> > > > ANALYZE says that GIST is used. As far as I know, the GIST makes use
> > > > only of the bounding boxes of Geometry objects, which are equal for
> > > > each object. How does it come that the query time using the_geom is more than
> > > > *ten times slower* than the query time using the_geom_envelope?"
> > >
> > > It would help if you post the explain analyse output.
> > >
> > > Regards,
> > >
> > >         Frank
> > >
> > > --
> > > Frank Koormann    |   ++49-541-335 08 30    |   http://www.intevation.net/
> > > Intevation GmbH, Osnabrück, DE | Commercial Register Osnabrück, HR B 18998
> > > Managing Directors: Frank Koormann, Bernhard Reiter, Dr. Jan-Oliver Wagner
> > > PostGIS Support (http://www.intevation.net/geospatial/postgis-support.en.html)
> > > _______________________________________________
> > > postgis-users mailing list
> > > postgis-users at postgis.refractions.net
> > > http://postgis.refractions.net/mailman/listinfo/postgis-users
> > >
> >
> >
> > __________________________________________________________________________
> > Erweitern Sie FreeMail zu einem noch leistungsstärkeren E-Mail-Postfach!              
> > Mehr Infos unter http://produkte.web.de/club/?mc=021131
> >
> > _______________________________________________
> > postgis-users mailing list
> > postgis-users at postgis.refractions.net
> > http://postgis.refractions.net/mailman/listinfo/postgis-users
> >
> >
> > -----------------------------------------------------------------
> > _______________________________________________
> > postgis-users mailing list
> > postgis-users at postgis.refractions.net
> > http://postgis.refractions.net/mailman/listinfo/postgis-users
> >
> 
> 
> _____________________________________________________________________
> Der WEB.DE SmartSurfer hilft bis zu 70% Ihrer Onlinekosten zu sparen!
> http://smartsurfer.web.de/?mc=100071&distributionid=000000000066
> 
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
> 
> 
> -----------------------------------------------------------------
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
> 


______________________________________________________________________________
Jetzt neu! Im riesigen WEB.DE Club SmartDrive Dateien freigeben und mit 
Freunden teilen! http://www.freemail.web.de/club/smartdrive_ttc.htm/?mc=021134




More information about the postgis-users mailing list