[postgis-users] question on gist performance

Stefan Zweig stefanzweig1881 at web.de
Mon Sep 10 02:40:12 PDT 2007


Mark,

> Yes. Number 1 is true because we have a RECHECK clause on the operator 
> class - it is this that provides the SRID checking since the SRID is 
> stored in the heap and not in the index.
> This might make sense since if you're storing country outlines then 
> the the geometries are likely to be quite large...

Right, RECHECK && seems to be enabled by default for gist indices. Why is that needed and how can I disable the RECHECK clause? In my case, SRID lookups are not necessary when a CHECK (srid(the_geom)=4326) constraint is used since all imported objects are guaranteed to have 4326 then, or am I wrong? I wonder what consequences would have changes or insertions of Geometry objects if I do disable the RECHECK clause? Aren't such changes stored into the gist index back again automatically?

I further still do not understand why the RECHECK times for the the_geom and bbox columns differ: It seems that the cached Geometry bounding box is *not* used and the whole Geometry object is scanned, *even if* hasbbox(the_geom) returns true for all objects. In my eyes this decreases strongly and unnecessarily the performance of certain bounding box queries.

Thank you!


=====

Kevin,

I followed your short instruction and got about the same measurement times. No other processes were running on the server at that time.

Any further ideas?

Thank you!

> -----Ursprüngliche Nachricht-----
> Von: PostGIS Users Discussion <postgis-users at postgis.refractions.net>
> Gesendet: 07.09.07 20:58:17
> An: PostGIS Users Discussion <postgis-users at postgis.refractions.net>
> Betreff: RE: [postgis-users] question on gist performance


> 
> On Fri, 2007-09-07 at 14:17 +0200, Stefan Zweig wrote:
> > Gregory (and others),
> > 
> > some questions occurred after reading your posts: 
> > 
> > 1. Is it true that each index match is revalidated by looking at the corresponding data row?
> > 2. Even it is true, both queries (based on the_geom or bbox, respecively) should then perform the *same* revalidation on the *same* (TOASTed) row data, resulting in identical query times?
> > 3. Could anybody clarify where and when index and row data is stored in the file system? How do I know what files contains what indices and row data?
> > 4. The bounding box cache should be inherent in the index, so hasBBOX(the_geom) should not play any role? (Btw, the result of hasBBOX(the_geom) is always true in our case, I even tried SELECT addbbox(the_geom) from mytable - without effect).
> > 
> > Thank you.
> 
> Yes. Number 1 is true because we have a RECHECK clause on the operator
> class - it is this that provides the SRID checking since the SRID is
> stored in the heap and not in the index. This might make sense since if
> you're storing country outlines then the the geometries are likely to be
> quite large...
> 
> 
> ATB,
> 
> Mark.
> 
> -- 
> ILande - Open Source Consultancy
> http://www.ilande.co.uk
> 
> 
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
> 


_____________________________________________________________________
Der WEB.DE SmartSurfer hilft bis zu 70% Ihrer Onlinekosten zu sparen!
http://smartsurfer.web.de/?mc=100071&distributionid=000000000066


  From MAILER-DAEMON  Mon Sep 10 02:43:17 2007
Return-Path: <>
X-Original-To: postgis-users at postgis.refractions.net
Delivered-To: postgis-users at netnation.refractions.net
Received: from mout0.freenet.de (mout0.freenet.de [195.4.92.90])
	by netnation.refractions.net (Postfix) with ESMTP id E57E12C18D6
	for <postgis-users at postgis.refractions.net>;
	Mon, 10 Sep 2007 02:43:07 -0700 (PDT)
Received: from [195.4.92.20] (helo=mx10.freenet.de)
	by mout0.freenet.de with esmtpa (Exim 4.68-dev) id 1IUfpM-0002pe-I9
	for postgis-users at postgis.refractions.net;
	Mon, 10 Sep 2007 11:45:08 +0200
Received: from p549c2050.dip0.t-ipconnect.de ([84.156.32.80]:32773
	helo=dvise.alpstein.de)
	by mx10.freenet.de with esmtpa (ID alpstein1 at freenet.de) (port 25)
	(Exim 4.68-dev #12) id 1IUfpL-0007pd-B6
	for postgis-users at postgis.refractions.net;
	Mon, 10 Sep 2007 11:45:07 +0200
From: hubert.burger at alpstein.de
Subject: AUTOREPLY RE: [postgis-users] question on gist performance
To: postgis-users at postgis.refractions.net
Date: Mon, 10 Sep 2007 09:45:06 +0000
Priority: normal
X-Priority: 3 (Normal)
Importance: normal
X-Mailer: DvISE by Tobit Software, Germany (0244.444B44464847494F524A),
	Mime Converter 101.20
X-David-Sym: 0
X-David-Flags: 0
Message-ID: <0004B789.46E52E41 at dvise.alpstein.de>
MIME-Version: 1.0
Content-Type: text/plain;
 charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
X-BeenThere: postgis-users at postgis.refractions.net
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: PostGIS Users Discussion <postgis-users at postgis.refractions.net>
List-Id: PostGIS Users Discussion <postgis-users.postgis.refractions.net>
List-Unsubscribe: <http://postgis.refractions.net/mailman/listinfo/postgis-users>,
	<mailto:postgis-users-request at postgis.refractions.net?subject=unsubscribe>
List-Archive: <http://lists.refractions.net/pipermail/postgis-users>
List-Post: <mailto:postgis-users at postgis.refractions.net>
List-Help: <mailto:postgis-users-request at postgis.refractions.net?subject=help>
List-Subscribe: <http://postgis.refractions.net/mailman/listinfo/postgis-users>,
	<mailto:postgis-users-request at postgis.refractions.net?subject=subscribe>
X-List-Received-Date: Mon, 10 Sep 2007 09:43:17 -0000







Sehr geehrter Absender, 

vielen Dank für Ihre Email.

Leider kann ich Ihre Nachricht im Moment nicht persönlich
beantworten, da ich mich bis zum 21.09.2007 im Urlaub befinden. 

Nach meiner Rückkehr am 24.09.2007 werde ich gerne Ihre Anfrage beantworten. 


Bitte wenden Sie sich in dringenden Fällen an Herrn Jens Schwarz, jens.schwarz at alpstein.de oder per Telefon 08323-8006-0.

Mit freundlichen Grüßen

Hubert Burger
 

      
Alpstein GmbH                      
Missener Str. 18                   
87509 Immenstadt                   
                                 
fon  +49 8323 8006-0              
fax  +49 8323 8006-50             
 
web  www.alpstein.de

Mark,

> Yes. Number 1 is true because we have a RECHECK clause on the operator 
> class - it is this that provides the SRID checking since the SRID is 
> stored in the heap and not in the index.
> This might make sense since if you're storing country outlines then 
> the the geometries are likely to be quite large...

Right, RECHECK && seems to be enabled by default for gist indices. Why is that needed and how can I disable the RECHECK clause? In my case, SRID lookups are not necessary when a CHECK (srid(the_geom)=4326) constraint is used since all imported objects are guaranteed to have 4326 then, or am I wrong? I wonder what consequences would have changes or insertions of Geometry objects if I do disable the RECHECK clause? Aren't such changes stored into the gist index back again automatically?

I further still do not understand why the RECHECK times for the the_geom and bbox columns differ: It seems that the cached Geometry bounding box is *not* used and the whole Geometry object is scanned, *even if* hasbbox(the_geom) returns true for all objects. In my eyes this decreases strongly and unnecessarily the performance of certain bounding box queries.

Thank you!


=====

Kevin,

I followed your short instruction and got about the same measurement times. No other processes were running on the server at that time.

Any further ideas?

Thank you!

> -----Ursprüngliche Nachricht-----
> Von: PostGIS Users Discussion <postgis-users at postgis.refractions.net>
> Gesendet: 07.09.07 20:58:17
> An: PostGIS Users Discussion <postgis-users at postgis.refractions.net>
> Betreff: RE: [postgis-users] question on gist performance


> 
> On Fri, 2007-09-07 at 14:17 +0200, Stefan Zweig wrote:
> > Gregory (and others),
> > 
> > some questions occurred after reading your posts: 
> > 
> > 1. Is it true that each index match is revalidated by looking at the corresponding data row?
> > 2. Even it is true, both queries (based on the_geom or bbox, respecively) should then perform the *same* revalidation on the *same* (TOASTed) row data, resulting in identical query times?
> > 3. Could anybody clarify where and when index and row data is stored in the file system? How do I know what files contains what indices and row data?
> > 4. The bounding box cache should be inherent in the index, so hasBBOX(the_geom) should not play any role? (Btw, the result of hasBBOX(the_geom) is always true in our case, I even tried SELECT addbbox(the_geom) from mytable - without effect).
> > 
> > Thank you.
> 
> Yes. Number 1 is true because we have a RECHECK clause on the operator
> class - it is this that provides the SRID checking since the SRID is
> stored in the heap and not in the index. This might make sense since if
> you're storing country outlines then the the geometries are likely to be> quite large...
> 
> 
> ATB,
> 
> Mark.
> 
> -- 
> ILande - Open Source Consultancy
> http://www.ilande.co.uk
> 
> 
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
> 


_____________________________________________________________________
Der WEB.DE SmartSurfer hilft bis zu 70% Ihrer Onlinekosten zu sparen!
http://smartsurfer.web.de/?mc=100071&distributionid=000000000066

_______________________________________________
postgis-users mailing list
postgis-users at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users





More information about the postgis-users mailing list