[postgis-devel] CLUSTER in 8.3

Obe, Regina robe.dnd at cityofboston.gov
Sun Dec 7 12:36:14 PST 2008

Just filed a bug report about this.  I'm convinced it's something that is real and should be fixed right away, since it means possible loss of data.  I was finally successful in recreating the issue with Kevin's example (well, I guess misery loves company).

Bug report is: Clustering on GIST INDEX clobbers records in table intermittently
(observed on 8.3.5 installs)
--Reference number: 4567

This doesn't always happen to me but does intermittently, and for others it happens all the time.  Several of us have tried to recreate the issue.  For me it happens once in a while on an EL 4, 8.3.5 install.  We have confirmed it happens on all GIST type indexes.

We have not noticed this problem prior to 8.3.5  -- check out this thread for details.



To recreate: 
1) restart your postgresql service
2) run below

test=# create temp table tmp as select st_makepoint(random(), random()) as the_geom from generate_series(1, 10000);
test=# create index tmp_geom_idx on tmp using gist (the_geom);
test=# analyze tmp;
test=# select count(*) from tmp;
(1 row)

test=# cluster tmp using tmp_geom_idx;
test=# analyze tmp;
test=# select count(*) from tmp;
(1 row)

When I run the same exercise on 8.3.1, it always seems to work correctly, as far as I can tell.

-----Original Message-----
From: postgis-devel-bounces at postgis.refractions.net on behalf of Kevin Neufeld
Sent: Fri 12/5/2008 3:56 PM
To: PostGIS Development Discussion
Subject: Re: [postgis-devel] CLUSTER in 8.3

I've tried numerous permutations of index type on various data types.

I can't reproduce the problem with a btree index on any datatype, but the problem is repeatable on
- our gist,
- btree_gist on integers,
- btree_gist on characters, and
- btree_gist on text using varchar_pattern_ops.
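
For reference, the btree_gist-on-integers case could be exercised with something like the sketch below. This is not the exact test that was run: the table name tmp_int, the column id, and the index name tmp_int_idx are made up for illustration, and it assumes the btree_gist contrib module has already been loaded into the database. It mirrors the CLUSTER / ANALYZE / COUNT sequence used in the geometry example.

```sql
-- Hypothetical names (tmp_int, id, tmp_int_idx); assumes btree_gist is installed.
create temp table tmp_int as
  select (random() * 10000)::integer as id
  from generate_series(1, 10000);

-- With btree_gist loaded, a GiST index can be built on a plain integer column.
create index tmp_int_idx on tmp_int using gist (id);
analyze tmp_int;

-- Row count before clustering.
select count(*) from tmp_int;

cluster tmp_int using tmp_int_idx;
analyze tmp_int;

-- Row count after clustering; it should match the count above.
select count(*) from tmp_int;
```

If the two counts differ, the problem has been reproduced without involving any PostGIS types.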

-- Kevin

Chris Hodgson wrote:
> Kevin Neufeld wrote:
>> Mark Cave-Ayland wrote:
>>> I think the GiST part is a red herring - there is no way that a 
>>> "SELECT COUNT(*) FROM foo" can use an index in PostgreSQL. My suspicion is 
>>> that it's related to the use of ANALYZE/VACUUM/CLUSTER.
>> Maybe.  You're right that "SELECT ..." doesn't use the index, but the 
>> CLUSTER physically reorders the table based on the index.  If there is 
>> a bizarre bug in our GiST implementation that produces an empty 
>> traversal list some of the time, the table would be empty.
>> Never mind, I just saw Paul's post.  This is good news for us in that 
>> it's not related to our implementation of GiST ...  but it still could 
>> be related to PostgreSQL's implementation, no?
> Well, can anyone replicate the problem by clustering on a non-gist 
> index? If so, then we should really be able to just throw this problem 
> at the Postgres Devs. I'd be surprised if this was the case though, I 
> can't believe no-one else would have come across this yet...
> Chris
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-devel
