[Geoserver-devel] [postgis-users] Re: Postgis estimated_extent completely off the mark?

Mark Cave-Ayland mark.cave-ayland at ilande.co.uk
Wed Mar 21 13:03:21 PDT 2007


On Tue, 2007-03-20 at 16:20 +0100, Andrea Aime wrote:
> Paul Ramsey wrote:
> > Right, sampling.
> > Small enough that a random sample has a chance of missing them.   
> > Northern islands in Alaska, Hawaii, etc.
> 
> So this is a feature, not a bug, apparently.
> Heh, then the docs should say that estimated_extent is 5% off the
> proper bounds if features are uniformly distributed in the actual
> bounds :-)
> If you have data with strange distribution patterns (such as USA
> states) better not rely on it.
> 
> Cheers
> Andrea

Yes, it's due to the way in which the sampling works. Note that you can
increase the number of sampled rows using ALTER TABLE x ALTER COLUMN y
SET STATISTICS z and then re-running ANALYZE (the default statistics
target is 10, so perhaps a value of 100 would provide better results).
I'm not sure where the figure of 5% from the proper bounds comes from,
though. I would have imagined it depends on the sample size relative to
the population size, but then I haven't studied statistics properly for
several years now :(
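
The steps above can be sketched roughly as follows. This is only an
illustration: the table name roads and the column name the_geom are
hypothetical placeholders, not names taken from this thread, and the
target of 100 is simply the value suggested above.

```sql
-- Hypothetical table and column names; substitute your own.
-- Raise the statistics target for the geometry column to 100.
ALTER TABLE roads ALTER COLUMN the_geom SET STATISTICS 100;

-- Re-collect statistics so the larger sample takes effect.
ANALYZE roads;

-- Sampled estimate, read from the statistics gathered by ANALYZE.
SELECT estimated_extent('roads', 'the_geom') AS estimated;

-- Exact extent, computed by scanning every row (slower on big tables).
SELECT extent(the_geom) AS exact FROM roads;
```

Comparing the two results before and after raising the target should
show whether the larger sample picks up the outlying features. Later
PostGIS releases expose these functions as ST_EstimatedExtent and
ST_Extent.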


Kind regards,

Mark.




