[postgis-devel] Query causes Postgres to crash

strk at refractions.net strk at refractions.net
Fri Dec 10 07:29:36 PST 2004


On Fri, Dec 10, 2004 at 09:50:26AM -0000, Mark Cave-Ayland wrote:
> 
> > -----Original Message-----
> > From: postgis-devel-bounces at postgis.refractions.net
> > [mailto:postgis-devel-bounces at postgis.refractions.net] On 
> > Behalf Of strk at refractions.net
> > Sent: 09 December 2004 18:53
> > To: PostGIS Development Discussion
> > Subject: Re: [postgis-devel] Query causes Postgres to crash
> 
> (cut)
> 
> > > UPDATE grid10m SET
> > > lzmin = (SELECT min(lz) from unfilteredpts WHERE lreturn  &&
> > > the_geom), lzmax  = (SELECT max(lz) from unfilteredpts 
> > WHERE lreturn
> > > && the_geom), lzavg  = (SELECT avg(lz) from unfilteredpts WHERE
> > > lreturn  && the_geom), lzstddev  = (SELECT stddev(lz) from 
> > > unfilteredpts WHERE lreturn  && the_geom);
> > > 
> > > grid10m is a set of 11,000 square polygons.  unfilteredpts
> > is as set
> > > of
> > > 2.5 million points (2 geometry columns lreturn, freturn).
> > > 
> > > I am trying an explain analyze on a subset (1,700 grid
> > polys & 400,000
> > > pts), but it has been 2 hours and I still do not have a result.
> > > 
> > > Any thoughts? Thanks,
> > 
> > I'd concentrate to explain analyze not returning.
> > Define DEBUG_GEOMETRY_STATS in lwgeom/lwgeom_estimate.c
> > and see what goes on.
> > --strk;
> 
> 
> Hi strk,
> 
> Unfortunately PostGIS doesn't currently return estimates for joins of the
> type <column> && <column>, only <column> && <constant>. This has been on my
> TODO list for a while since it is likely that we would use this type of
> query on large tables containing millions of records on our geospatial
> database - unfortunately I have been needed to work on other non-spatial
> projects for some time which is why I've been rather quiet here :)
> 
> My guess is that this part of the query is performing a sequential scan:
> 
> 	SELECT min(lz) from unfilteredpts WHERE lreturn && the_geom
> 
> ... And unfortunately in the query above it does this for each row in the
> grid10m table. At the moment the join selectivity function, positionjoinsel,
> is hard-wired in src/backend/utils/adt/geo_selfuncs.c to be 0.1(!) and this
> is the join function specified in the operator definition for &&.

Reading that file, I've seen that 'areajoinsel' is suggested for 'overlap'
operators. That function returns 0.005 instead of 0.1.
It doesn't mean we don't have to develop a custom one, but I'm curious
to see Collin's results using areajoinsel instead of positionjoinsel.

UPDATE pg_operator SET oprjoin = 'areajoinsel'::regproc
WHERE oprname = '&&' AND oprcode = 'geometry_overlap'::regproc;

Also, a wrong estimation should no justify the crash.
I suggest Collin estimates a single field update and show us
the result.

--strk;


> 
> What we need to do is define our own JOIN clause estimation function in the
> declaration of the && operator instead. This should use the geometry
> statistics to calculate the intersection area of the two columns and return
> an estimate as to how many geometries are returned within that area.
> Unfortunately, unless anyone else beats me to this, I'm won't be able to get
> around to this for a while :(
> 
> 
> Kind regards,
> 
> Mark.
> 
> ------------------------
> WebBased Ltd
> South West Technology Centre
> Tamar Science Park
> Plymouth
> PL6 8BT 
> 
> T: +44 (0)1752 791021
> F: +44 (0)1752 791023
> W: http://www.webbased.co.uk
> 
> 
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-devel



More information about the postgis-devel mailing list