[postgis-devel] estimates problems and 1.0.0 delay
strk at refractions.net
strk at refractions.net
Mon Apr 18 04:07:48 PDT 2005
I've committed Ron's patch, but making some tests I discovered some
corner cases still unandled.
It is collapsed histogram extent, mostly due to standard deviation based
hard deviants cut-off. Problem is that after cut-off of hard deviants
we end up with a 0-size dimension on Y or X.
This is surely NOT something that happens on a normal database usage
but I think we should farther inspect it and find a workaround.
Sounds as a can o worms so I wouldn't open it before 1.0.0 which
was planned for today, but I'd delay release until tomorrow evening
(CET) to allow a few more tests to be performed by Ron and eventually
Mark (and myself, of course).
Test reports welcome. Check out utils/test_joinestimation.pl.
--strk;
On Mon, Apr 18, 2005 at 11:47:19AM +0200, strk at refractions.net wrote:
> Patch sounds good, altought I think we can do something better
> in case number of table tuples are unknown (no vacuum run on them).
>
> I'm doing some final run tests before committing it (1.0.0 will
> wait for this).
>
> --strk;
>
> On Mon, Apr 18, 2005 at 01:11:21AM -0700, Ron Mayer wrote:
> > While, I'm at it, I think this patch fixes both the problems.
> >
> > In LWGEOM_gist_joinsel() inside #ifdef REALLY_DO_JOINSEL I think I see
> > two problems.
> >
> > 1. It can divide by zero, causing a
> > "ERROR: invalid join selectivity: inf"
> > error. I work around this by returning DEFAULT_GEOMETRY_SEL
> > if the thing on the bottom in that division was zero.
> >
> > 2. I think it can do a "SearchSysCache()" and "get_attstatsslot()"
> > for stats1_tuple without a corresponding "RelasesSysCache()" or
> > "free_attstatsslot()" in the case when stats2_tuple has no
> > statistics. I work around this by putting in the
> > calls to
> > free_attstatsslot(0, NULL, 0, (float *)geomstats1, geomstats1_nvalues);
> > ReleaseSysCache(stats1_tuple);
> > in the places where I do an early return because of stats2_tuple.
> >
> >
> > Am I on the right track? It seems to make the problem go
> > away for me, but I'm not that familiar with the code...
> >
> > Ron
> >
> >
> >
> > rmap:~/apps/tmp/postgresql-8.0.2/contrib/postgis-cvs/lwgeom> diff -c
> > lwgeom_estimate.c.~1~ lwgeom_estimate.c
> > *** lwgeom_estimate.c.~1~ Fri Mar 25 01:34:25 2005
> > --- lwgeom_estimate.c Sun Apr 17 09:04:57 2005
> > ***************
> > *** 25,31 ****
> > #include "liblwgeom.h"
> > #include "lwgeom_pg.h"
> >
> > ! //#define DEBUG_GEOMETRY_STATS 1
> >
> >
> > #if USE_VERSION >= 80
> > --- 25,31 ----
> > #include "liblwgeom.h"
> > #include "lwgeom_pg.h"
> >
> > ! #define DEBUG_GEOMETRY_STATS 1
> >
> >
> > #if USE_VERSION >= 80
> > ***************
> > *** 867,872 ****
> > --- 867,874 ----
> > #if DEBUG_GEOMETRY_STATS
> > elog(NOTICE, " No statistics, returning default estimate");
> > #endif
> > + free_attstatsslot(0, NULL, 0, (float *)geomstats1,
> > geomstats1_nvalues);
> > + ReleaseSysCache(stats1_tuple);
> > PG_RETURN_FLOAT8(DEFAULT_GEOMETRY_SEL);
> > }
> >
> > ***************
> > *** 878,883 ****
> > --- 880,887 ----
> > #if DEBUG_GEOMETRY_STATS
> > elog(NOTICE, " STATISTIC_KIND_GEOMETRY stats not found -
> > returning default geometry selectivity");
> > #endif
> > + free_attstatsslot(0, NULL, 0, (float *)geomstats1,
> > geomstats1_nvalues);
> > + ReleaseSysCache(stats1_tuple);
> > ReleaseSysCache(stats2_tuple);
> > PG_RETURN_FLOAT8(DEFAULT_GEOMETRY_SEL);
> > }
> > ***************
> > *** 963,968 ****
> > --- 967,976 ----
> > elog(NOTICE, "Estimated rows returned: %f", rows_returned);
> > #endif
> >
> > + /* avoid a "ERROR: invalid join selectivity: inf" */
> > + if (total_tuples==0)
> > + PG_RETURN_FLOAT8(DEFAULT_GEOMETRY_JOINSEL);
> > +
> > PG_RETURN_FLOAT8(rows_returned / total_tuples);
> > }
> >
> > ====================================================================
> >
> >
> >
> >
> >
> >
> >
> > Ron Mayer wrote:
> > >I have some queries that are giving me a
> > > " WARNING: cache reference leak: cache pg_statistic (31), tuple 0 has
> > >count 1 "
> > >warning message in my log file when I run them.
> > >
> > >I'm using a couple days ago's (either thursday or friday's) CVS version.
> > >
> > >According to the wonderfully helpful people on postgresql's IRC,
> > > "the catcache issues that warning at the end of a transaction if
> > > there are outstanding references to cached items ...
> > > since it's pg_statistic that is being complained about, the most
> > > likely culprit is a selectivity estimator function.. since those
> > > use rows from pg_statistic to base estimates on"
> > >
> > >
> > >
> > >The problem goes away if I "ANALYZE" all my tables involved in the query.
> > >
> > >I can re-create the problem by doing TRUNCATE TABLE on the table
> > >wiith the MULTIPOLYGON and re-inserting the data, and running the
> > >query before analyzing that table.
> > >
> > >
> > >Below is
> > > 1. An example query that shows the problem. It's joining a POINT
> > > from the table point_features with a MULTIPOLYGON from the
> > > table area_features.
> > > 2. The output from select * from postgis_full_version();
> > > and
> > > 3. A previous message from this list that I think might be related;
> > > since this is the same kind of query that was giving me
> > > "ERROR: invalid join selectivity: inf"
> > > errors with RC4 when I had non-recently analyzed tables.
> > >
> > > Ron
> > >
> > >===========================================================
> > >=== My problem query
> > >===========================================================
> > >fli=# select *
> > > from userfeatures.point_features upf,
> > > userfeatures.area_features uaf
> > > where upf.the_geom && uaf.the_geom
> > > and intersects(upf.the_geom,uaf.the_geom)
> > > and uaf.featureid=18 limit 1
> > >fli-# fli-# fli-# fli-# fli-# ;
> > >WARNING: cache reference leak: cache pg_statistic (31), tuple 0 has
> > >count 1
> > > pointid | featureid | sessionid | userid | extid | label | iconid
> > >| the_geom | entity_id | areaid |
> > >featureid | session\
> > >id | userid | extid | label | color | bordercolor
> > >|
> > >\
> > >
> > >\
> > >
> > >\
> > >
> > >\
> > >
> > >the_geom
> > >\
> > >
> > >\
> > >
> > >\
> > >
> > >---------+-----------+-----------+------------+-------+-------+--------+--------------------------------------------+-----------+--------+-----------+--------\
> > >
> > >---+------------+-------+-------+-------+-------------+-------------------------------------------------------------------------------------------------------\
> > >
> > >--------------------------------------------------------------------------------------------------------------------------------------------------------------\
> > >
> > >--------------------------------------------------------------------------------------------------------------------------------------------------------------\
> > >
> > >--------------------------------------------------------------------------------------------------------------------------------------------------------------\
> > >
> > >--------------------------------------------------------------------------------------------------------------------------------------------------------------\
> > >
> > >--------------------------------------------------------------------------------------------------------------------------------------------------------------\
> > >
> > >--------------------------------------------------------------------------------------------------------------------------------------------------------------\
> > >
> > >-------------------------------------------------------------------------------------------------------------------
> > >
> > > 3391 | 20 | 0 | 1635480986 | -1 | | 1 |
> > >010100000018E29102631355C0606B040DBF743E40 | 29646 | 57 |
> > >18 | \
> > > 0 | 1635480986 | | | green | green |
> > >0106000000010000000103000000010000002300000001ACAD4CE01255C02B7ED6AE7A723E407E0D4B0DFB1255C0E0E5C5FE7D\
> > >
> > >723E40BC2EF520151355C01929CF1E96723E403A4C21872D1355C049D49A21C2723E40B9F3C54F431355C0A4E22E5600733E40F87B94A4551355C0DC5692584E733E40185C35D1631355C054C75129\
> > >
> > >A9733E40E05C364A6D1355C00D80FD4A0D743E407F7566B2711355C08C147FE476743E40929B6ADE701355C07F9AF4E6E1743E40C34A68D66A1355C0771F9D354A753E40F85FB5D55F1355C070954D\
> > >
> > >CEAB753E40615C9048501355C06A33E0F002763E40537FF7C73C1355C067541A444C763E4035A3C713261355C0E9D49DF684763E4006B55C0B0D1355C005D8A1DAAA763E40E74FFCA4F21255C0833A\
> > >
> > >627BBC763E407C938711DA1255C0EBECE119C3763E40FF31EA50BF1255C03685F2C9BF763E40C110403DA51255C0FD41E9A9A7763E4043F313D78C1255C0CD961DA77B763E40C44B6F0E771255C072\
> > >
> > >8889723D763E4085C3A0B9641255C03A142670EF753E4065E3FF8C561255C0C2A3669F94753E409DE2FE134D1255C009EBBA7D30753E40FEC9CEAB481255C08A5639E4C6743E40EBA3CA7F491255C0\
> > >
> > >97D0C3E15B743E40BAF4CC874F1255C09F4B1B93F3733E4085DF7F885A1255C0A6D56AFA91733E401CE3A4156A1255C0AC37D8D73A733E402AC03D967D1255C0AF169E84F1723E40489C6D4A941255\
> > >
> > >C02D961AD2B8723E40778AD852AD1255C0119316EE92723E4096EF38B9C71255C09330564D81723E4001ACAD4CE01255C02B7ED6AE7A723E40
> > >
> > >(1 row)
> > >
> > >fli=#
> > >fli=#
> > >
> > >==============================================================================
> > >
> > >=========postgis_full_version()
> > >==============================================================================
> > >
> > >fli=# select * from postgis_full_version();
> > > postgis_full_version
> > >---------------------------------------------------------------------------------------------------------
> > >
> > > POSTGIS="1.0.0RC6" GEOS="2.1.1" PROJ="Rel. 4.4.9, 29 Oct 2004"
> > >USE_STATS DBPROC="0.3.0" RELPROC="0.3.0"
> > >(1 row)
> > >
> > >fli=#
> > >
> > >
> > >
> > >==============================================================================
> > >
> > >==== possibly related earlier conversation, that I don't remember
> > >==== the conclusion of.
> > >==============================================================================
> > >
> > >
> > >(Re: [postgis-users] ERROR: invalid join selectivity: inf)
> > >
> > >strk at refractions.net wrote:
> > >
> > >>Mark, can you look at this ?
> > >>
> > >>As a short-term workaround, in lwgeom_estimate.c line 100:
> > >>
> > >>#define REALLY_DO_JOINSEL 0
> > >>
> > >>--strk;
> > >>
> > >>On Fri, Mar 04, 2005 at 03:01:32AM -0800, Ron Mayer wrote:
> > >>
> > >>>Ok.. I think I can reproduce this:
> > >>>
> > >>>
> > >>>> fl# select * from rtgr.landpoly,userfeatures.areas
> > >>>> where rtgr.landpoly.the_geom && userfeatures.areas.the_geom;
> > >>>> ERROR: invalid join selectivity: inf
> > >>>
> > >>>
> > >>>pretty easily.
> > >>>
> > >>>Every time I "truncate table userfeatures.areas", and insert
> > >>>some areas; I'll get that error message until I "analyze areas".
> > >>>
> > >>>Seems like a bug when the stastics generated by analyze thinks
> > >>>that a table is empty while in reality it isn't.
> > >>>_______________________________________________
> > >>>postgis-users mailing list
> > >>>postgis-users at postgis.refractions.net
> > >>>http://postgis.refractions.net/mailman/listinfo/postgis-users
> > >
> > >
> > >_______________________________________________
> > >postgis-users mailing list
> > >postgis-users at postgis.refractions.net
> > >http://postgis.refractions.net/mailman/listinfo/postgis-users
> >
> > _______________________________________________
> > postgis-users mailing list
> > postgis-users at postgis.refractions.net
> > http://postgis.refractions.net/mailman/listinfo/postgis-users
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-devel
More information about the postgis-devel
mailing list