[postgis-devel] estimates problems and 1.0.0 delay

strk at refractions.net strk at refractions.net
Mon Apr 18 04:07:48 PDT 2005


I've committed Ron's patch, but making some tests I discovered some
corner cases still unandled.

It is collapsed histogram extent, mostly due to standard deviation based
hard deviants cut-off. Problem is that after cut-off of hard deviants
we end up with a 0-size dimension on Y or X.

This is surely NOT something that happens on a normal database usage
but I think we should farther inspect it and find a workaround.
Sounds as a can o worms so I wouldn't open it before 1.0.0 which
was planned for today, but I'd delay release until tomorrow evening
(CET) to allow a few more tests to be performed by Ron and eventually
Mark (and myself, of course).

Test reports welcome. Check out utils/test_joinestimation.pl.

--strk;

On Mon, Apr 18, 2005 at 11:47:19AM +0200, strk at refractions.net wrote:
> Patch sounds good, altought I think we can do something better
> in case number of table tuples are unknown (no vacuum run on them).
> 
> I'm doing some final run tests before committing it (1.0.0 will
> wait for this).
> 
> --strk;
> 
> On Mon, Apr 18, 2005 at 01:11:21AM -0700, Ron Mayer wrote:
> > While, I'm at it, I think this patch fixes both the problems.
> > 
> > In LWGEOM_gist_joinsel() inside #ifdef REALLY_DO_JOINSEL I think I see
> > two problems.
> > 
> > 1.  It can divide by zero, causing a
> >        "ERROR:  invalid join selectivity: inf"
> >     error.   I work around this by returning DEFAULT_GEOMETRY_SEL
> >     if the thing on the bottom in that division was zero.
> > 
> > 2.  I think it can do a "SearchSysCache()" and "get_attstatsslot()"
> >     for stats1_tuple without a corresponding "RelasesSysCache()" or
> >     "free_attstatsslot()" in the case when stats2_tuple has no
> >     statistics.   I work around this by putting in the
> >     calls to
> >      free_attstatsslot(0, NULL, 0, (float *)geomstats1, geomstats1_nvalues);
> >      ReleaseSysCache(stats1_tuple);
> >     in the places where I do an early return because of stats2_tuple.
> > 
> > 
> > Am I on the right track?   It seems to make the problem go
> > away for me, but I'm not that familiar with the code...
> > 
> >    Ron
> > 
> > 
> > 
> > rmap:~/apps/tmp/postgresql-8.0.2/contrib/postgis-cvs/lwgeom> diff -c 
> > lwgeom_estimate.c.~1~  lwgeom_estimate.c
> > *** lwgeom_estimate.c.~1~       Fri Mar 25 01:34:25 2005
> > --- lwgeom_estimate.c   Sun Apr 17 09:04:57 2005
> > ***************
> > *** 25,31 ****
> >   #include "liblwgeom.h"
> >   #include "lwgeom_pg.h"
> > 
> > ! //#define DEBUG_GEOMETRY_STATS 1
> > 
> > 
> >   #if USE_VERSION >= 80
> > --- 25,31 ----
> >   #include "liblwgeom.h"
> >   #include "lwgeom_pg.h"
> > 
> > ! #define DEBUG_GEOMETRY_STATS 1
> > 
> > 
> >   #if USE_VERSION >= 80
> > ***************
> > *** 867,872 ****
> > --- 867,874 ----
> >   #if DEBUG_GEOMETRY_STATS
> >                 elog(NOTICE, " No statistics, returning default estimate");
> >   #endif
> > +               free_attstatsslot(0, NULL, 0, (float *)geomstats1, 
> > geomstats1_nvalues);
> > +               ReleaseSysCache(stats1_tuple);
> >                 PG_RETURN_FLOAT8(DEFAULT_GEOMETRY_SEL);
> >         }
> > 
> > ***************
> > *** 878,883 ****
> > --- 880,887 ----
> >   #if DEBUG_GEOMETRY_STATS
> >                 elog(NOTICE, " STATISTIC_KIND_GEOMETRY stats not found - 
> >                 returning default geometry selectivity");
> >   #endif
> > +               free_attstatsslot(0, NULL, 0, (float *)geomstats1, 
> > geomstats1_nvalues);
> > +               ReleaseSysCache(stats1_tuple);
> >                 ReleaseSysCache(stats2_tuple);
> >                 PG_RETURN_FLOAT8(DEFAULT_GEOMETRY_SEL);
> >         }
> > ***************
> > *** 963,968 ****
> > --- 967,976 ----
> >         elog(NOTICE, "Estimated rows returned: %f", rows_returned);
> >   #endif
> > 
> > +         /* avoid a "ERROR:  invalid join selectivity: inf" */
> > +         if (total_tuples==0)
> > +             PG_RETURN_FLOAT8(DEFAULT_GEOMETRY_JOINSEL);
> > +
> >         PG_RETURN_FLOAT8(rows_returned / total_tuples);
> >   }
> > 
> > ====================================================================
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > Ron Mayer wrote:
> > >I have some queries that are giving me a
> > >  " WARNING:  cache reference leak: cache pg_statistic (31), tuple 0 has 
> > >count 1 "
> > >warning message in my log file when I run them.
> > >
> > >I'm using a couple days ago's (either thursday or friday's) CVS version.
> > >
> > >According to the wonderfully helpful people on postgresql's IRC,
> > > "the catcache issues that warning at the end of a transaction if
> > >  there are outstanding references to cached items ...
> > >  since it's pg_statistic that is being complained about, the most
> > >  likely culprit is a selectivity estimator function.. since those
> > >  use rows from pg_statistic to base estimates on"
> > >
> > >
> > >
> > >The problem goes away if I "ANALYZE" all my tables involved in the query.
> > >
> > >I can re-create the problem by doing TRUNCATE TABLE on the table
> > >wiith the MULTIPOLYGON and re-inserting the data, and running the
> > >query before analyzing that table.
> > >
> > >
> > >Below is
> > >   1.  An example query that shows the problem. It's joining a POINT
> > >       from the table point_features with a MULTIPOLYGON from the
> > >       table area_features.
> > >   2.  The output from select * from postgis_full_version();
> > >       and
> > >   3.  A previous message from this list that I think might be related;
> > >       since this is the same kind of query that was giving me
> > >       "ERROR:  invalid join selectivity: inf"
> > >       errors with RC4 when I had non-recently analyzed tables.
> > >
> > >   Ron
> > >
> > >===========================================================
> > >=== My problem query
> > >===========================================================
> > >fli=# select *
> > >        from userfeatures.point_features upf,
> > >        userfeatures.area_features uaf
> > >  where upf.the_geom && uaf.the_geom
> > >    and intersects(upf.the_geom,uaf.the_geom)
> > >    and uaf.featureid=18 limit 1
> > >fli-# fli-# fli-# fli-# fli-# ;
> > >WARNING:  cache reference leak: cache pg_statistic (31), tuple 0 has 
> > >count 1
> > > pointid | featureid | sessionid |   userid   | extid | label | iconid 
> > >|                  the_geom                  | entity_id | areaid | 
> > >featureid | session\
> > >id |   userid   | extid | label | color | bordercolor 
> > >|                                                                          
> > >\
> > >                                                                                                                                                              
> > >\
> > >                                                                                                                                                              
> > >\
> > >                                                                                                                                                              
> > >\
> > >  
> > >the_geom                                                                   
> > >\
> > >                                                                                                                                                              
> > >\
> > >                                                                                                                                                              
> > >\
> > >
> > >---------+-----------+-----------+------------+-------+-------+--------+--------------------------------------------+-----------+--------+-----------+--------\ 
> > >
> > >---+------------+-------+-------+-------+-------------+-------------------------------------------------------------------------------------------------------\ 
> > >
> > >--------------------------------------------------------------------------------------------------------------------------------------------------------------\ 
> > >
> > >--------------------------------------------------------------------------------------------------------------------------------------------------------------\ 
> > >
> > >--------------------------------------------------------------------------------------------------------------------------------------------------------------\ 
> > >
> > >--------------------------------------------------------------------------------------------------------------------------------------------------------------\ 
> > >
> > >--------------------------------------------------------------------------------------------------------------------------------------------------------------\ 
> > >
> > >--------------------------------------------------------------------------------------------------------------------------------------------------------------\ 
> > >
> > >------------------------------------------------------------------------------------------------------------------- 
> > >
> > >    3391 |        20 |         0 | 1635480986 | -1    |       |      1 | 
> > >010100000018E29102631355C0606B040DBF743E40 |     29646 |     57 |        
> > >18 |        \
> > > 0 | 1635480986 |       |       | green | green       | 
> > >0106000000010000000103000000010000002300000001ACAD4CE01255C02B7ED6AE7A723E407E0D4B0DFB1255C0E0E5C5FE7D\ 
> > >
> > >723E40BC2EF520151355C01929CF1E96723E403A4C21872D1355C049D49A21C2723E40B9F3C54F431355C0A4E22E5600733E40F87B94A4551355C0DC5692584E733E40185C35D1631355C054C75129\ 
> > >
> > >A9733E40E05C364A6D1355C00D80FD4A0D743E407F7566B2711355C08C147FE476743E40929B6ADE701355C07F9AF4E6E1743E40C34A68D66A1355C0771F9D354A753E40F85FB5D55F1355C070954D\ 
> > >
> > >CEAB753E40615C9048501355C06A33E0F002763E40537FF7C73C1355C067541A444C763E4035A3C713261355C0E9D49DF684763E4006B55C0B0D1355C005D8A1DAAA763E40E74FFCA4F21255C0833A\ 
> > >
> > >627BBC763E407C938711DA1255C0EBECE119C3763E40FF31EA50BF1255C03685F2C9BF763E40C110403DA51255C0FD41E9A9A7763E4043F313D78C1255C0CD961DA77B763E40C44B6F0E771255C072\ 
> > >
> > >8889723D763E4085C3A0B9641255C03A142670EF753E4065E3FF8C561255C0C2A3669F94753E409DE2FE134D1255C009EBBA7D30753E40FEC9CEAB481255C08A5639E4C6743E40EBA3CA7F491255C0\ 
> > >
> > >97D0C3E15B743E40BAF4CC874F1255C09F4B1B93F3733E4085DF7F885A1255C0A6D56AFA91733E401CE3A4156A1255C0AC37D8D73A733E402AC03D967D1255C0AF169E84F1723E40489C6D4A941255\ 
> > >
> > >C02D961AD2B8723E40778AD852AD1255C0119316EE92723E4096EF38B9C71255C09330564D81723E4001ACAD4CE01255C02B7ED6AE7A723E40 
> > >
> > >(1 row)
> > >
> > >fli=#
> > >fli=#
> > >
> > >============================================================================== 
> > >
> > >=========postgis_full_version()
> > >============================================================================== 
> > >
> > >fli=# select * from postgis_full_version();
> > >                                          postgis_full_version
> > >--------------------------------------------------------------------------------------------------------- 
> > >
> > > POSTGIS="1.0.0RC6" GEOS="2.1.1" PROJ="Rel. 4.4.9, 29 Oct 2004" 
> > >USE_STATS DBPROC="0.3.0" RELPROC="0.3.0"
> > >(1 row)
> > >
> > >fli=#
> > >
> > >
> > >
> > >============================================================================== 
> > >
> > >==== possibly related earlier conversation, that I don't remember
> > >==== the conclusion of.
> > >============================================================================== 
> > >
> > >
> > >(Re: [postgis-users] ERROR:  invalid join selectivity: inf)
> > >
> > >strk at refractions.net wrote:
> > >
> > >>Mark, can you look at this ?
> > >>
> > >>As a short-term workaround, in lwgeom_estimate.c line 100:
> > >>
> > >>#define REALLY_DO_JOINSEL 0
> > >>
> > >>--strk;
> > >>
> > >>On Fri, Mar 04, 2005 at 03:01:32AM -0800, Ron Mayer wrote:
> > >>
> > >>>Ok.. I think I can reproduce this:
> > >>>
> > >>>
> > >>>>  fl# select * from rtgr.landpoly,userfeatures.areas
> > >>>>      where rtgr.landpoly.the_geom && userfeatures.areas.the_geom;
> > >>>>  ERROR:  invalid join selectivity: inf
> > >>>
> > >>>
> > >>>pretty easily.
> > >>>
> > >>>Every time I "truncate table userfeatures.areas", and insert
> > >>>some areas; I'll get that error message until I "analyze areas".
> > >>>
> > >>>Seems like a bug when the stastics generated by analyze thinks
> > >>>that a table is empty while in reality it isn't.
> > >>>_______________________________________________
> > >>>postgis-users mailing list
> > >>>postgis-users at postgis.refractions.net
> > >>>http://postgis.refractions.net/mailman/listinfo/postgis-users
> > >
> > >
> > >_______________________________________________
> > >postgis-users mailing list
> > >postgis-users at postgis.refractions.net
> > >http://postgis.refractions.net/mailman/listinfo/postgis-users
> > 
> > _______________________________________________
> > postgis-users mailing list
> > postgis-users at postgis.refractions.net
> > http://postgis.refractions.net/mailman/listinfo/postgis-users
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-devel



More information about the postgis-devel mailing list