API for optimized predicates (was Re: [postgis-devel] 1.3.3)

Paul Ramsey pramsey at cleverelephant.ca
Mon Mar 31 10:22:55 PDT 2008


A unique-on-insert ID would be another approach. It would, however,
involve a disk-format change, so we're talking about pretty big
hammers here, regardless of whether we did a hash or a uuid.
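
To make the difference concrete, here is a rough sketch of the two identity checks. The geom_t struct and its id and hash fields are hypothetical, invented purely for illustration; they are not the PostGIS disk format. It also assumes an ID is never reused and that an update assigns a fresh one. Matching IDs confirm identity outright, while a matching hash can only fail to falsify it, so the hash path still needs the memcmp to confirm.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical serialized geometry; NOT the PostGIS on-disk layout. */
typedef struct {
    uint64_t id;                /* assumed unique-on-insert ID */
    uint32_t hash;              /* assumed optional hash of the bytes */
    size_t len;                 /* length of data in bytes */
    const unsigned char *data;  /* serialized coordinates */
} geom_t;

/* Unique ID: equal IDs confirm identity in constant time
 * (assuming IDs are never reused). */
static int same_by_id(const geom_t *a, const geom_t *b)
{
    return a->id == b->id;
}

/* Hash: a mismatch falsifies identity cheaply, but a match may be a
 * collision, so it has to be confirmed with a full memcmp. */
static int same_by_hash(const geom_t *a, const geom_t *b)
{
    if (a->hash != b->hash || a->len != b->len)
        return 0;
    return memcmp(a->data, b->data, a->len) == 0;
}
```

Either way the extra field has to be stored with the geometry, which is where the disk-format change comes from.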

Ben, maybe just stick some small timing statements into your current
code... start time, end time, and then do a noop memcmp with start/end
times as well. That way we can compare the memcmp times to the total
times.
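
A minimal sketch of that kind of instrumentation, using only standard C (clock() from time.h). The helper names time_memcmp and memcmp_share are made up for illustration and are not part of PostGIS; the total time would be measured the same way around the existing predicate call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: CPU seconds spent in `reps` memcmp calls over two
 * buffers of `len` bytes each. */
static double time_memcmp(const void *a, const void *b, size_t len, int reps)
{
    /* Storing the result in a volatile discourages the compiler from
     * dropping the calls; an optimizing build may still hoist them, so
     * check the numbers at low optimization levels too. */
    volatile int sink = 0;
    clock_t start, end;
    int i;

    assert(a != NULL && b != NULL && reps > 0);

    start = clock();
    for (i = 0; i < reps; i++)
        sink = memcmp(a, b, len);
    end = clock();

    (void) sink;
    return (double) (end - start) / CLOCKS_PER_SEC;
}

/* Fraction of the total run time that went to memcmp (0.0 if the total
 * is not positive). */
static double memcmp_share(double memcmp_secs, double total_secs)
{
    return total_secs > 0.0 ? memcmp_secs / total_secs : 0.0;
}
```

Calling time_memcmp on two copies of the same serialized geometry gives the worst case, since memcmp has to scan the full length before it can report equality.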

P.

On Mon, Mar 31, 2008 at 10:17 AM, Martin Davis <mbdavis at refractions.net> wrote:
> (renaming this thread, since the current one is way overloaded)
>
>  I agree with Paul and Mark - there should be a simple function signature
>  for the fast preds.  The more complex one can be provided as well, but
>  it will need to be VERY well documented, since it's a tricky thing to grok.
>
>  re spatial hash - would you really trust a hash to confirm identity?  I
>  don't think I would...
>
>  Another alternative would be to assign a hidden unique ID to each
>  geom entered into the DB.  This could be a honking big integer, or maybe
>  a UUID.
>
>  Paul Ramsey wrote:
>  > The problem is that the memcmp hit gets worse in exactly the cases
>  > where we expect better and better performance from the prepared
>  > algorithm...  still, it would be nice to know what that hit is...
>  > compared to the original, unprepared time, it will be small, but
>  > compared to the prepared-with-id-API implementation... hard to say.
>  >
>  > Something to resolve before 1.4... It's unfortunate that all the
>  > *fast* tests can only falsify identity, not confirm it.  I was talking
>  > to a fellow who has done a spatial db implementation on a proprietary
>  > system, and he was pleased with the idea of a "geographic hash" that
>  > he can calculate for each shape and use to test identity.  If we were
>  > to do something like that, it would have to be optional, like the bbox
>  > calculation is currently.
>  >
>  > P.
>  >
>  > On Mon, Mar 31, 2008 at 2:51 AM, Mark Cave-Ayland
>  > <mark.cave-ayland at siriusit.co.uk> wrote:
>  >
>  >> On Friday 28 March 2008 23:53:53 Ben Jubb wrote:
>  >>  > Howdy,
>  >>  > In my testing, I did see a performance hit when using the memcmp test,
>  >>  > although it was noticeable only in the largest of my test geometries
>  >>  > (5000 vertices or so).
>  >>  > The three parameter form seemed like the best way to go because the
>  >>  > whole point of the prepared version of the functions was to get the best
>  >>  > possible performance.  Performance matters most with large geoms,
>  >>  > and that is exactly where the cost of doing the memcmp is the highest.
>  >>  > Using a third argument seemed the simplest way to get the best possible
>  >>  > performance from the predicates, with a minimal increase in the
>  >>  > complexity of the interface.
>  >>  > I agree it would be nice to have a single form for those predicates that
>  >>  > automatically determines the most efficient manner to do the tests, but
>  >>  > there didn't seem to be any efficient way to accomplish that.
>  >>  >
>  >>  > b
>  >>
>  >>
>  >>  Hi Ben,
>  >>
>  >>  Well I think it really comes down to what exactly is the performance hit and
>  >>  how did you measure it? Which platform/OS/C library did you use? Obviously
>  >>  there will be *some* overhead having the extra memcmp() in there but does it
>  >>  matter? For example, if the overhead is just 1-2s on a 30s query then that
>  >>  doesn't really matter. Then again, if the overhead is 1s on a 3s query then
>  >>  that is significant.
>  >>
>  >>  Since this is a new feature then I'd be inclined to say that for a first cut
>  >>  we should keep the standard API, and depending on the reports we get back,
>  >>  look at improving it later. That seems a lot more preferable to having a
>  >>  fairly nasty API hack that will catch a lot of people out :(
>  >>
>  >>
>  >>
>  >>  ATB,
>  >>
>  >>  Mark.
>  >>
>  >>  --
>  >>  Mark Cave-Ayland
>  >>  Sirius Corporation - The Open Source Experts
>  >>  http://www.siriusit.co.uk
>  >>  T: +44 870 608 0063
>  >>  _______________________________________________
>  >>  postgis-devel mailing list
>  >>  postgis-devel at postgis.refractions.net
>  >>  http://postgis.refractions.net/mailman/listinfo/postgis-devel
>  >>
>  >>
>  >
>  >
>
>  --
>  Martin Davis
>  Senior Technical Architect
>  Refractions Research, Inc.
>  (250) 383-3022
>
>


