API for optimized predicates (was Re: [postgis-devel] 1.3.3)
Ben Jubb
benjubb at refractions.net
Tue Apr 1 17:12:31 PDT 2008
For the 3-param version, were you using an integer key, or NULL?
b
Paul Ramsey wrote:
> I gave this a try, but in the three-parameter case it caused the
> backend to crash and in the two-parameter case provided the same speed
> as the usual un-prepared approach...
>
> I was testing with st_contains(polycolumn, pointcolumn), with 80 polys
> and 7000 points.
>
> P
>
> On Mon, Mar 31, 2008 at 3:50 PM, Ben Jubb <benjubb at refractions.net> wrote:
>
>> Hiya,
>> I've attached a patch to lwgeom_geos_c.c, modifying its 1st arg caching
>> behaviour.
>> The third argument is used as before, as a surrogate key, and the caching
>> will use that as its key;
>> UNLESS the key is NULL.
>> If the key is NULL, the predicates use the memcmp technique to determine if
>> the cached prepared geometry is in sync with the first arg.
>> Note that the two caching approaches have essentially independent caches.
>> This patch is intended for testing purposes only.
>> enjoy
>> b
>>
>>
>>
>> Paul Ramsey wrote:
>> A unique-on-insert ID would be another approach. It would, however,
>> involve a disk-format change, so we're talking about pretty big
>> hammers here, regardless of whether we did a hash or a uuid.
>>
>> Ben, maybe just stick some small timing statements into your current
>> code... start time, end time, and then do a noop memcmp with start/end
>> times as well. That way we can compare the memcmp times to the total
>> times.
>>
>> P.
>>
>> On Mon, Mar 31, 2008 at 10:17 AM, Martin Davis <mbdavis at refractions.net> wrote:
>>
>>
>> (renaming this thread, since the current one is way overloaded)
>>
>> I agree with Paul and Mark - there should be a simple function signature
>> for the fast preds. The more complex one can be provided as well, but
>> it will need to be VERY well documented, since it's a tricky thing to grok.
>>
>> re spatial hash - would you really trust a hash to confirm identity? I
>> don't think I would...
>>
>> Would another alternative be to assign a hidden unique ID to each
>> geom entered into the DB? This could be a honking big integer, or maybe
>> a UUID.
>>
>> Paul Ramsey wrote:
>> > The problem is that the memcmp hit gets worse in exactly the cases
>> > where we expect better and better performance from the prepared
>> > algorithm... still, it would be nice to know what that hit is...
>> > compared to the original, unprepared time, it will be small, but
>> > compared to the prepared-with-id-API implementation... hard to say.
>> >
>> > Something to resolve before 1.4... It's unfortunate that all the
>> > *fast* tests can only falsify identity, not confirm it. I was talking
>> > to a fellow who has done a spatial db implementation on a proprietary
>> > system, and he was pleased with the idea of a "geographic hash" that
>> > he can calculate for each shape and use to test identity. If we were
>> > to do something like that, it would have to be optional, like the bbox
>> > calculation is currently.
>> >
>> > P.
>> >
>> > On Mon, Mar 31, 2008 at 2:51 AM, Mark Cave-Ayland
>> > <mark.cave-ayland at siriusit.co.uk> wrote:
>> >
>> >> On Friday 28 March 2008 23:53:53 Ben Jubb wrote:
>> >> > Howdy,
>> >> > In my testing, I did see a performance hit when using the memcmp test,
>> >> > although it was noticeable only in the largest of my test geometries
>> >> > (5000 vertices or so).
>> >> > The three-parameter form seemed like the best way to go because the
>> >> > whole point of the prepared version of the functions was to get the
>> >> > best possible performance. Performance matters most with large geoms,
>> >> > which is exactly where the cost of doing the memcmp is highest.
>> >> > Using a third argument seemed the simplest way to get the best
>> >> > possible performance from the predicates, with a minimal increase in
>> >> > the complexity of the interface.
>> >> > I agree it would be nice to have a single form for those predicates
>> >> > that automatically determines the most efficient manner to do the
>> >> > tests, but there didn't seem to be any efficient way to accomplish
>> >> > that.
>> >> >
>> >> > b
>> >>
>> >>
>> >> Hi Ben,
>> >>
>> >> Well, I think it really comes down to what exactly is the performance
>> >> hit, and how did you measure it? Which platform/OS/C library did you
>> >> use? Obviously there will be *some* overhead having the extra memcmp()
>> >> in there, but does it matter? For example, if the overhead is just 1-2s
>> >> on a 30s query then that doesn't really matter. Then again, if the
>> >> overhead is 1s on a 3s query then that is significant.
>> >>
>> >> Since this is a new feature, I'd be inclined to say that for a first cut
>> >> we should keep the standard API and, depending on the reports we get
>> >> back, look at improving it later. That seems far preferable to having a
>> >> fairly nasty API hack that will catch a lot of people out :(
>> >>
>> >>
>> >>
>> >> ATB,
>> >>
>> >> Mark.
>> >>
>> >> --
>> >> Mark Cave-Ayland
>> >> Sirius Corporation - The Open Source Experts
>> >> http://www.siriusit.co.uk
>> >> T: +44 870 608 0063
>> >> _______________________________________________
>> >> postgis-devel mailing list
>> >> postgis-devel at postgis.refractions.net
>> >> http://postgis.refractions.net/mailman/listinfo/postgis-devel
>> >>
>> >>
>>
>> --
>> Martin Davis
>> Senior Technical Architect
>> Refractions Research, Inc.
>> (250) 383-3022
>>
>
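For readers following the thread, here is a rough sketch in C of the two caching strategies discussed above: a cache trusted on a caller-supplied surrogate key, and an independent cache that falls back to a memcmp of the serialized geometry when the key is NULL. This is not the code from the patch. The names Prepared, KeyCache, BytesCache, prepare, get_prepared and prepare_calls are all hypothetical, and the "prepared geometry" is a placeholder struct rather than a real GEOS object.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an expensive "prepared" form of a geometry. */
typedef struct {
    size_t nbytes;              /* size of the serialized geometry it was built from */
} Prepared;

/* Cache used when the caller supplies a surrogate key. */
typedef struct {
    int      valid;
    long     key;
    Prepared prep;
} KeyCache;

/* Independent cache used when the key is NULL: it keeps a private copy of
 * the serialized bytes so later arguments can be compared with memcmp(). */
typedef struct {
    int            valid;
    unsigned char *bytes;
    size_t         nbytes;
    Prepared       prep;
} BytesCache;

static KeyCache   key_cache;
static BytesCache bytes_cache;
static int        prepare_calls;    /* counts rebuilds, for illustration only */

/* Placeholder for the costly preparation step. */
static Prepared prepare(const unsigned char *geom, size_t nbytes)
{
    Prepared p = { nbytes };
    (void) geom;
    prepare_calls++;
    return p;
}

/* Returns the prepared form of geom, rebuilding only when the cache is stale.
 * key == NULL selects the memcmp path; otherwise *key is trusted as identity
 * and the bytes are never inspected. Returns NULL on allocation failure. */
static const Prepared *get_prepared(const unsigned char *geom, size_t nbytes,
                                    const long *key)
{
    if (key != NULL) {
        if (!key_cache.valid || key_cache.key != *key) {
            key_cache.prep  = prepare(geom, nbytes);
            key_cache.key   = *key;
            key_cache.valid = 1;
        }
        return &key_cache.prep;
    }

    /* Cheap size check first; memcmp cost grows with the geometry size. */
    if (!bytes_cache.valid || bytes_cache.nbytes != nbytes ||
        memcmp(bytes_cache.bytes, geom, nbytes) != 0) {
        unsigned char *copy = malloc(nbytes ? nbytes : 1);
        if (copy == NULL)
            return NULL;
        memcpy(copy, geom, nbytes);
        free(bytes_cache.bytes);
        bytes_cache.bytes  = copy;
        bytes_cache.nbytes = nbytes;
        bytes_cache.prep   = prepare(geom, nbytes);
        bytes_cache.valid  = 1;
    }
    return &bytes_cache.prep;
}
```

The sketch makes the trade-off in the thread visible: the key path does constant work per call but is only as correct as the key the caller passes, while the memcmp path needs no cooperation from the caller but pays a comparison proportional to the size of the first argument on every call.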