[postgis-devel] Prepared Geometry API

Martin Davis mbdavis at refractions.net
Mon Oct 6 11:03:02 PDT 2008


Some comments:

- The whole point of the geometry cache key is that it checks EXACT 
identity.  Would you really trust a hash/CRC alone to tell you that two 
(potentially very large and differing only very slightly) geometries are 
the same?  I'm not sure I would.  This is a database, after all - I 
think there's an expectation that it will return precise, correct answers. 

- re Mark's comment that "memcmp exits as soon as it detects a 
difference".  In other words, cache misses can be cheap.  True enough, 
but the whole point of using PreparedGeometry is that there is an 
expectation that the majority of the tests made against the cache will 
result in *hits*.  Suppose you have a situation where you are comparing 
M geoms against N geoms.  You'll be accessing the cache MN times, but 
you will only get a cache miss M times.  The hit rate is therefore 
1 - 1/N, so for large N essentially every cache check is a hit. 
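
To make the arithmetic concrete, here is a rough sketch (not PostGIS code) 
that simulates a one-entry cache over an M x N comparison.  The function 
name and the byte-string keys are made up for illustration; the only 
assumption is that the cached geometry is the outer (M-side) one.

```python
def count_cache_accesses(m, n):
    """Simulate a one-entry prepared-geometry cache over an M x N comparison.

    Hypothetical illustration only: each outer geometry is represented by a
    stand-in byte string, and the cache is checked once per inner geometry.
    """
    cached_key = None
    hits = 0
    misses = 0
    for i in range(m):
        key = b"geom-%d" % i  # stand-in for the serialized outer geometry
        for _ in range(n):
            if cached_key == key:  # exact byte-for-byte comparison
                hits += 1
            else:
                misses += 1
                cached_key = key  # "prepare" the geometry and cache it
    return hits, misses


hits, misses = count_cache_accesses(100, 1000)
hit_rate = hits / (hits + misses)  # equals 1 - 1/N
```

With M = 100 and N = 1000 this makes 100,000 cache checks, of which only 
100 are misses.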

Both of the above are really different aspects of the same situation.  
Methods such as CRC can determine quickly whether two objects are 
different.  Sometimes that's exactly what you want, because you don't mind 
paying a price when you need to check equality.  But PrepGeom is exactly 
the opposite - checking equality is the common case.
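
A small sketch of why a CRC pre-check buys little here.  This is not the 
PostGIS or GEOS implementation; `is_cached` and the `stats` counters are 
hypothetical names, and Python's `zlib.crc32` stands in for whatever hash 
might be used.  A CRC mismatch proves two buffers differ, but a CRC match 
still needs a full byte comparison to confirm identity, so in a 
hit-dominated workload the full comparison runs almost every time anyway.

```python
import zlib


def is_cached(candidate, cached, cached_crc, stats):
    """Return True only if candidate is byte-for-byte identical to cached.

    Hypothetical helper: the CRC is used purely as a fast rejection test.
    """
    if zlib.crc32(candidate) != cached_crc:
        stats["crc_rejects"] += 1  # definitely different; cheap exit
        return False
    stats["full_compares"] += 1    # CRC matched; must still confirm exactly
    return candidate == cached


cached = bytes(1000)                  # stand-in for a serialized geometry
cached_crc = zlib.crc32(cached)
stats = {"crc_rejects": 0, "full_compares": 0}

# 999 hits: each one pays for the CRC *and* the full comparison.
for _ in range(999):
    assert is_cached(bytes(cached), cached, cached_crc, stats)

# 1 miss: a single-bit change in the last byte, rejected by the CRC alone.
different = cached[:-1] + b"\x01"
assert not is_cached(different, cached, cached_crc, stats)
```

In this run the CRC shortcut is taken once, while the exact comparison is 
still performed 999 times.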

Mark Cave-Ayland wrote:
> Obe, Regina wrote:
>
>> Ah okay. Yes I'm in agreement with you on this one.  Introducing an
>> index key to use prepared geometry will be annoying and I need it mostly
>> in subselects. I suppose we can't figure out some way to dynamically
>> define an index by some hash algorithm, or are we doing that already? 
>
> It's not so much an index, just a unique identifier for each geometry 
> that can be used to determine whether it is already in the prepared 
> cache. At the moment, synthetic keys are used with an extended API so 
> as to provide a direct key into the cache. I'm wondering if we could 
> use something else such as a CRC32 (assuming the PostgreSQL hash 
> implementation handles collisions using memcmp() internally).
>
> *thinks*... maybe GEOS should generate a CRC32 hash key as part of the 
> creation of the prepared geometry? Assuming we could access this using 
> the GEOS CAPI, it would just be a case of handling the few 
> collisions using memcmp()...
>
>> I didn't understand most of what Paul said so take my affirmation and
>> comments with a grain of salt. 
>
> At the end of the day, if there is a compelling enough use case for new 
> APIs to implement this, then perhaps we should consider it. My main 
> concern at the moment is the lack of evidence for justifying them.
>
>
> ATB,
>
> Mark.
>

-- 
Martin Davis
Senior Technical Architect
Refractions Research, Inc.
(250) 383-3022



