[postgis-devel] Bug in default B-Tree operator class for geography

Paul Ramsey pramsey at cleverelephant.ca
Mon Sep 11 11:28:01 PDT 2017


Would you believe that ordering by z-order is actually very expensive?
Generating a good interleaved key for an arbitrary geometry turns out to be
quite expensive

- Is it not a point? Generate a point representation. So you're into
centroid code.
- Now you have a point, but it's a collection of double coordinates. To
interleave you need integers.
- OK, just multiply out the doubles so you can capture all the necessary
precision in the integer.
- Oh wait, you don't know what a good scaling factor is. For a geodetic
point, it'll be 10,000 or 100,000. For a projected point it might be 10 or
100. How do you know what factor to you? You're doing an SRID lookup
probably.
- OK, finally, you've calculated a centroid, and guessed a good scale
factor from the SRID, now you can calculate the interleaved key, and you've
got a uint64_t ready to go.

If you're using the zorder as your sort key, you'll be doing pair-wise
tests over and over and over again, so you'll be slamming through this
calculation above over and over again. The key really needs to be stored on
the serialization, like the bounding box, but now we are wasting storage
for things like points...

All the above is moot if you have a way to generate a zorder key from an
arbitrary geometry that works for geometries at all scales.

(You can skip the expensive centroid code by using a box centroid, that's
one. That still leaves figuring out a way to go from double->integer
without either stripping useful precision (always multiply by 100) or
flattening very disjoint doubles into the same space (just always use the
double mantissa). Calculating an interleaved key from two integers is also
surprisingly expensive.

P.

On Mon, Sep 11, 2017 at 10:19 AM, Paul Ramsey <pramsey at cleverelephant.ca>
wrote:

> You're cruel, cruel, cruel. (And you have a perverse mind to even
> *conceive* of the sorting-different-kinds-of-empty problem)
> Of course, yes, z-order is v. much something I want to do, but in this
> run, trying to stay ahead of Regina's release-monster, just avoiding bad
> index semantics for pgsql is the goal. The non-deterministic nature of
> different kinds of empty is one I hadn't considered, but I think has to be
> handled, yay! I'll do that now.
> P
>
> On Mon, Sep 11, 2017 at 7:37 AM, Darafei "Komяpa" Praliaskouski <
> me at komzpa.net> wrote:
>
>> What happens on LINESTRING EMPTY < POINT EMPTY?
>>
>> Also, ordering in geohash / z-curve would make many spots where it's used
>> a lot more efficient :)
>>
>> пн, 11 сент. 2017 г. в 17:31, Paul Ramsey <pramsey at cleverelephant.ca>:
>>
>>> I took a stab at it in https://trac.osgeo.org/postgis/ticket/3841
>>>
>>> Any spelunk into the b-tree side re-raises questions about "what do we
>>> mean by equal" and "how should we sort", making we want to tear off the
>>> bandages and try again.
>>>
>>> On Sun, Sep 10, 2017 at 5:05 PM, Peter Geoghegan <pg at bowt.ie> wrote:
>>>
>>>> Hi Paul,
>>>>
>>>> On Wed, Aug 24, 2016 at 8:25 AM, Paul Ramsey <pramsey at cleverelephant.ca>
>>>> wrote:
>>>> > Thanks for this Peter.
>>>> > Fortunately b-tree opclasses on geom/geog are but rarely used and
>>>> even more
>>>> > rarely build into actual indexes, since they don't provide any useful
>>>> > spatial searching capability.
>>>>
>>>> When do you expect to get around to fixing this? With amcheck in
>>>> Postgres 10, I would expect more people to notice this.
>>>>
>>>> Thanks
>>>> --
>>>> Peter Geoghegan
>>>>
>>>
>>> _______________________________________________
>>> postgis-devel mailing list
>>> postgis-devel at lists.osgeo.org
>>> https://lists.osgeo.org/mailman/listinfo/postgis-devel
>>
>>
>> _______________________________________________
>> postgis-devel mailing list
>> postgis-devel at lists.osgeo.org
>> https://lists.osgeo.org/mailman/listinfo/postgis-devel
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-devel/attachments/20170911/ccbe04e1/attachment.html>


More information about the postgis-devel mailing list