[postgis-devel] Bug in default B-Tree operator class for geography

Darafei "Komяpa" Praliaskouski me at komzpa.net
Mon Sep 11 13:09:23 PDT 2017


Hey,

 - get all the coordinates in the bounding box. may be better to have
float, not double bbox - two 32bit floats can be packed into single 64-bit
int;
 - cast them into integer from ieee754 representation - thus you get
(sign,exponent,fraction) tuple packed into integer.
for the syntax, have a look at pack_float in gist penalty.
 - treat 2D bbox as 4-dimensional point.
 - bit-interleave all the (x1, y1, x2, y2) dimensions, say by
morton_encode(morton_encode(x1, y1), morton_encode(x2,y2));
for reference,
https://github.com/gojuno/hexgrid-pgsql/blob/master/sql/morton.sql
 - if it were 2df box and you managed to pack it into 128bit of data, you
lost nothing. if you didn't, lose bits starting from the end.

PS: also, what will GEOMETRYCOLLECTION(LINESTRING EMPTY, POINT EMPTY) <
GEOMETRYCOLLECTION(POINT EMPTY, LINESTRING EMPTY) be?

пн, 11 сент. 2017 г. в 21:28, Paul Ramsey <pramsey at cleverelephant.ca>:

> Would you believe that ordering by z-order is actually very expensive?
> Generating a good interleaved key for an arbitrary geometry turns out to be
> quite expensive
>
> - Is it not a point? Generate a point representation. So you're into
> centroid code.
> - Now you have a point, but it's a collection of double coordinates. To
> interleave you need integers.
> - OK, just multiply out the doubles so you can capture all the necessary
> precision in the integer.
> - Oh wait, you don't know what a good scaling factor is. For a geodetic
> point, it'll be 10,000 or 100,000. For a projected point it might be 10 or
> 100. How do you know what factor to you? You're doing an SRID lookup
> probably.
> - OK, finally, you've calculated a centroid, and guessed a good scale
> factor from the SRID, now you can calculate the interleaved key, and you've
> got a uint64_t ready to go.
>
> If you're using the zorder as your sort key, you'll be doing pair-wise
> tests over and over and over again, so you'll be slamming through this
> calculation above over and over again. The key really needs to be stored on
> the serialization, like the bounding box, but now we are wasting storage
> for things like points...
>
> All the above is moot if you have a way to generate a zorder key from an
> arbitrary geometry that works for geometries at all scales.
>
> (You can skip the expensive centroid code by using a box centroid, that's
> one. That still leaves figuring out a way to go from double->integer
> without either stripping useful precision (always multiply by 100) or
> flattening very disjoint doubles into the same space (just always use the
> double mantissa). Calculating an interleaved key from two integers is also
> surprisingly expensive.
>
> P.
>
> On Mon, Sep 11, 2017 at 10:19 AM, Paul Ramsey <pramsey at cleverelephant.ca>
> wrote:
>
>> You're cruel, cruel, cruel. (And you have a perverse mind to even
>> *conceive* of the sorting-different-kinds-of-empty problem)
>> Of course, yes, z-order is v. much something I want to do, but in this
>> run, trying to stay ahead of Regina's release-monster, just avoiding bad
>> index semantics for pgsql is the goal. The non-deterministic nature of
>> different kinds of empty is one I hadn't considered, but I think has to be
>> handled, yay! I'll do that now.
>> P
>>
>> On Mon, Sep 11, 2017 at 7:37 AM, Darafei "Komяpa" Praliaskouski <
>> me at komzpa.net> wrote:
>>
>>> What happens on LINESTRING EMPTY < POINT EMPTY?
>>>
>>> Also, ordering in geohash / z-curve would make many spots where it's
>>> used a lot more efficient :)
>>>
>>> пн, 11 сент. 2017 г. в 17:31, Paul Ramsey <pramsey at cleverelephant.ca>:
>>>
>>>> I took a stab at it in https://trac.osgeo.org/postgis/ticket/3841
>>>>
>>>> Any spelunk into the b-tree side re-raises questions about "what do we
>>>> mean by equal" and "how should we sort", making we want to tear off the
>>>> bandages and try again.
>>>>
>>>> On Sun, Sep 10, 2017 at 5:05 PM, Peter Geoghegan <pg at bowt.ie> wrote:
>>>>
>>>>> Hi Paul,
>>>>>
>>>>> On Wed, Aug 24, 2016 at 8:25 AM, Paul Ramsey <
>>>>> pramsey at cleverelephant.ca> wrote:
>>>>> > Thanks for this Peter.
>>>>> > Fortunately b-tree opclasses on geom/geog are but rarely used and
>>>>> even more
>>>>> > rarely build into actual indexes, since they don't provide any useful
>>>>> > spatial searching capability.
>>>>>
>>>>> When do you expect to get around to fixing this? With amcheck in
>>>>> Postgres 10, I would expect more people to notice this.
>>>>>
>>>>> Thanks
>>>>> --
>>>>> Peter Geoghegan
>>>>>
>>>>
>>>> _______________________________________________
>>>> postgis-devel mailing list
>>>> postgis-devel at lists.osgeo.org
>>>> https://lists.osgeo.org/mailman/listinfo/postgis-devel
>>>
>>>
>>> _______________________________________________
>>> postgis-devel mailing list
>>> postgis-devel at lists.osgeo.org
>>> https://lists.osgeo.org/mailman/listinfo/postgis-devel
>>>
>>
>>
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-devel
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-devel/attachments/20170911/196ad282/attachment.html>


More information about the postgis-devel mailing list