[postgis-devel] Issues about the GSoC project

Imre Samu pella.samu at gmail.com
Tue Jul 27 01:42:10 PDT 2021


Hi Han,

> After checking the paper of GiST and implementation in Postgres.
> ...  And there are some results:
>  If you have any questions

Is this the latest version of Postgres?   ( master | PG14beta2 | PG13.3 )

comment:  The PostgreSQL14 - has some GIST improvements:
https://www.postgresql.org/docs/release/14.0/

Regards,
   Imre

Han Wang <hanwgeek at gmail.com> ezt írta (időpont: 2021. júl. 26., H, 15:21):

> Hi Giuseppe and Hi Regina,
>
> After checking the paper of GiST and implementation in Postgres. I think
> the query performance should be considered besides the building process. In
> the larger data test scenario, the building time of different indexes are
> similar because Postgres just hashes the tuples and sorts them and packs
> them into pages, building the tree index from bottom to up. With a bad hash
> order definition, the building process cannot detect the poor index query
> performance. So it is necessary to test the index query performance. I have
> tested the query performance with the `EXPLAIN` operator, using the sql
> scripts like other indexes in the `/regress`. But I am not familiar with
> PL/pgSQL, so I handle the log with some python scripts.
>
> In this test, I focus on the buffer hits and execution time of different
> tasks of different index types including `No Index`, `Simple GiST index`,
> `X hash function`, `morton hash function` and `hilbert hash function`.
> And there are some results:
> Shared buffer hits:
>  IndexCreate Time(ms)<<&<&&&>>>~=~@&<|<<||>>|&>
> 0 No Index 0 18 18 18 18 18 18 18 18 18 18 18 18
> 1 GiST Index 25.249 40237 46085 3009 40297 46025 3009 3009 3009 41994
> 45295 41934 45355
> 2 X PreSort Index 8.829 443620 441235 3009 444501 440354 3009 3009 3009
> 441568 442503 440687 443384
> 3 Morton PreSort Index 16.885 447779 447446 4079 448669 446556 4079 4079
> 4079 445362 449428 444472 450318
> 4 Hilbert PreSort Index 16.824 446714 444058 3558 447600 443172 3558 3558
> 3558 446072 445394 445186 446280
> Execution time:
>
> IndexCreate Time(ms)<<&<&&&>>>~=~@&<|<<||>>|&>
> 0 No Index 0 567.251 565.720 481.618 561.877 563.452 480.128 478.275
> 478.518 567.144 572.191 563.255 556.281
> 1 GiST Index 25.249 289.255 293.143 28.838 289.002 291.296 28.336 28.597
> 26.947 295.394 297.556 293.988 299.760
> 2 X PreSort Index 8.829 440.861 445.630 37.960 439.564 440.535 37.594
> 37.979 37.741 386.662 384.635 385.166 388.832
> 3 Morton PreSort Index 16.885 421.999 413.427 77.002 422.939 412.130
> 77.415 75.102 76.056 416.205 446.599 410.614 434.613
> 4 Hilbert PreSort Index 16.824 417.539 415.962 56.583 421.226 414.553
> 56.320 55.600 55.338 416.639 421.243 418.550 417.094
> The number of shared buffer hits are far bigger than the original one. But
> what confuses me is that the execution times are worse. I am trying to
> figure out why this happened.
> What's more, I am not very clear about the relationship between query
> performance and the number of shared buffer hits.
>
> If you have any questions or suggestions, please let me know.
>
> Best regards,
> Han
>
> On Thu, Jul 8, 2021 at 1:32 AM Giuseppe Broccolo <g.broccolo.7 at gmail.com>
> wrote:
>
>> Hi Han,
>>
>> Il giorno mer 7 lug 2021 alle ore 18:05 Han Wang <hanwgeek at gmail.com> ha
>> scritto:
>>
>>> Hi Regina and hi Giuseppe,
>>>
>>> Thanks for your reply!
>>>
>>> I am now checking the original paper and postgres's implement of gist.
>>> And now I think the query performance test after building gist index is
>>> necessary. Because as Darafei mentioned, the fast sorting building method
>>> may just pack the sorted tuples into pages regardless of which hash
>>> function it use. The correctness of index may be checked in the runtime
>>> query. At present, I am working on confirm the io access of hash functions.
>>>
>>> Feel free to give suggestions or questions.
>>>
>>
>> The correctness of the index should be covered by some of the regression
>> tests in action for the GiST support in PostGIS - for instance, kNN
>> searches etc. - for the moment I am really curious about the I/O access of
>> the hash functions. This is something maybe it has not been checked even in
>> PostgreSQL, and maybe it would be good to share the results with them as
>> well.
>>
>> Keep us updated, and thank you for your help!
>>
>> Giuseppe.
>>
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-devel
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-devel/attachments/20210727/ff6b2e8a/attachment-0001.html>


More information about the postgis-devel mailing list