[postgis-devel] Issues about the GSoC project

Han Wang hanwgeek at gmail.com
Tue Jul 27 09:10:57 PDT 2021


Hi Imre,

Thanks for your information!

Currently, I am using the branch directly from github source days ago.
Maybe it is a `14dev` version.
And I will check your link immediately.

Best regards,
Han

On Tue, Jul 27, 2021 at 4:43 PM Imre Samu <pella.samu at gmail.com> wrote:

> Hi Han,
>
> > After checking the paper of GiST and implementation in Postgres.
> > ...  And there are some results:
> >  If you have any questions
>
> Is this the latest version of Postgres?   ( master | PG14beta2 | PG13.3 )
>
> comment:  The PostgreSQL14 - has some GIST improvements:
> https://www.postgresql.org/docs/release/14.0/
>
> Regards,
>    Imre
>
> Han Wang <hanwgeek at gmail.com> ezt írta (időpont: 2021. júl. 26., H,
> 15:21):
>
>> Hi Giuseppe and Hi Regina,
>>
>> After checking the paper of GiST and implementation in Postgres. I think
>> the query performance should be considered besides the building process. In
>> the larger data test scenario, the building time of different indexes are
>> similar because Postgres just hashes the tuples and sorts them and packs
>> them into pages, building the tree index from bottom to up. With a bad hash
>> order definition, the building process cannot detect the poor index query
>> performance. So it is necessary to test the index query performance. I have
>> tested the query performance with the `EXPLAIN` operator, using the sql
>> scripts like other indexes in the `/regress`. But I am not familiar with
>> PL/pgSQL, so I handle the log with some python scripts.
>>
>> In this test, I focus on the buffer hits and execution time of different
>> tasks of different index types including `No Index`, `Simple GiST index`,
>> `X hash function`, `morton hash function` and `hilbert hash function`.
>> And there are some results:
>> Shared buffer hits:
>>  IndexCreate Time(ms)<<&<&&&>>>~=~@&<|<<||>>|&>
>> 0 No Index 0 18 18 18 18 18 18 18 18 18 18 18 18
>> 1 GiST Index 25.249 40237 46085 3009 40297 46025 3009 3009 3009 41994
>> 45295 41934 45355
>> 2 X PreSort Index 8.829 443620 441235 3009 444501 440354 3009 3009 3009
>> 441568 442503 440687 443384
>> 3 Morton PreSort Index 16.885 447779 447446 4079 448669 446556 4079 4079
>> 4079 445362 449428 444472 450318
>> 4 Hilbert PreSort Index 16.824 446714 444058 3558 447600 443172 3558 3558
>> 3558 446072 445394 445186 446280
>> Execution time:
>>
>> IndexCreate Time(ms)<<&<&&&>>>~=~@&<|<<||>>|&>
>> 0 No Index 0 567.251 565.720 481.618 561.877 563.452 480.128 478.275
>> 478.518 567.144 572.191 563.255 556.281
>> 1 GiST Index 25.249 289.255 293.143 28.838 289.002 291.296 28.336 28.597
>> 26.947 295.394 297.556 293.988 299.760
>> 2 X PreSort Index 8.829 440.861 445.630 37.960 439.564 440.535 37.594
>> 37.979 37.741 386.662 384.635 385.166 388.832
>> 3 Morton PreSort Index 16.885 421.999 413.427 77.002 422.939 412.130
>> 77.415 75.102 76.056 416.205 446.599 410.614 434.613
>> 4 Hilbert PreSort Index 16.824 417.539 415.962 56.583 421.226 414.553
>> 56.320 55.600 55.338 416.639 421.243 418.550 417.094
>> The number of shared buffer hits are far bigger than the original one.
>> But what confuses me is that the execution times are worse. I am trying to
>> figure out why this happened.
>> What's more, I am not very clear about the relationship between query
>> performance and the number of shared buffer hits.
>>
>> If you have any questions or suggestions, please let me know.
>>
>> Best regards,
>> Han
>>
>> On Thu, Jul 8, 2021 at 1:32 AM Giuseppe Broccolo <g.broccolo.7 at gmail.com>
>> wrote:
>>
>>> Hi Han,
>>>
>>> Il giorno mer 7 lug 2021 alle ore 18:05 Han Wang <hanwgeek at gmail.com>
>>> ha scritto:
>>>
>>>> Hi Regina and hi Giuseppe,
>>>>
>>>> Thanks for your reply!
>>>>
>>>> I am now checking the original paper and postgres's implement of gist.
>>>> And now I think the query performance test after building gist index is
>>>> necessary. Because as Darafei mentioned, the fast sorting building method
>>>> may just pack the sorted tuples into pages regardless of which hash
>>>> function it use. The correctness of index may be checked in the runtime
>>>> query. At present, I am working on confirm the io access of hash functions.
>>>>
>>>> Feel free to give suggestions or questions.
>>>>
>>>
>>> The correctness of the index should be covered by some of the regression
>>> tests in action for the GiST support in PostGIS - for instance, kNN
>>> searches etc. - for the moment I am really curious about the I/O access of
>>> the hash functions. This is something maybe it has not been checked even in
>>> PostgreSQL, and maybe it would be good to share the results with them as
>>> well.
>>>
>>> Keep us updated, and thank you for your help!
>>>
>>> Giuseppe.
>>>
>> _______________________________________________
>> postgis-devel mailing list
>> postgis-devel at lists.osgeo.org
>> https://lists.osgeo.org/mailman/listinfo/postgis-devel
>>
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-devel
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-devel/attachments/20210728/2138c4b9/attachment-0001.html>


More information about the postgis-devel mailing list