[postgis-devel] [SoC] GSoC 2021 - Week 3 Report - Implement pre-sorting methods before GiST index building

Paul Ramsey pramsey at cleverelephant.ca
Mon Jun 28 07:48:33 PDT 2021



> On Jun 27, 2021, at 8:55 AM, Darafei Komяpa Praliaskouski <me at komzpa.net> wrote:
> 
> Hi,
> 
> 
> On Sun, Jun 27, 2021 at 5:55 PM Han Wang <hanwgeek at gmail.com> wrote:
> Hi all,
> 
> I am here to share my Week 3 report with you. You can also find it at [1].
> Coding phase:
> 
> 	• Refactor the code structure, moving the hash function to the inline module
> 	• Create a pull request and receive suggestions from the community and mentors [2]
> 	• Finish the Morton hash function
> I used `interleave` to implement a Morton (Z-order) hash. I also tried returning the `x` of a `BOX2DF` as the sort key for the geometry object.
> What is confusing is that the Hilbert hash, the Morton hash, and the plain `x` show little difference in the performance test. So I think we need to enlarge our test dataset and use some real-world data.
> 
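For readers unfamiliar with the term, a Morton (Z-order) key is built by interleaving the bits of the two coordinates. The following is a minimal Python sketch of that idea, not the C code from the pull request. It assumes each coordinate has already been quantized to an unsigned 16-bit integer, and the names `spread_bits` and `morton_key` are made up for illustration.

```python
def spread_bits(v: int) -> int:
    """Spread the low 16 bits of v so that they land on the even bit positions."""
    v &= 0xFFFF
    v = (v | (v << 8)) & 0x00FF00FF
    v = (v | (v << 4)) & 0x0F0F0F0F
    v = (v | (v << 2)) & 0x33333333
    v = (v | (v << 1)) & 0x55555555
    return v


def morton_key(x: int, y: int) -> int:
    """Interleave x (even bits) and y (odd bits) into one 32-bit sort key."""
    return spread_bits(x) | (spread_bits(y) << 1)


# Sorting grid cells by the key visits them in Z order.
cells = [(1, 1), (0, 1), (1, 0), (0, 0)]
z_ordered = sorted(cells, key=lambda c: morton_key(c[0], c[1]))
```

Sorting by `morton_key` keeps cells that share high-order bits in both axes next to each other, which is the locality property a pre-sort before index building relies on. Sorting by `x` alone ignores `y` entirely.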
> 
>  - Try something broken from the start, like hash(x). Does it become worse? If not, something is terribly wrong in the benchmark code.
> 
>  - To check the ordering of the index, you can create an index and then CLUSTER the table using that. Then ST_MakeLine(geom order by ctid) will create a line that follows the objects in the order they're in the index.

Cool, I hadn't thought of that trick :)

>  - I hope you're using at least a couple of million objects. If not, take any OpenStreetMap country (I usually use Belarus), load it into the db (osmium-tool and its osmium export -f pg is good) and index that.

Yes, you need enough data to spill out of RAM for some of the tests. You could also lower work_mem for some tests.

P.

>
> Plans for next week:
> 
> 	• Create larger random data or use real-world data for testing
> 	• Evaluate the stability and efficiency of hash functions
> 	• Search for a more efficient hash function
> 
> If you have any questions or suggestions, please let me know.
> 
> [1] https://trac.osgeo.org/postgis/wiki/ImplementSortingMethodsBeforeGistIndexBuilding
> [2] https://github.com/postgis/postgis/pull/619
> 
> Best regards,
> Han
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-devel
> 
> 
> -- 
> Darafei "Komяpa" Praliaskouski
> OSM BY Team - http://openstreetmap.by/


