[postgis-devel] Postgis topology creation - O(n-squared)? - creates problems with large datasets.

Graeme B. Bell grb at skogoglandskap.no
Mon Jan 20 01:23:08 PST 2014

Good morning Sandro, 

> I've found the culprit!
> It's a query in topology._ST_AddFaceSplit that fails to make use of the index
> by checking for overlap between a literal and a function over the indexed
> geometry, in this form:

Fantastic work! Congratulations on finding it and fixing it. :-)
This will let us try a small real-world project here.
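
For anyone following the thread, here is a rough sketch of the general effect described above: a predicate that compares a literal with a function over an indexed column usually cannot use a plain index on that column. This is only an analogy, using SQLite's B-tree index from Python rather than a PostGIS GiST index, and it is not the actual query from topology._ST_AddFaceSplit. The table name, index name and use of abs() are made up for illustration.

```python
import sqlite3

# Analogy only: SQLite B-tree index, not a PostGIS spatial index.
# The table "edge" and index "edge_x_idx" are hypothetical names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edge (id INTEGER PRIMARY KEY, x INTEGER, label TEXT)")
conn.execute("CREATE INDEX edge_x_idx ON edge (x)")
conn.executemany(
    "INSERT INTO edge (x, label) VALUES (?, ?)",
    [(i, f"edge {i}") for i in range(1000)],
)


def plan(sql):
    """Return the planner's description of how it would run a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable plan detail.
    return " | ".join(row[3] for row in rows)


# Literal compared with the indexed column itself: the index is usable.
plan_indexed = plan("SELECT * FROM edge WHERE x = 5")

# Literal compared with a function over the indexed column:
# the planner falls back to scanning every row.
plan_scan = plan("SELECT * FROM edge WHERE abs(x) = 5")

print(plan_indexed)
print(plan_scan)
```

The first plan should mention edge_x_idx and the second should be a plain table scan. When a scan like that runs once per inserted row, the total work grows roughly with the square of the row count, which is the shape the subject line asks about.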

>> BTW, I've checked with my boss and the 'topologish' files are free for use however you like, because they are just random data. Let me know if you want a larger sample later (e.g. 20M polygons) whenever the code is able to handle that many. 
> Thanks. I wouldn't know where to put them, but they are a really good 
> test dataset. Will they be stable at the url you gave ?

Let's say the URLs in the previous post will be stable for at least the next 6 months.

Also, these new URLs (.../files/...) will keep working for the foreseeable future.    


I'm really happy that the files were useful. 

Incidentally, I made them with an open source package called 'rbuild' (confession: I'm the author). It's at https://github.com/gbb/rbuild if anyone is curious.
I'll wrap up the rbuild topology script neatly, put it on GitHub, and post to the list when it's ready. 
That way, people can generate topologies of any complexity they like, in any zone they like. 

> For now I'm attaching an updated plot showing how using the index
> changed the curve. It also shows how TopoGeo_addPolygon performs
> better than ST_CreateTopoGeo, even if still running in a single
> transaction.  Row numbers (the lines with + are after the fix)
> Rows  | ST_CreateTopoGeo  | TopoGeo_addPolygon |

The asymptote on that curve is looking a lot more healthy :-)

