[postgis-devel] Postgis topology creation - O(n-squared)? - creates problems with large datasets.
Sandro Santilli
strk at keybit.net
Tue Jan 14 10:32:55 PST 2014
On Tue, Jan 14, 2014 at 02:52:18PM +0000, Graeme B. Bell wrote:
> >
> > I did up to 160000, after which I stopped because it took ~10 hours,
> > and I've been running twice: once with ST_CreateTopoGeo and once
> > with TopoGeo_addPolygon.
>
>
> Sandro,
>
> Thank you for your excellent work in reproducing the situation and beginning the investigation. I am really curious to see what else you discover.
>
> BTW, I've checked with my boss and the 'topologish' files are free for use however you like, because they are just random data. Let me know if you want a larger sample later (e.g. 20M polygons) whenever the code is able to handle that many.
Thanks. I wouldn't know where to put them, but they are a really good
test dataset. Will they remain available at the URL you gave?

For now I'm attaching an updated plot showing how using the index
changed the curve. It also shows that TopoGeo_addPolygon performs
better than ST_CreateTopoGeo, even when still running in a single
transaction. Timings by row count (the lines marked + are after the fix):
 Rows   | ST_CreateTopoGeo | TopoGeo_addPolygon |
--------+------------------+--------------------|
   5000 |       129427.274 |         116104.691 |
      + |       105686.370 |          92694.196 |
  10000 |       327703.598 |         301611.556 |
      + |       226239.651 |         198670.907 |
  20000 |       936224.702 |         884240.862 |
      + |       526605.369 |         476830.898 |
  40000 |      2976178.984 |        2891650.767 |
      + |      1343630.221 |        1134592.873 |
  80000 |     10343526.099 |       10254117.319 |
      + |      3775123.534 |        3680183.557 |
 160000 |     38352283.535 |       38375126.950 |
      + |     12088554.776 |        9644736.277 |
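
One way to read the table against the O(n-squared) question in the subject
is to estimate a local growth exponent for each doubling of the row count.
The sketch below is only an illustration and not part of the original
measurements: it assumes the timings follow roughly t ~ n**k between
adjacent sizes, and the helper name doubling_exponents is made up for this
example. On these figures the ST_CreateTopoGeo exponent climbs from about
1.3 to about 1.9 before the fix, and from about 1.1 to about 1.7 after it.

```python
import math

# Row counts and timings copied from the table above (units as reported there).
rows = [5000, 10000, 20000, 40000, 80000, 160000]
timings = {
    "ST_CreateTopoGeo (before)": [
        129427.274, 327703.598, 936224.702,
        2976178.984, 10343526.099, 38352283.535,
    ],
    "ST_CreateTopoGeo (after)": [
        105686.370, 226239.651, 526605.369,
        1343630.221, 3775123.534, 12088554.776,
    ],
    "TopoGeo_addPolygon (before)": [
        116104.691, 301611.556, 884240.862,
        2891650.767, 10254117.319, 38375126.950,
    ],
    "TopoGeo_addPolygon (after)": [
        92694.196, 198670.907, 476830.898,
        1134592.873, 3680183.557, 9644736.277,
    ],
}


def doubling_exponents(sizes, times):
    """Hypothetical helper: local exponent k per adjacent pair, assuming t ~ n**k."""
    pairs = list(zip(sizes, times))
    return [
        math.log(t2 / t1) / math.log(n2 / n1)
        for (n1, t1), (n2, t2) in zip(pairs, pairs[1:])
    ]


exponents = {name: doubling_exponents(rows, ts) for name, ts in timings.items()}

for name, ks in exponents.items():
    # One exponent per doubling step: 5000->10000, 10000->20000, and so on.
    print(f"{name:28s}", " ".join(f"{k:.2f}" for k in ks))
```

An exponent near 1 would indicate roughly linear scaling and an exponent
near 2 roughly quadratic scaling, so the estimate is most informative at
the largest sizes, where fixed overheads matter least.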
--strk;
-------------- next part --------------
A non-text attachment was scrubbed...
Name: topotime2.png
Type: image/png
Size: 34407 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/postgis-devel/attachments/20140114/26a68284/attachment.png>