<div dir="ltr"><div><div>Sadly, I don't think it can be done in pure PL/pgSQL, <br>because every function runs inside a transaction no matter what.<br></div>The only way is the trick of connecting back to the same database from within the function, using the "dblink" extension.<br></div><div>But I find it difficult to understand how transactions and subtransactions affect performance.<br></div><div><br></div><div>Moreover, the transaction issue is not the only problem. It is more of a design problem.<br>Even with CGAL, building a topology one feature at a time rather than in batch mode radically changes the build time (from n to n^2 at least).<br><br></div><div>That's why I truly think the performance gain is going to come from a batch mode. Tweaking the current process is just damage control, in my opinion.<br></div><div>This is not so hard to do if we rely a bit on GEOS.<br><br>1. Cut the input geometries into a space partition (ST_Node for lines, ST_Polygonize for polygons)<br></div><div>2. Populate the node table, and create a temp table listing the lines attached to each node<br></div><div>3. Populate edge_data<br></div><div>4. Fill next / left for edge_data<br></div><div>5. Compute area (Polygonize, GEOS?)<br></div><div>6. Map the input geometries to the generated topology (to be able to use attributes)<br><br></div><div>I have already tested 1, 2, 3 and 6.<br></div><div>It can be fast (not as fast as building the full topology in GEOS and converting it to postgis_topology, I'm afraid), and it will scale very well.<br></div><div><br></div><div>Cheers,<br></div>Rémi-C<br></div><div class="gmail_extra"><br><div class="gmail_quote">2014-11-20 12:24 GMT+01:00 Sandro Santilli <span dir="ltr"><<a href="mailto:strk@keybit.net" target="_blank">strk@keybit.net</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Wed, Nov 19, 2014 at 04:47:48PM +0100, Sandro Santilli wrote:<br>

> On Wed, Nov 19, 2014 at 12:50:09PM +0100, Rémi Cura wrote:<br>

> ><br>

> > Adding one feature is actually quite fast, even on already big topology.<br>

> ><br>

> > It's when you want to add a lot of them that it becomes increasingly slow (maybe<br>

> > because indexes are not updated, or because we are in one transaction?)<br>

> ><br>

> > The slowing seems to be very non linear, probably following n^2, where n is<br>

> > the number of features already constructed in the transaction.<br>

><br>

> An issue with index use was recently fixed.<br>

> There might be another one hiding somewhere.<br>

<br>

On a closer look, I'm thinking the single-transaction is what commonly<br>

hits during topology building (UPDATE .. SET tg = toTopoGeom ..)<br>

<br>

Starting from an empty topology and running a single statement<br>

invoking toTopoGeom for each of many inputs results in no stats ever<br>

being visible to the planner within the transaction. In turn the planner<br>

is likely to opt for sequential scans (an empty table is quicker to<br>

scan sequentially).<br>

<br>

This would explain why populating in chunks works better, using<br>

a transaction for each chunk<br>

(UPDATE .. SET .. WHERE gid >= N AND gid < N+chunksize)<br>

<br>

It could be interesting to try a wrapper function taking care of<br>

running ANALYZE on the primitive tables every N calls to toTopoGeom<br>

(or N primitives being created, regardless of number of simple inputs).<br>

<br>

--strk;<br>

</blockquote></div><br></div>
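<div>To make the chunking idea from the quoted message concrete, here is a minimal sketch that only generates the SQL, so that each statement can be run in its own transaction from outside the database (for instance piped into psql in autocommit mode). The table name "roads", the topology name "roads_topo", the columns "gid", "geom" and "tg", the layer id and the chunk size are all assumptions made for illustration, and I have not benchmarked this.</div>

```python
def chunked_statements(table, topo, layer_id, max_gid, chunk_size, tolerance=0.0):
    """Yield one UPDATE per gid chunk, each followed by ANALYZE on the
    topology primitive tables so the planner sees fresh statistics.

    All table, column and topology names are hypothetical placeholders.
    """
    primitives = ("node", "edge_data", "face", "relation")
    for start in range(0, max_gid + 1, chunk_size):
        end = start + chunk_size
        # One chunk of inputs converted with toTopoGeom.
        yield (
            f"UPDATE {table} SET tg = topology.toTopoGeom("
            f"geom, '{topo}', {layer_id}, {tolerance}) "
            f"WHERE gid >= {start} AND gid < {end};"
        )
        # Refresh statistics before the next chunk is planned.
        for prim in primitives:
            yield f"ANALYZE {topo}.{prim};"


if __name__ == "__main__":
    for stmt in chunked_statements("roads", "roads_topo", 1,
                                   max_gid=2500, chunk_size=1000):
        print(stmt)
```

<div>Running the statements from a client sidesteps the limitation that a PL/pgSQL function cannot commit between chunks. Whether ANALYZE after every chunk is too frequent probably depends on the chunk size.</div>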