Update...<br><br>Well, China did crash. Following my theory that the cause is a memory leak or some other resource limit triggered by a large number of calls to pgRouting, I added code to cleanly close and reopen the connection every 300 points (i.e. a maximum of 600 pgRouting calls). I also added a local garbage collection call and a 200ms sleep for good measure.<br>
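<br>For what it's worth, the reconnect logic is roughly along these lines. This is only a sketch, not my actual script: process_in_batches, connect and handle_point are names made up for illustration. connect stands for whatever opens the Psycopg connection, and handle_point for the code that makes the (up to two) pgRouting calls for one point.<br>

```python
import gc
import time


def process_in_batches(points, connect, handle_point, batch_size=300, pause=0.2):
    """Process points, recycling the database connection every batch_size points.

    connect      -- zero-argument callable returning a DB-API connection
                    (hypothetical; e.g. a small wrapper around psycopg2.connect)
    handle_point -- callable(conn, point) that runs the routing queries for
                    one point and returns its result (hypothetical)
    """
    results = []
    conn = None
    try:
        for i, point in enumerate(points):
            if i % batch_size == 0:
                if conn is not None:
                    conn.close()       # release whatever the session was holding
                    gc.collect()       # local garbage collection
                    time.sleep(pause)  # short pause for good measure
                conn = connect()
            results.append(handle_point(conn, point))
    finally:
        # Always close the last connection, even if a query raised.
        if conn is not None:
            conn.close()
    return results
```

With batch_size=300 and at most two calls per point, no single connection sees more than 600 pgRouting calls.<br>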
<br>It appears to be working - China is still running (and possibly will be for a week), and it has already run far longer than it did before its first crash or before any of the Brazil crashes.<br><br>So there is a resource somewhere that is being "used up" when I run large numbers of queries on the same connection. When the connection closes, the resource in question is freed.<br>
<br>The stack is Psycopg -> Postgres -> pgRouting.<br><br>I found a reference to someone having a similar problem (no pgRouting, but lots of SELECTs through Psycopg), and their solution was to do a commit after every SELECT:<br>
<br><a href="http://stackoverflow.com/questions/4173168/psycopg-postgres-connections-hang-out-randomly">http://stackoverflow.com/questions/4173168/psycopg-postgres-connections-hang-out-randomly</a><br><br>I haven't found any other references to this kind of problem.<br>
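<br>I haven't tried the commit approach myself, but as I understand it Psycopg opens a transaction on the first statement and keeps it open until a commit or rollback, so a long run of SELECTs would all sit inside one transaction. A minimal sketch of the idea (select_and_commit is a hypothetical helper name; conn is any DB-API connection):<br>

```python
def select_and_commit(conn, sql, params=None):
    """Run one SELECT and end the transaction straight afterwards."""
    cur = conn.cursor()
    try:
        if params is None:
            cur.execute(sql)
        else:
            cur.execute(sql, params)
        rows = cur.fetchall()
        conn.commit()    # close the transaction the SELECT implicitly opened
        return rows
    except Exception:
        conn.rollback()  # leave the connection usable after a failed query
        raise
    finally:
        cur.close()
```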
<br><br>Richard<br><br><br><br><div class="gmail_quote">On Tue, Mar 1, 2011 at 10:29 AM, Richard Marsden <span dir="ltr">&lt;<a href="mailto:winwaed@gmail.com">winwaed@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
Thanks for the refs & links - some useful stuff there...<br><br>Oh the wonders of Linux documentation - of course all the forks and versions don't help - the references I found to SHMMEM must have been old.<br>Thanks for those references. I've upped it to 128MB and shared_memory to a conservative 64MB (from 28MB) but the same result.<br>
(I'm also printing off those two Postgres pages about recommended configurations whilst I type - I'll probably adjust the settings further. I have the Apress "From Beginner to Expert" Postgres book, but of course it is cross-platform and is more interested in covering a wide range of topics, including basic SQL.)<br>
<br><br>After a few runs now, I'm seeing that the abort occurs at different points in the processing. Assuming that pgRouting's search through the graph is deterministic (and I haven't seen anything to say otherwise), this suggests the problem is not data (graph) specific.<br>
<br><br>I said "threads/processes": Python's implementation of multithreading is broken from the multi-core processing perspective - basically there's one giant lock on the interpreter! Luckily the standard library includes an alternative which uses OS processes in a thread-like way. I'm using the multiprocessing 'Pool' functions to implement what Google have christened "MapReduce" across 1-3 CPUs (haven't dared try 4 yet).<br>
The problem occurs with one process as well as with 3 processes, so I don't think it is a bulk memory limit - instead it looks like a per-process limit.<br><br>So far I've only seen the problem with Brazil (I'm batch processing country-wide mileage charts). This is the biggest chart I've tried (Australia was the previous largest, and it wasn't much smaller).<br>
One thought is that it could be related to the number of calls on a connection: perhaps a server-side garbage collector is not getting the chance to run? Or there's a memory leak?<br>The maximum number of pgRouting calls per connection is currently 2000. Brazil is going to be in the 950-1900ish range (I first try with a small delta. If that fails to find a route, I try a larger delta - hence the two-fold uncertainty)<br>
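<br>To make the two-fold uncertainty concrete, the retry looks something like this. It is a sketch only: route_with_fallback and find_route are hypothetical names, and the delta values are placeholders rather than the ones I actually use.<br>

```python
def route_with_fallback(find_route, source, target, deltas=(0.1, 1.0)):
    """Try each delta in turn and report how many routing calls were made.

    find_route -- callable(source, target, delta) wrapping one pgRouting
                  query; returns an empty result when no route is found
                  (hypothetical interface)
    """
    calls = 0
    for delta in deltas:
        calls += 1
        route = find_route(source, target, delta)
        if route:
            return route, calls
    return None, calls
```

Each city pair costs one call if the small delta succeeds and two if it doesn't, which is where the 950-1900ish range comes from.<br>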
<br>I've now switched to Asia. If I'm right with the above paragraph, it will probably fail with India but not China<br>(China has lots of cities but a large number are not matched with road data, so they are skipped)<br>
So far it is running okay, but it has only reached Armenia...<br><br><br>Thanks for the suggestions - they are helping!<br><br>Richard<br></blockquote></div><br>