<div dir="ltr"><div dir="ltr"><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Mar 4, 2019 at 12:08 PM Clemens Raffler <<a href="mailto:clemens.raffler@gmail.com">clemens.raffler@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Dear pgrouting dev team,<br>
<br>
I have run into a problem with the pgr_withPoints function family and <br>
would like to ask for guidance and/or advice on how to deal with it.<br>
<br>
First I would like to outline the task I am working on:<br>
I am calculating an origin-destination matrix between two sets of <br>
points (e.g. the start_points and end_points), so I am particularly <br>
interested in retrieving the cost between each start point and end <br>
point using the pgr_withPointsCost() function. As cost and reverse_cost <br>
I normally use precalculated time costs (in seconds) representing the <br>
real time a cyclist needs to traverse an edge (I will call these costs <br>
real_cost). To model a cyclist's route choice more accurately, I <br>
introduce a multiplier for costs on edges that are unpleasant to ride <br>
on: costs on those edges are multiplied by 100 to obtain a more <br>
realistic routing output. As a result of this extra modelling, <br>
pgr_withPointsCost() aggregates the multiplied cost attributes and no <br>
longer reflects the actual time (i.e. real_cost) along the path.<br>
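<br>
For illustration, the edges SQL I feed to the routing functions looks <br>
roughly like this (a sketch only; table and column names such as <br>
input_graph and is_unpleasant are simplified stand-ins for my actual <br>
schema):<br>
<br>
```sql
-- Sketch of the edges query with the penalty multiplier applied;
-- is_unpleasant is a placeholder flag for edges a cyclist dislikes.
SELECT id, source, target,
       CASE WHEN is_unpleasant THEN real_cost * 100
            ELSE real_cost END AS cost,
       CASE WHEN is_unpleasant THEN real_reverse_cost * 100
            ELSE real_reverse_cost END AS reverse_cost
FROM   input_graph;
```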
<br>
My approach to solving this problem is to join the real time costs <br>
(which are stored in the input graph table) onto the manipulated <br>
routing output and aggregate them, which involves using the <br>
pgr_withPoints() function. In more detail: this function allows me to <br>
first store the individual path elements of the routes in a result <br>
table. I can then join the real cost onto the routing output via the <br>
edge ids of the input graph table, and group by start_pid and end_pid <br>
while applying sum(real_cost).<br>
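<br>
Sketched in SQL, the idea looks like this (the pgr_withPoints() call is <br>
abbreviated, and input_graph, pois and the concrete pids are <br>
placeholders for my actual data):<br>
<br>
```sql
-- Store the per-edge rows of each route, then aggregate real_cost.
CREATE TABLE route_edges AS
SELECT *
FROM   pgr_withPoints(
         'SELECT id, source, target, cost, reverse_cost FROM input_graph',
         'SELECT pid, edge_id, fraction FROM pois',
         ARRAY[-1, -2],    -- start pids (points are referenced as negative pids)
         ARRAY[-3, -4]);   -- end pids

SELECT r.start_pid, r.end_pid, sum(g.real_cost) AS total_real_cost
FROM   route_edges r
JOIN   input_graph g ON g.id = r.edge   -- edge = -1 end-of-path rows drop out here
GROUP  BY r.start_pid, r.end_pid;
```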
<br>
But when switching from pgr_withPointsCost() to pgr_withPoints() I <br>
repeatedly run into what looks like heavy memory leakage (different <br>
errors that seem to occur inside pgr_withPoints()). Here are some <br>
details on the tests I ran:<br>
<br>
1) Test run with create table as pgr_withPointsCost(), using a graph <br>
with ~50,000 edges and ~4,000 start and end pids: completes without <br>
errors (although 99% of memory is used).<br></blockquote><div><span class="gmail_default" style="font-family:arial,helvetica,sans-serif">4000 x 4000 = 16,000,000 rows <span class="gmail_default" style="font-family:arial,helvetica,sans-serif">x (output size: 2 BIGINT + 1 FLOAT = 24 bytes) = <span class="gmail-cwcot gmail-gsrt" id="gmail-cwos">384,000,000</span> bytes to be kept in memory</span></span> </div><div><div style="font-family:arial,helvetica,sans-serif" class="gmail_default">That is already a lot.</div></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
2) Test run with create table as pgr_withPoints(), using a graph with <br>
~50,000 edges and ~4,000 start and end pids: ERROR: std::bad_alloc <br>
Hint: Working with directed Graph.<br></blockquote><div><span class="gmail_default" style="font-family:arial,helvetica,sans-serif">4000 x 4000 x (average number of edges per path, say 1000) x (output size: 2 INTEGER + 4 BIGINT + 2 FLOAT = 56 bytes) = <span class="gmail-cwcot gmail-gsrt" id="gmail-cwos">896,000,000,000</span> bytes to be kept in memory<br></span></div><div><span class="gmail_default" style="font-family:arial,helvetica,sans-serif">That is a huge number; how much memory does your machine have?<br></span></div><div><span class="gmail_default" style="font-family:arial,helvetica,sans-serif">And remember that the computer's memory is also used by other things, like browsers.</span></div><div><span class="gmail_default" style="font-family:arial,helvetica,sans-serif">Internally, the results have to be converted to postgres memory, so add to that number</span></div><div><span class="gmail_default" style="font-family:arial,helvetica,sans-serif">roughly half again for the internal representation; at some point at least the result plus the graph data is held in memory at once.<br></span></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
3) Test run with create table as pgr_withPoints(), using a graph with <br>
~50,000 edges, 100 start and ~4,000 end pids: ERROR: invalid memory <br>
alloc request size 3474796248 Where: SQL function »pgr_withpoints«</blockquote><div><span class="gmail_default" style="font-family:arial,helvetica,sans-serif"><span class="gmail_default" style="font-family:arial,helvetica,sans-serif">Looks like this is still too large for your machine.<br></span></span></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>
4) Test run with create table as pgr_withPoints(), using a graph with <br>
~50,000 edges, 10 start and ~4,000 end pids: completes without errors <br>
(50 sec)<br>
</div><span class="gmail_default" style="font-family:arial,helvetica,sans-serif">This looks like something your computer can handle.</span></blockquote><div style="font-family:arial,helvetica,sans-serif" class="gmail_default">So if you really need the 4000 x 4000 in one go:</div><div style="font-family:arial,helvetica,sans-serif" class="gmail_default">do as Steve mentions: commit after each call, and make the 400 calls needed.</div><div style="font-family:arial,helvetica,sans-serif" class="gmail_default">Of course it will then take about 400 * 50 seconds to complete.</div><div style="font-family:arial,helvetica,sans-serif" class="gmail_default"></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span class="gmail_default" style="font-family:arial,helvetica,sans-serif"></span><br><div>
I checked the whole issue history of pgRouting on GitHub and found <br>
tests with far more start points (around 80,000, but also using <br>
pgr_withPointsCost() and much more RAM) – maybe it is related: <br>
<a href="https://github.com/pgRouting/pgrouting/issues/694#issuecomment-288035720" rel="noreferrer" target="_blank">https://github.com/pgRouting/pgrouting/issues/694#issuecomment-288035720</a><br>
<br>
Are you familiar with this kind of behaviour of pgr_withPoints(), which <br>
ultimately just calls pgr_dijkstra()? Is this a memory leak, or do I <br>
simply not have enough RAM (24 GB)? Do you have any hints on how to <br>
solve this issue, or have you experienced similar problems? I would <br>
like to avoid cutting the query into smaller chunks of start_points <br>
and iterating over them, as running such queries is very <br>
time-inefficient.<br>
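<br>
For reference, the chunked variant I am trying to avoid would look <br>
roughly like the following (a sketch only; the batch size, table and <br>
column names are placeholders, and each statement would be issued and <br>
committed separately by a client script):<br>
<br>
```sql
-- Sketch: split the ~4000 start pids into batches of 10 and route each
-- batch separately, so only a small result is held in memory at a time.
-- Repeat with OFFSET 10, 20, ... for the remaining batches.
INSERT INTO route_edges
SELECT *
FROM   pgr_withPoints(
         'SELECT id, source, target, cost, reverse_cost FROM input_graph',
         'SELECT pid, edge_id, fraction FROM pois',
         (SELECT array_agg(-pid)
          FROM  (SELECT pid FROM start_pois
                 ORDER BY pid OFFSET 0 LIMIT 10) b),
         (SELECT array_agg(-pid) FROM end_pois));
```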
<br>
I am currently running PostgreSQL 10.3 (64-bit), PostGIS 2.4 and <br>
pgRouting 2.6.0 (release/2.6.1.59.0) on a 64-bit Windows 10 machine <br>
with 24 GB of RAM. I will also update a test system to the current <br>
PostgreSQL, PostGIS and pgRouting versions and run the query again.<br>
<br>
I would be glad if you could have a look into this.<br>
<br>
Best regards,<br>
Clemens<br>
<br>
_______________________________________________<br>
pgrouting-dev mailing list<br>
<a href="mailto:pgrouting-dev@lists.osgeo.org" target="_blank">pgrouting-dev@lists.osgeo.org</a><br>
<a href="https://lists.osgeo.org/mailman/listinfo/pgrouting-dev" rel="noreferrer" target="_blank">https://lists.osgeo.org/mailman/listinfo/pgrouting-dev</a></div></blockquote></div><br clear="all"><br>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><pre>Georepublic UG (haftungsbeschränkt)
Salzmannstraße 44,
81739 München, Germany
Vicky Vergara
Operations Research
eMail: vicky@<a href="http://georepublic.de" target="_blank">georepublic.de</a>
Web: <a href="https://georepublic.info" target="_blank">https://georepublic.info</a>
Tel: +49 (089) 4161 7698-1
Fax: +49 (089) 4161 7698-9
Commercial register: Amtsgericht München, HRB 181428
CEO: Daniel Kastl
<span></span></pre></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div>