[pgrouting-users] pgrouting performance on non-spatial dbpedia graph
Sebastian Kazenbroot-Guppy
sebastian.kg at gmail.com
Mon Sep 6 11:27:24 PDT 2021
Hello everyone,
I was posting on the Postgres Slack about the performance I was seeing when
routing through a ~20 million edge, ~7 million node dbpedia dump (where
only RDF triples from and to dbpedia resource pages are included). They
referred me to this mailing list to continue the discussion.
At the time, I was seeing `pgr_dijkstra` queries on this dataset take
around 8 minutes with a huge spike in memory use (to 4GB on this Windows 10
machine) -- I have since improved this to about 1.5 minutes (still with the
same memory use) after using `pgr_contraction` to simplify the graph where
possible and using the custom Dijkstra query outlined in the Postgres
documentation here
<https://docs.pgrouting.org/3.0/en/contraction-family.html>.
Is this the upper bound of performance improvement I can get without
filtering out data? I have tried adding GIN indexes to the
contracted-vertex arrays, and partial indexes to the is_new/is_contracted
fields, but at most the GIN indexes only helped a bit. I wonder about
something like pinning the graph, which is entirely static, in memory so
that no data transfer from disk to memory needs to occur. I notice that
websites like https://www.sixdegreesofwikipedia.com/ take only a fraction
of a second to query the same data AFAIK (they claim 6 million nodes
instead of just over 7), though of course I have read the open-source code
and see that is leveraging SQLite and application code. But since the
pathing logic happens local to the graph itself and not over the network,
and pgrouting uses an efficient graph library instead of application code,
I don't see this as a strict disadvantage for pgrouting.
I have also seen the documentation recommending "no more than 1 million
nodes" (https://docs.pgrouting.org/latest/en/pgRouting-concepts.html#id61),
so I understand if this is a hard limit. Still, I'd like to use Postgres if
possible to gain all of its other advantages when building out this project.
Thank you so much for any help anyone can provide, I'm very interested in
this problem and excited to hear the nitty, gritty details.
Best,
Sebastian Kazenbroot-Guppy
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/pgrouting-users/attachments/20210906/067f72b0/attachment.html>
More information about the Pgrouting-users
mailing list