[pgrouting-dev] integer vs bigint for edge and node ids

Luís de Sousa luis.a.de.sousa at gmail.com
Wed Mar 18 11:40:43 PDT 2015


Hi Steve,

Has there been any GSOC campaign with pgRouting? That would be the
place to start IMHO. After that would be a call for BSc/MSc thesis;
even if there is no budget, students may become interested in the
project.

The main difficulty with the problems you describe is that they are
are not easy to tackle concurrently. The first step would have to be
the refactoring. It could start with a first task to refactor just two
of modules and extend it to the others on a second step. The bigint
issue would be the last to tackle.

Just my thoughts. Regards,

Luís

On 15 March 2015 at 00:46, Stephen Woodbridge <woodbri at swoodbridge.com> wrote:
> Developers,
>
> This topic came up off list and I thought this should be part of a more
> public discussion, so here are my thoughts on it. I think both the ideas
> below are worth pursuing but are huge tasks that are currently not funded
> and I'm concerned that we should not tackle them on a piecemeal, ad-hoc
> basis.
>
> Problem:
>
> postgis loads data with a gid column that is typed bigint.
> OSM data has node and edge ids that can not be held in an integer field.
> Today, all pgrouting code assumes ids are integer and larger numbers
> overflow the field causing problems.
>
> So the obvious answer is to update the code to be bigint compatible.
>
> The issues related to doing this are as follows:
>
> * pgrouting has the bad habit of using ids as indexes into arrays which is a
> problem. The code partially minimizes the indexing issue by located the
> minimum id and subtracting that from all other ids shifting the range to
> eliminate the leading empty ids, but if the (max_id - min_id) is large and
> the ids are sparsely allocated across the range we still need to allocate
> (max_id - min_id + 1) elements in our storage structures. For example if you
> have two ids '2' and '1,000,002' the code allocates 1,000,001 elements. This
> can obviously be fixed with some additional coding.
>
> * the code would/might need to be changed to such that the boost structures
> can support bigint values and this will double the amount of storage needed
> for ids. There is also the possibility that if the point above does
> renumbering and we limit the number of ids in any given request to unsigned
> integer that the code might not need changes and we would only limit the
> size of the problem we can solve and max unsigned integer sized problem is
> still larger than anyone would want to wait for a solution.
>
> * this would also imply that the output data types we use would get
> converted to bigint also.
>
> * If we make this change, we need to do it to all our algorithms so we keep
> the commands consistent and compatible with one another. This would involve
> updating all of the follow algorithms in pgrouting:
>
> apsp_johnson
> apsp_warshall
> astar
> bd_astar
> bd_dijkstra
> common
> dijkstra
> driving_distance
> kdijkstra
> ksp
> shooting_star ** deprecated, will be deleted **
> trsp
> tsp
> vrp_basic
> vrpdptw
>
>
> While this does not have anything to do with this specific problem, we need
> to consider any major reworking of the code like this would involve with the
> fact for historical evolution of the code base we have copied source code
> from one algorithm to another so we have a lot of duplicated code. It would
> make a lot of sense at this point to refactor some of this code into common
> set of reusable C++ base classes. To put this into context all of the follow
> code is in someway based off of the original dijkstra code:
>   astar
>   bd_astar
>   bd_dijkstra
>   dijkstra
>   driving_distance
>   kdijkstra
>   ksp
> So every time we make a change to one of these algorithms, there is a good
> chance that the problem exists in one or more of the other ones. We have
> been struggling with this problem for some time.
>
> I don't want to blowup this problem into something that is so big it will
> never get done, but I do think it is important that we think about these
> issues and that we put together a plan that takes these issues into account
> in the planning of these and other potential changes.
>
> Thoughts,
>   -Steve
> _______________________________________________
> pgrouting-dev mailing list
> pgrouting-dev at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/pgrouting-dev


More information about the pgrouting-dev mailing list