[RouterGeocoder] Re: Some thoughts on OpenRouter

Tue Apr 7 17:53:01 EDT 2009

Hi Roni,

Ashraf Hossain wrote:
> Hi Steve,
> I was in a vacation so I replied you after a long long day.

Great, I hope you had a nice relaxing time.

> First of all my thinking is almost same as your planning about
> openrouter library.
> It will be part of GDAL/OGR.

I think we will use GDAL/OGR in the graph builder to access files, but 
we will not be part of GDAL/OGR. We will include GDAL/OGR in our 
project, but they will not include OpenRouter in their project unless 
they want to.

> But I have some queries.

Questions are good and you seem to be asking a lot of good questions below.

> 1. In a multipart polyline feature is every part connected or there is
> some possibility of disconnect? Because when openrouter library will
> build the connectivity database then we have to keep in mind this.

It is possible the multipart features are totally disconnected or they 
may be connected in strange ways like a node at the center and lots of 
spokes. We can not make assumptions about topology in in multipart 
polylines. So I think we need to read them and we should assume 
multipart features only have one part and throw an error if that is not 
the case. Lets keep it simple for now and we can expand it later if 
there is a use case and a need for more.

> 2. Should the graph independent from the shape?
> --> It may have several solution.
>    a. If it is independent then the data size will be huge in graph.
>    b. If not then we can we can solve this in 2 way.
>         *. As you explained in this mail that keeping the memory location
>         *. As Shape file database system(*.bdf file). In shape for
> every consecutive feature index we have related information in dbf
> file.
>             We can do such kind of things here (I will explain it
> later part of this mail).

I think that we probably want two separate files that are linked by ids 
or indexes.

1) the graph files which contains the stuff we need to instantiate a 
graph at runtime and solve it.
2) the shape-info file that is needed in pre/post processing.

The idea would be to try and keep the runtime graph small so there is 
less I/O while solving it. Once we have a solution, we will need to 
access the edges/nodes in the shape information file.

> 3. If the user set a source and destination with specific latitude and
> longitude should we use the spatial index tree of a layer in
> mapserver?

We can use an OpenSource rtree index to search the shape-info file, 
which I think will be much faster than the mapserver spatial index, and 
get the edge/vertices's of the closest object to the lat-lon. I do not 
think we need to be dependent on mapserver. When we have a working 
library, someone might want to link that to mapserver to add routing to 
it but their are likely better ways to interconnect them.

> 4. How the optimization parameter values come from? From the dbf file?

If you mean things like oneway flags, zlevels, costs, avg. speed, 
turn-restrictions, etc: then yes some of them will come from the DBF, 
some may be computed, like length, and some may come from other 
files/tables that are linked by IDs.

My thought is that we define what we need for routing and define the 
intermediate abstract routing graph structures first and build the 
routing engine that reads that graph.

Then we build tools that can translate data sources into the 
intermediate graph. I think we will have multiple import tools like:

1) a simple shapefile import
2) a database import tool
3) a Navteq shapefile import tool
4) a TeleAtlas shapefile import tool
5) other tools as required

1. and 2. might have an XML file that describes the DBF structure and 
has options for how to build the graph. This will probably only work for 
relatively simple shapefile structures.

While 3. and 4. have a much richer data model and may need a more 
complex graph builder.

> 5. Should we concentrate on the support of as may layer type (suppose
> shape file,database...)  as in mapserver else we shold concentrate on
> the functionalities of the library as much as I can?

I think we should probably start with 1. and 2. above. Again we should 
keep it simple to start, make sure it works, and that it meets our needs 
for development and testing. Later we can add more modules and extend 
the existing ones to be more flexible and read more data formats. I 
think it would be good if we can load:

1) pgRouting database tables
2) multiple shapefiles in a common format
3) maybe work with Nestor to be able to load OSM data

> Now My planning about openrouter library in some little details.
> 
> *. The mapserver indexing tree will be used and the shape file feature
> also be used.
> The router geocoder will make a connectivity and node database (graph).

We need to define this data structure and talk with Daniel and Anton 
that have worked extensively with building pgRouting. One thing I did in 
C when I built a prototype routing engine was built the graph using 
memmap so I could open the graph file and map it into memory and the OS 
handled the file paging needs. I'm not sure if we want to do that or not 
with C++.

> Suppose we built the node database ; The connectivity database is
> nothing but a node index list where the node index will stored
> accoring to the polyline feature's points (actually first point of the
> polyline feature and the last point of the polyline feature the middle
> points are not the part of the graph).

If we build it as to files that are linked by common IDs or indexes, 
then you can take the lat-lon map it through the rtree index and find 
the clostest object and nodes. Then pass the start and stop info to the 
router and it can use that for solving the graph.

So I think we need to define an Classes, the public API, file 
structures, etc and review that. Then we should be able to work from 
that and it should be very clear.

> If one user set any source point then we will find the closest feature
> using the indexing tree and then we will get the feature index of that
> feature in the feature list then we will directly access the node from
> the connectivity database which will be stored in the same index for
> that feature. And for the destination point we will apply the same
> technoque. Then we alredy got the source node and destination node and
> we alredy have the graph we can apply several algorithm for generating
> shortest paths.

Yes this is the plan :)

> This is my primary idea. I need to mature this with consecutive
> discussion along with you first then I will let know others about
> this.

This is fine. I'm sure we will get lots of good input from Anton, Daniel 
and Andrew that are on the list and hopefully from others as well.

I think our plan should be based on "keep it simple" to start but "plan 
for expansion". The key to this approach is to make things modular and 
layered. Also, we want to keep our dependencies down so we should 
discuss what libraries we want to use. I'm pretty sure we will probably 
want to use the boost libraries if we are doing this in C++.

> And at last, I am not too natural in english so I think you can
> understand me well.

Your english is very understandable.

Best regards,
   -Steve

> With Regards
> Roni
> 
> 
> 
> 
> 
> 
> On Sun, Apr 5, 2009 at 9:13 AM, Stephen Woodbridge
> <woodbri at swoodbridge.com> wrote:
>> Hi all,
>>
>> Here are some of my initial thoughts on the OpenRouter. This is an attempt
>> to get a lot of ideas that have been floating around in my head and to put
>> some structure to them. It is going to need a lot of additional work and
>> input. I also do not see this all getting built all at once. I imagine that
>> we will tackle bits and pieces at a time with other pieces being stubbed
>> out, simplified or may be hard-coded initially, and that we will refactor
>> those parts latter as we build them out. I see initially that there are
>> three major areas:
>>
>> Graph Builder
>> -------------
>> 1) need requirements for non-transportation analysis?
>> 2) modular data source readers
>>   a) file formats (GDAL/OGR)
>>   b) metadata to describe information structure
>> 3) modular data store I/O
>> 4) data store schema
>>
>> The graph builder is a tool(s) that builds the graph used by the router
>> library. This allows for any required preprocessing to be done here once.
>> Since the router will likely have plug-in solvers and different network
>> analysis that might require different graph structures, we may need to think
>> about plug-in builders that service a family of solvers. While I'm familiar
>> with transportation network analysis, there has been a desire expressed to
>> not limit this to traffic it networks. I see no need to limit this, but
>> input and expertise will need to be found to help identify any additional
>> requires that these separate analysis require. Hopefully using a plug-in
>> architecture will make it easy to support these as they are needed.
>>
>> The generic tasks here involve reading various physical file formats and
>> (GDAL/OGR) is a good candidate for that, but we also need to understand the
>> content of the data files and how it relates to the task of building the
>> graph. This might be via an XML document describing the data source. We
>> might want to have an optional GUI tool that can explore the data and
>> collect information from the user and then write the XML document and
>> trigger the builder to run. I can envision a given builder plug-in calling
>> standard APIs for reading and writing data. We need to create a logical
>> graph and store it in some physical stores. The logical graph may need to
>> keep a table of back pointers to the original data if it is not copied
>> forward. For example and network graph needs little or not geometry to
>> produce a solution, but the geometry is needed to create a route polyline
>> and/or driving directions. The physical stores might be a packed data file
>> that gets memory mapped into internal data structures, or it might be a
>> database blob, or a set of database tables depending on the application
>> environment's requirements.
>>
>> OpenRouter Library
>> -------------------
>> 1) general network analysis library with plug-in solvers
>> 2) layered access to data to separate physical and logical data stores
>> 3) plug-in post-solution processors
>>   a) create isochronal polygons
>>   b) explicate driving directions
>>   c) format results as XML, JSON, CSV, text, etc
>>   d) save solution as ...
>>   e) ...
>> 4) ...
>>
>> This is the core analysis library, that will get initialized and open a
>> graph builder graph, then take take some input and solve the requested
>> problem. After the solution, a chain of post processors can be called on the
>> solved graph as required and finally cleanup routines are called to release
>> system resources.
>>
>> Applications using the library
>> ------------------------------
>> 1) commandline tool
>> 2) debugging - graph anaysis query tool
>> 3) webservice instance (using OpenLS?)
>> 4) embed in other applications
>> 5) embed in Databases
>> 6) SWIG library interfaces to Perl, Python, PHP, C#, etc
>> 7) ...
>>
>> Finally some applications are build around the library. I see a commandline
>> tool and a debugging - graph analysis query tool as usual tools for
>> developing the library, testing the library, etc. A webservice tool would
>> make this immediately useful to the user community for driving directions in
>> a AJAX web 2.0 environment like OpenLayers applications. SWIG interfaces
>> would help extent the reach of the library into other communities.
>>
>> So my goal in this document is to try and paint in broad strokes an outline
>> of where I think we want to get to. The purpose for this is that I hope it
>> will allow developers to see down the road so when they are making decisions
>> about implementation they can better choose options that will need less work
>> to refactor as we tackle the next requirement. This is just me thinking out
>> loud and I'm open to anyone that want to build upon this. It would be great
>> if people want to take any chunks of this and drill down into a more
>> complete analysis, design and/or code.
>>
>> Misc. things to do:
>> 1) look at Boost Graph (yet again)
>> 2) look at how grass handles graph building/solving
>>
>> Thoughts?
>>
>> -Steve
>>