[pgrouting-dev] Fwd: [RouterGeocoder] Routing Data Schema

Sat Apr 30 23:30:06 EDT 2011

Jay,

This might be of interest. I started a separate project a while back and 
you can browse the archives if you are interested. This is part of a 
design discussion I started and might give you some ideas.

-Steve

-------- Original Message --------
Subject: [RouterGeocoder] Routing Data Schema
Date: Mon, 30 Mar 2009 01:18:29 -0500
From: Stephen Woodbridge <woodbri at swoodbridge.com>
To: RouterGeocoder osgeo <routergeocoder at lists.osgeo.org>

Hi all,

I suggested that a potential SoC project might be building an abstract
schema for OpenRouter. I thought I might take this opportunity to better
define what I'm proposing and open it for input, comments, and
suggestions. Even if it is not appropriate for SoC, I think it has value
  regardless and I look forward to a good discussion on it.

Goal:

Define an abstract schema that can be used as an intermediate data store
for building router import tools. It is not expected that any router
would use the data in this form. The idea is that it would be easy for
people to write import tool into this schema, and that routing engines
could then import data from this schema.

Why bother with a two step process?

Routing engines, and there may be more than one, only need to write a
single import tool from this schema to benefit from many vendor specific
imports the load data into this schema. This means that instead of
needing M*N import tools, we only need M+N import tools.

The second reason is that this allows us to develop vendor specific
tools prior to or in parallel to router engine development.

pgRouting  -+             +- OSM import
OpenRouter -+    Open     +- Navteq import
...        -+<-- Data  <--+- TeleAtlas import
...        -+    Schema   +- ...
...        -+             +- ...

Other benefits:

There is some preprocessing of the data that can be achieved within this
  Data schema that can minimize the additional processing needed to
import it into a specific router's internal storage which will likely be
optimized for that routing engine.

Thought on Table Design:

My current thinking on this is a set of tables that could be represented
in a database at least for the purpose of prototyping the concept. Most
of the tables are normalized, except the ARC table. This could be
changed, but my thought was that this might be more useful in a
partially denormalized form.

ARC  - Table of road segments
---------------
  UID           - ARC_ID, Unique record ID for table
  DATASRC_ID    - reference to entry in DATASRC table
  SRC_ID        - Unique ID within the original source data or index num.
  X1            - X value of FROM end of segment
  Y1            - Y value of FROM end of segment
  Z1            - Zlevel of FROM end of segment
  X2            - X value of TO end of segment
  Y2            - Y value of TO end of segment
  Z2            - Zlevel of TO end of segment
  NODE1_ID      - NODE_ID for FROM end of segment
  NODE2_ID      - NODE_ID for TO end of segment
  COST          - COST to traverse segment FROM -> TO
  RCOST         - COST to traverse segment TO -> FROM
  LENGTH        - Length of segment in meters
  SPEED         - Average speed for segment
  ONEWAY        - Flag to indicate ONEWAY {B|F|T}
  HAS_XATTRIB   - Flag to indicate if this segment has XATTRIB records
  HAS_ARESTRICT - Flag to indicate if this segment has ARESTRICT records

DATASRC  - Table of data sources the ARCs were loaded from
---------------
  UID           - DATASRC_ID, Unique record ID for table
  SRC_TYPE      - Type of data source, this probably needs to be a well
                  defined set.
  SRC           - SRC identifier, like filename
  GEOM_COLUMN   - Column name for geometry column if a database
  OTHER         - SRC_TYPE specific information, like a database
                  connection string.

NODE  - List of unique nodes in the model
---------------
  UID           - NODE_ID, Unique record ID for table
  X             - X value for node
  Y             - Y value for node
  Z             - Zlevel for node

XATTRIB  - List of extended attributes associate with any given ARC
---------------
  UID           - UID, Unique record id for table
  ARC_ID        - ARC_ID that this record belongs to
  SEQNO         - Sequence number in case attributes need order
  KEY           - Type of attribute
  VALUE         - Value of attribute

ARESTRICT  - List of Restrictions associated with any given ARC
---------------
  UID           - UID, Unique record id for table
  ARC_ID        - ARC_ID that this record belongs to
  TIME_FROM     - Start time for restriction, DATE-TIME or TIME
  TIME_TO       - End time for restriction, DATE-TIME or TIME
  VEHICLE_CLASS - Vehicle class that this restriction belongs to
  KEY           - Type of restriction
  VALUE         - Value of restriction

NRESTRICT  - List of Restrictions associated with any given NODE
---------------
  UID           - UID, Unique record id for table
  NODE_ID       - NODE_ID that this record belongs to
  TIME_FROM     - Start time for restriction, DATE-TIME or TIME
  TIME_TO       - End time for restriction, DATE-TIME or TIME
  VEHICLE_CLASS - Vehicle class that this restriction belongs to
  KEY           - Type of restriction
  VALUE         - Value of restriction

Some Notes and thoughts on this:

In general, things like KEY values need to be a well defined set of
values and should be validated on input.

1) For bicycling and heavy truck hauling we might want to have altitude
loss or gain or grade. This can probably be held in the XATTRIB table.

2) We might want to have a METADATA table with key/value pairs to
indicate what type of restriction and/or attributes the data set
includes. This should also include a VERSION key with an appropriate
value indicating the schema version number.

3) ARESTRICT could be used to indicate, NO TRUCKS, Weight restrictions,
PEDESTRIANS ONLY, 4 Wheel drive only, etc. These can then be used to
eliminate ARCs that might not be appropriate when importing into the
router or to add additional information to the ARCs in the router
depending on features and capabilities of the engine.

4) NRESTRICT could be used to store turn restrictions or any other
restrictions that might need to be applied to a route passing through
this NODE. KEY indicates the type of restriction, and VALUE for TURN
restrictions, would be a list of CHILDREN_IDs, ":", list of
ANCESTOR_IDs. And CHILDREN would be the potential CHILDREN that would be
restricted if the path came from the listed ancestors

5) This does not include information that would be needed for route
explication. This should probably be added. This also does not copy the
ARC geometry to a local table.

6) I'm not sure we need to keep DATASRC table or not. If we keep a local
copy of all the needed data then we might not. There might be some value
if we wanted to extend this to support incremental updates, for example
if a whole state or country was a single data source you could
potentially purge all the ARCs associated with it, and any referential
data records, then import that data again without having to reload the
whole thing.

7) I have not really thought through signage yet. Navteq data has
information about Signs along the route that can be useful during
explication. These may fit in the XATTRIB, but I need to check this and
how you determine which sign(s) are used for which MANEUVER.

8) I also think we can define pseudo-code process of how to load the
data like:
   a) open a data source
   b) insert a record into the DATASRC table
   c) foreach record in the data source
   d)    insert a record into the ARC
   e)    add any required XATTRIB or ARESTRICT records to each ARC
   f) generate the NODE table from the ARC table
   g)    and update the ARCs with NODE_IDs
   h) Add Turn restrictions to the NRESTRICT table
   i) Run sanity checking for:
      1. zero or negative cost/length segments
      2. isolated single segments
      3. other TBD

Sorry it is late here and I can not think through this anymore tonight.
This is a good start and I hope seeing the data laid out like this helps
to show how a simple schema might be useful and that having an
intermediate schema like this might be of value and not too overly
complex to implement.

Questions, Concerns, Issues, Missing information?

Best regards,
   -Steve
_______________________________________________
Routergeocoder mailing list
Routergeocoder at lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/routergeocoder