[RouterGeocoder] Routing Data Schema
Stephen Woodbridge
woodbri at swoodbridge.com
Mon Mar 30 02:18:29 EDT 2009
Hi all,
I suggested that a potential SoC project might be building an abstract
schema for OpenRouter. I thought I might take this opportunity to better
define what I'm proposing and open it for input, comments, and
suggestions. Even if it is not appropriate for SoC, I think it has value
regardless and I look forward to a good discussion on it.
Goal:
Define an abstract schema that can be used as an intermediate data store
for building router import tools. It is not expected that any router
would use the data in this form. The idea is that it would be easy for
people to write import tool into this schema, and that routing engines
could then import data from this schema.
Why bother with a two step process?
Routing engines, and there may be more than one, only need to write a
single import tool from this schema to benefit from many vendor specific
imports the load data into this schema. This means that instead of
needing M*N import tools, we only need M+N import tools.
The second reason is that this allows us to develop vendor specific
tools prior to or in parallel to router engine development.
pgRouting -+ +- OSM import
OpenRouter -+ Open +- Navteq import
... -+<-- Data <--+- TeleAtlas import
... -+ Schema +- ...
... -+ +- ...
Other benefits:
There is some preprocessing of the data that can be achieved within this
Data schema that can minimize the additional processing needed to
import it into a specific router's internal storage which will likely be
optimized for that routing engine.
Thought on Table Design:
My current thinking on this is a set of tables that could be represented
in a database at least for the purpose of prototyping the concept. Most
of the tables are normalized, except the ARC table. This could be
changed, but my thought was that this might be more useful in a
partially denormalized form.
ARC - Table of road segments
---------------
UID - ARC_ID, Unique record ID for table
DATASRC_ID - reference to entry in DATASRC table
SRC_ID - Unique ID within the original source data or index num.
X1 - X value of FROM end of segment
Y1 - Y value of FROM end of segment
Z1 - Zlevel of FROM end of segment
X2 - X value of TO end of segment
Y2 - Y value of TO end of segment
Z2 - Zlevel of TO end of segment
NODE1_ID - NODE_ID for FROM end of segment
NODE2_ID - NODE_ID for TO end of segment
COST - COST to traverse segment FROM -> TO
RCOST - COST to traverse segment TO -> FROM
LENGTH - Length of segment in meters
SPEED - Average speed for segment
ONEWAY - Flag to indicate ONEWAY {B|F|T}
HAS_XATTRIB - Flag to indicate if this segment has XATTRIB records
HAS_ARESTRICT - Flag to indicate if this segment has ARESTRICT records
DATASRC - Table of data sources the ARCs were loaded from
---------------
UID - DATASRC_ID, Unique record ID for table
SRC_TYPE - Type of data source, this probably needs to be a well
defined set.
SRC - SRC identifier, like filename
GEOM_COLUMN - Column name for geometry column if a database
OTHER - SRC_TYPE specific information, like a database
connection string.
NODE - List of unique nodes in the model
---------------
UID - NODE_ID, Unique record ID for table
X - X value for node
Y - Y value for node
Z - Zlevel for node
XATTRIB - List of extended attributes associate with any given ARC
---------------
UID - UID, Unique record id for table
ARC_ID - ARC_ID that this record belongs to
SEQNO - Sequence number in case attributes need order
KEY - Type of attribute
VALUE - Value of attribute
ARESTRICT - List of Restrictions associated with any given ARC
---------------
UID - UID, Unique record id for table
ARC_ID - ARC_ID that this record belongs to
TIME_FROM - Start time for restriction, DATE-TIME or TIME
TIME_TO - End time for restriction, DATE-TIME or TIME
VEHICLE_CLASS - Vehicle class that this restriction belongs to
KEY - Type of restriction
VALUE - Value of restriction
NRESTRICT - List of Restrictions associated with any given NODE
---------------
UID - UID, Unique record id for table
NODE_ID - NODE_ID that this record belongs to
TIME_FROM - Start time for restriction, DATE-TIME or TIME
TIME_TO - End time for restriction, DATE-TIME or TIME
VEHICLE_CLASS - Vehicle class that this restriction belongs to
KEY - Type of restriction
VALUE - Value of restriction
Some Notes and thoughts on this:
In general, things like KEY values need to be a well defined set of
values and should be validated on input.
1) For bicycling and heavy truck hauling we might want to have altitude
loss or gain or grade. This can probably be held in the XATTRIB table.
2) We might want to have a METADATA table with key/value pairs to
indicate what type of restriction and/or attributes the data set
includes. This should also include a VERSION key with an appropriate
value indicating the schema version number.
3) ARESTRICT could be used to indicate, NO TRUCKS, Weight restrictions,
PEDESTRIANS ONLY, 4 Wheel drive only, etc. These can then be used to
eliminate ARCs that might not be appropriate when importing into the
router or to add additional information to the ARCs in the router
depending on features and capabilities of the engine.
4) NRESTRICT could be used to store turn restrictions or any other
restrictions that might need to be applied to a route passing through
this NODE. KEY indicates the type of restriction, and VALUE for TURN
restrictions, would be a list of CHILDREN_IDs, ":", list of
ANCESTOR_IDs. And CHILDREN would be the potential CHILDREN that would be
restricted if the path came from the listed ancestors
5) This does not include information that would be needed for route
explication. This should probably be added. This also does not copy the
ARC geometry to a local table.
6) I'm not sure we need to keep DATASRC table or not. If we keep a local
copy of all the needed data then we might not. There might be some value
if we wanted to extend this to support incremental updates, for example
if a whole state or country was a single data source you could
potentially purge all the ARCs associated with it, and any referential
data records, then import that data again without having to reload the
whole thing.
7) I have not really thought through signage yet. Navteq data has
information about Signs along the route that can be useful during
explication. These may fit in the XATTRIB, but I need to check this and
how you determine which sign(s) are used for which MANEUVER.
8) I also think we can define pseudo-code process of how to load the
data like:
a) open a data source
b) insert a record into the DATASRC table
c) foreach record in the data source
d) insert a record into the ARC
e) add any required XATTRIB or ARESTRICT records to each ARC
f) generate the NODE table from the ARC table
g) and update the ARCs with NODE_IDs
h) Add Turn restrictions to the NRESTRICT table
i) Run sanity checking for:
1. zero or negative cost/length segments
2. isolated single segments
3. other TBD
Sorry it is late here and I can not think through this anymore tonight.
This is a good start and I hope seeing the data laid out like this helps
to show how a simple schema might be useful and that having an
intermediate schema like this might be of value and not too overly
complex to implement.
Questions, Concerns, Issues, Missing information?
Best regards,
-Steve
More information about the Routergeocoder
mailing list