[pgrouting-dev] Time dependent data and input format

Stephen Woodbridge woodbri at swoodbridge.com
Sun Jun 12 10:56:33 EDT 2011


On 6/12/2011 12:33 AM, Jay Mahadeokar wrote:
>
>
> On Sun, Jun 12, 2011 at 1:07 AM, Stephen Woodbridge
> <woodbri at swoodbridge.com> wrote:
>
>     On 6/11/2011 12:58 PM, Jay Mahadeokar wrote:
>
>
>
>         On Thu, Jun 9, 2011 at 9:16 AM, Jay Mahadeokar
>         <jai.mahadeokar at gmail.com> wrote:
>
>             Hi,
>
>             Just a recap to our planned tdsp query model:
>
>
>
>                 I agree. So, can we now assume we have the following
>                 database tables, as per what has been discussed so far:
>
>                    1. Ways: (On lines of the one in pgrouting-workshop)
>
>
>                     * gid
>                     * source
>                     * target
>                     * name
>                     * the_geom
>                     * static_length    (In case time dependent routing
>         is not used)
>                     * any other attributes..
>
>                      2.  TimeVaryingCosts (Any other suitable name?)
>
>                     * gid
>                     * start_time_and_date  (May divide this into two
>         columns:
>                       day_of_week and time_of_day ? )
>                     * travel_time / arrival_time
>
>                      3. TurnRestrictions (Time dependent)
>
>                     * node/source
>                     * time_from
>                     * time_to
>                     * node_to
>                     * ancestors
>
>
>                 If only a static shortest path query is made, only
>                 table 1 is required.
>                 If a time-dependent shortest path query is made, tables 1
>                 and 2 will be used.
>                 If a time-dependent shortest path query with turn
>                 restrictions is made, all 3 tables will be used.
>
>
>             So, for a general time dependent query, we will fetch edge
>             information from ways table and the time dependent costs
>         information
>             from the
>             TimeDepCosts table. (See previous msg above.)
>
>             The query on the ways table can be supplied as usual, as in
>             the normal Dijkstra shortest_path() query.
>
>             How do we form the query on the TimeDepCosts table? I see
>         two options:
>
>             1. We generate the query according to the gids obtained from the
>             query on the ways table. We will have to query all the time
>             slots available, or form a query so that the time window is
>             appropriate and we have sufficient travel_time info to reach
>             the destination.
>
>             Here we can form query like
>
>             Select * from TimeDepCosts where gid in (list of gids
>         queried from
>             ways table).
>
>             We can allow the user to add more filtering parameters
>             according to start_time, date?
>             But this will make our query specific to the database design. As
>             already discussed, we can't predict the time units /
>             requirements of applications, so it would be better to keep
>             the query independent of the database design.
>
>             That's the reason our weight_map assumes the start time from
>             the source as 0, and all travel times as offsets from it. A
>             function that will generate such a weight map is yet to be
>             written.
>
>             2. We allow the user to pass the query for the TimeDepCosts
>             table too. So, our pgsql function would be:
>
>             CREATE OR REPLACE FUNCTION shortest_path(
>                 ways_sql          text,
>                 time_dep_cost_sql text,
>                 source_id         integer,
>                 target_id         integer,
>                 directed          boolean,
>                 has_reverse_cost  boolean)
>                 RETURNS SETOF path_result
>
>
>             This way, the user has more control over the amount of data
>             being queried. We will only consider the time-dependent
>             windows that fit the query.
>
>             Example of the time_dep_cost_sql can be:
>
>             Select * from TimeDepCosts where " USER_SUPPLIED_CONSTRAINTS".
>
>             The user can now supply his constraints according to the time
>             window he needs, start times, holidays, etc., per the database
>             table design.
>
>             The problem with this is that we cannot filter out edges
>             according to a bounding box etc., since the the_geom field is
>             not in this table; that information would be redundant there.
>
>             Thoughts? Any idea that combines the merits of both approaches?
>
>
>
>         Hi all,
>
>         I am writing some background code that will be needed for the
>         above functionality.
>
>         Any input on the above mentioned point? I suppose I should not start
>         coding for any specific approach until we agree upon a viable query
>         strategy.
>
>
>     Hi Jay,
>
>     I always find it easiest to answer these questions by thinking about
>     the problem in terms of use cases, so:
>
>     A web app, where someone enters a start and destination and some
>     constraints like departure time or arrival time and maybe
>     restrictions on modes of travel.
>
>     Are there other ones we are trying to support? These should probably
>     be documented as part of the project description.
>
>
> I agree, we need to document the use-cases. A separate thread for this
> discussion?
>
>
>     Anyway, assuming the above, then the application which presumably
>     has some basic knowledge of the time constraints could do something
>     like:
>
>     Get the straight-line distance between start and destination, double
>     it, and apply some average transportation speed to approximate the
>     time window. The numbers can be tuned by the application, as long as
>     the result overestimates the duration of travel. It can then specify
>     the CONSTRAINTS in terms of a start and end time. We do something
>     similar with our spatial bounding box used to collect the edges
>     needed to build the graph.
>
>     Now, as far as collecting the constraints and making them accessible
>     to the Boost dijkstra, that is another problem. It boils down to the
>     following:
>
>
> Just a correction: in this case we will not be using boost_dijkstra, but
> instead the core_tdsp() function which we have built ourselves.
>
>
>     1. is there an acceptable abstraction that we can use internally to
>     represent the various data sets we might want to load? Or can we
>     define a schema that will work for the set of constraints we wish
>     to implement initially?
>
>
> Well, we had agreed upon the following schema for testing purpose:

Yes, you are correct, sorry I was not paying attention when I wrote that.

>    1.
>
>       Ways: (On lines of the one in pgrouting-workshop)
>
>           * gid
>           * source
>           * target
>           * name
>           * the_geom
>           * static_length (In case time dependent routing is not used)
>           * any other attributes
>    2.
>
>       TimeDepCosts
>
>           * gid
>           * day_of_week
>           * time_of_day
>           * travel_time / arrival_time
>
> The time_dep_costs table schema can vary according to application and
> its needs and the nature of data available.
>
>     2. do we want to move this data into static memory structures for
>     the analysis, as we do for the graph, and then dispose of them after
>     the analysis?
>
>
> We are doing exactly that! The Weight_Map data structure that is
> supplied to core_tdsp() is designed specifically to fulfil the above
> functionality. It assumes the start time is 0, and all other times are
> offsets from the start time. So, there will be schema-specific wrapper
> functions that help generate the weight_map data structure.
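One way such a schema-specific wrapper could work, sketched as a query against the hypothetical time_dep_costs(gid, day_of_week, time_of_day, travel_time) table discussed in this thread. The DOW and TOD placeholders stand for the trip's start day-of-week and time-of-day, and the sketch assumes all relevant slots fall on the start day:

```sql
-- Sketch only: time_dep_costs and its columns are the schema proposed in
-- this thread, not a settled design. DOW / TOD are placeholders for the
-- trip's start day-of-week and start time-of-day.
SELECT gid,
       EXTRACT(EPOCH FROM (time_of_day - time 'TOD')) AS offset_from_start,
       travel_time
  FROM time_dep_costs
 WHERE day_of_week = DOW
   AND time_of_day >= time 'TOD'
 ORDER BY gid, time_of_day;
```

Each (gid, offset_from_start, travel_time) row would then map directly to a weight_map entry, with the start time normalized to 0 as described above.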
>
> Which brings me back to my original question (a couple of messages
> back): what is the best way to generate the weight_map, i.e. retrieve
> data from the time-dependent cost table?
>
> Well, I think for now I am going to go with approach #1, i.e. the user
> will supply his query. This might retrieve unwanted data, since we can't

Yes, I think this is the way to go. It is up to the user, who presumably
knows his data, to formulate an efficient query to retrieve the needed data.

> filter by the_geom in table#2. But, I think we can refine this by adding
> constraints like: where gid IN (list_of_edgeids_retrieved_from_ways)

A query something like:

select a.*
   from table2 a, table1 b
  where b.the_geom && <BBOX_OF_REGION>
    and b.gid=a.gid
    and <TIME_CONSTRAINTS>;

This would allow the planner to pick the best index.
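For instance, with the day_of_week / time_of_day columns proposed earlier in this thread, the placeholders might be filled in like this (the table names follow the proposed schema, and the bounding-box coordinates and time values are purely illustrative):

```sql
-- Illustrative only: fills in the BBOX_OF_REGION and TIME_CONSTRAINTS
-- placeholders for the hypothetical ways / time_dep_costs schema.
SELECT a.*
  FROM time_dep_costs a, ways b
 WHERE b.the_geom && ST_MakeEnvelope(-71.20, 42.30, -71.00, 42.45)
   AND b.gid = a.gid
   AND a.day_of_week = 1                               -- e.g. Monday
   AND a.time_of_day BETWEEN time '08:00' AND time '10:30';
```

An index on time_dep_costs (gid, day_of_week, time_of_day), alongside the spatial index on ways.the_geom, would presumably give the planner good options for either join order.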

> So, I guess this approach combines the merits of both approaches
> discussed before. We can provide flexibility to application as well as
> add constraint on the number of rows retrieved.
>
> Thoughts?

Yes, I think this is a good plan.

I think that the bulk of the roads in the network will probably not have 
time dependent data associated with them. If you look at what Navteq and 
the Traffic people (whoever they are) are doing, you will notice that 
mostly they only track major highways in/out of major metropolitan 
areas. They do not track that information on minor roads or in rural 
areas. So I think that the number of segments likely to have time 
dependent data for any given request is going to be around 5% of the 
total segments. Even if someone were to collect, say, GPS cellphone 
average speeds over a much wider range of segments, I still think it is 
manageable. Anyway, we just have to solve the problem; they have to 
represent their data in a way that we can handle quickly, or this will 
not be a solution for them.

-Steve

>     3. or do we want to construct a method that can dynamically request
>     constraint information from the database when it is needed by a
>     visitor. (2. is probably much more efficient and the better way to
>     go from a complexity and stability point of view).
>
>     Don't wait on me for additional input, as I will be gone through June
>     21st. Keep it simple and modular and we can change or add to it in
>     the future; I'm sure whatever you come up with will be fine.
>     After all, it is code and it can be changed if we need to. :)
>
>     Project seems to be moving along fine.
>
>     -Steve
>
>     _______________________________________________
>     pgrouting-dev mailing list
>     pgrouting-dev at lists.osgeo.org
>     http://lists.osgeo.org/mailman/listinfo/pgrouting-dev
>
>
>
>
> --
> Regards,
> -Jay Mahadeokar
>
>
>
