[pgrouting-dev] Time dependent data and input format
Jay Mahadeokar
jai.mahadeokar at gmail.com
Tue Aug 2 18:08:16 EDT 2011
On Sun, Jul 31, 2011 at 1:30 AM, Stephen Woodbridge <woodbri at swoodbridge.com
> wrote:
> On 7/30/2011 2:44 PM, Jay Mahadeokar wrote:
>
>>
>>
>> On Sat, Jul 30, 2011 at 11:22 PM, Stephen Woodbridge
>> <woodbri at swoodbridge.com <mailto:woodbri at swoodbridge.**com<woodbri at swoodbridge.com>>>
>> wrote:
>>
>> Jay,
>>
>> After reading your report, I'm still struggling with the is_cyclic
>> flag. I think the reason is because this seems to be a global flag
>> and the fact is that some records might be cyclic and some might not
>> be cyclic so the flag needs to be set at the record level not as a
>> global. If we set the is_cyclic flag and have two time records on is
>> cyclic and the other is not, how do we know which is which?
>>
>> tdsp expects a table of:
>>
>>
>> gid, start_time, end_time, travel_time
>>
>> So how do I translate my example table:
>>
>>
>> time_dependency
>> rule_id:seq:day_of_week:start_ time:end_time:from_cost:to_
>> cost:expiry_date
>> 1:1:2,3,4,5,6:0700:1000:30:-1:
>> 1:2:2,3,4,5,6:1600:1900:-1:30:
>> 1:3:1,7:0000:2400:45:45:
>> 2:1:0:1100:1130:20:20:20110729
>>
>> into this? Note: rule_id=1 is cyclic, rule_id=2 is not.
>>
>> For example lets say the route start 20110729 (friday) 6AM and ends
>> 20110801 (monday) 8PM. What would be the records we would need to
>> return from the above table? For simplicity, lets assume rule_id is
>> the gid.
>>
>>
>> Hi Steve,
>>
>> The is_cyclic flag will be returned by the plsql query that generates
>> the time dependent data. This will be done according to the nature of
>> the data, start-time, end-time etc and the logic for that will be in the
>> plsql function.
>>
>> The example you gave, for route start friday to monday, the plsql
>> function should return the data as not cyclic and give the specific time
>> intervals from start to end. I think this is
>>
>> If query is from say monday to thursday, then the cyclic flag can be set
>> to true and data corresponding to the 24 hour interval only can be
>> returned. If there is transition from weekday to weekend and vice versa
>> then data should be marked as not cyclic (which we discussed yesterday).
>>
>> I am realising that this may not be the most optimal way, but I think
>> this optimisation is still just a luxury, since usually travel times
>> won't last more than a week in real world scenarios :).
>>
>> On second thought, Lets see if we keep the is_cyclic flag with each time
>> interval as you are suggesting.
>>
>> So, suppose the wrapper plsql is now returning -
>>
>> gid, start_time, end_time, travel_time, is_cyclic
>>
>> Now, if data is not cyclic from 11 to 11:30 on 20110729, then we will
>> have to take that for just one time and if it again cycles, we will have
>> to take alternate value for same time next week.
>> If some interval is not cyclic, I think it will be included only in the
>> first cycle, as per your example I am not sure if this property is
>> generic, but I think it might be.
>>
>> The logic to check all this would have to be within the weight_map
>> class. I can easily detail out algorithm for solving your example. But,
>> for each cyclic interval information like how many times it will repeat,
>> what will be the ordering etc should be provided to weight_map class.
>>
>> Ex - intervals corresp to weekdays repeat 5 times, intervals corresp to
>> weekend repeat 2 times, while there is 1 non repeating interval.
>> These intricacies will be more if we imagine a more complex data
>> format. Since all times are offset from start_time, all this
>> information will have to be delivered in some standard format which
>> should not be example specific.
>>
>> I think these features need a detailed understanding and discussion. We
>> can surely continue to brainstorm over the issues. But I am not sure if
>> we can implement this within GSoc timeline. So, maybe we can put that in
>> TDSP Version 1.1?
>>
>
> Jay,
>
> Ok, sounds like the optimization is a little premature but otherwise a good
> idea. I think that after GSoC, we need to look at time dependent handling
> and see if we can come up with something general that can be used for this
> code and for multimodal scheduling. It would be nice to have a general
> facility that any routing code that needs time dependent data can use and
> easily plug into. I think the best way to do this is to start with use cases
> for:
>
> 1. time dependent traffic data and events
> 2. mass transit bus, train, subway schedules
> 3. other specific use cases
>
> Then from these, abstract the table definitions that can hold this data in
> a format that is easy to enter, edit, and manage. Then look at how we
> transform that data into a form that a specific query needs to solve a given
> problem.
>
> And finally, we should be able to retrofit the existing code to use these
> facilities.
>
> Regards your project, I would leave things as is and move forward. In your
> final report you should have a discussion of things we learned and future
> directions that should identify the above as one of those.
>
>
Thanks for your suggestions. I agree with your view. So, I think I should
not tinker with the existing working algorithm. I will keep the tdsp
separate and create a new tdsp_cyclic which will be used in case of cyclic
data. I am working on that now using the existing ideas mentioned above. I
will soon come up with the documentation and example.
Our college has also started since last week and so, I am kind of occupied
there too. So, it will be best if we make the current implementation ready
with documentation until GSoc deadline and work on your ideas later.
> I'm open to other ideas or expanding on this if people have any additional
> thoughts.
>
> Thanks,
> -Steve
>
> Also I just noticed that you assume the edges are not directed so
>> the cost is the same traversing from A-B and B-A. Actually, that is
>> probably not true, if gid is the directed edge and not the GIS edge
>> which might get entered twice in a directed graph. This is another
>> issue that the app developer needs to think about because GIS data
>> has a single edge with attributes that then might generate two graph
>> edges and then need to map time depend GIS edges back to graph
>> edges. Oh what a tangles web we weave :)
>>
>>
>> Actually I assume that edges are directed.
>> https://github.com/pgRouting/**pgrouting/wiki/TDSP-Details<https://github.com/pgRouting/pgrouting/wiki/TDSP-Details>.
>> Refer section 3.
>>
>>
>> -Steve
>>
>>
>> On 7/29/2011 1:30 PM, Jay Mahadeokar wrote:
>>
>>
>>
>> On Fri, Jul 29, 2011 at 10:50 PM, Stephen Woodbridge
>> <woodbri at swoodbridge.com <mailto:woodbri at swoodbridge.**com<woodbri at swoodbridge.com>
>> >
>> <mailto:woodbri at swoodbridge. com
>> <mailto:woodbri at swoodbridge.**com <woodbri at swoodbridge.com>>>>
>> wrote:
>>
>> On 7/29/2011 1:03 PM, Jay Mahadeokar wrote:
>>
>>
>>
>> On Fri, Jul 29, 2011 at 9:54 PM, Stephen Woodbridge
>> <woodbri at swoodbridge.com <mailto:woodbri at swoodbridge.**com<woodbri at swoodbridge.com>
>> >
>> <mailto:woodbri at swoodbridge. com <mailto:woodbri at swoodbridge.**com<woodbri at swoodbridge.com>
>> >>
>> <mailto:woodbri at swoodbridge <mailto:woodbri at swoodbridge>. com
>> <mailto:woodbri at swoodbridge. com
>> <mailto:woodbri at swoodbridge.**com <woodbri at swoodbridge.com>>>>>
>> wrote:
>>
>> On 7/29/2011 11:49 AM, Jay Mahadeokar wrote:
>>
>> Hi Steve,
>>
>> On Fri, Jul 29, 2011 at 6:26 PM, Stephen Woodbridge
>> <woodbri at swoodbridge.com <mailto:woodbri at swoodbridge.**com<woodbri at swoodbridge.com>
>> >
>> <mailto:woodbri at swoodbridge. com <mailto:woodbri at swoodbridge.**com<woodbri at swoodbridge.com>
>> >>
>> <mailto:woodbri at swoodbridge <mailto:woodbri at swoodbridge>. com
>> <mailto:woodbri at swoodbridge. com <mailto:woodbri at swoodbridge.**com<woodbri at swoodbridge.com>
>> >>>
>> <mailto:woodbri at swoodbridge <mailto:woodbri at swoodbridge>
>> <mailto:woodbri at swoodbridge <mailto:woodbri at swoodbridge>>. com
>> <mailto:woodbri at swoodbridge <mailto:woodbri at swoodbridge>. com
>> <mailto:woodbri at swoodbridge. com
>> <mailto:woodbri at swoodbridge.**com <woodbri at swoodbridge.com>>>>>>
>> wrote:
>>
>> Jay,
>>
>> This make sense. What I'm trying to
>> understand and
>> get to is
>> how do
>> we make this easy to setup and use. It seems
>> like we
>> are missing
>> something somewhere. For example, the
>> cyclical range
>> of a single
>> entry can be different from other entries,
>> think
>> week days
>> versus
>> weekends. It should be possible to have some
>> cyclic
>> entries
>> and some
>> not cyclic. Based on this it seems that the
>> cyclic flag
>> needs to be
>> set on each record along with the period of the
>> cycle. Would
>> you agree?
>>
>>
>> The main motivation behind all this brain
>> storming is to
>> keep
>> the core
>> algorithm and the postgres c function
>> implementation,
>> independent of the
>> nature of data, right? So, we need to somehow
>> abstract
>> the data
>> dependent properties from the core functionality.
>>
>> That was the reason why we came up with the plsql
>> wrapper functions
>> idea. For optimising again, we need to
>> communicate the
>> cyclic
>> properties
>> to core algorithm, but the interface should be
>> independent of
>> nature of
>> data.
>>
>> If this is the case, then a quick scan of
>> the data would
>> tell us if
>> there are cyclic entries, in which case a
>> simple
>> function to
>> scan
>> the data needed and see if any entry is
>> cyclic could
>> return that
>> flag. But if every entry is marked as cyclic
>> or not
>> then you
>> might
>> not need that flag at all.
>>
>>
>> What I feel is: if we scan the data and keep the
>> logic of
>> determining
>> the cyclic nature within the postgres C
>> function, then
>> the above aim
>> will somehow get lost and it will again become a
>> data format
>> specific
>> function.
>>
>> So, I think it will be better if the application
>> developers deal
>> with
>> the data themselves so that they can achieve the
>> optimisation.
>> Example -
>> for weekday, weekend, the plsql function will
>> provide
>> specific
>> values
>> for each day and mark data as non cyclic if there
>> is
>> transition from
>> weekday to weekend and vice versa. (A journey
>> more than
>> few days is
>> highly unlikely, so anyways it wont repeat the
>> data too
>> much,
>> though it
>> will not be optimal)
>>
>> If the journey is within weekday or within
>> weekend, then
>> we can mark
>> data as cyclic, and just retrieve 24 hour data.
>>
>> This is specific to example data you specified. I
>> believe such
>> improvisations will be applicable for most data
>> formats.
>> Thoughts?
>>
>> Again, this is not the most optimal
>> implementation, but
>> still it
>> keeps
>> the core implementation independent of data
>> format. Any
>> other ideas?
>>
>>
>> Jay,
>>
>> I think you are correct in this. This will require
>> some real
>> world
>> examples of data and how to work with it. The
>> abstraction makes
>> sense but it is hard to map it to reality hence our
>> discussions.
>>
>> So if we propose a hypothetical example like:
>>
>> 1. roadway which it one way between 7-10am north
>> bound on
>> mon-fri
>> 2. same roadway is one way between 4-7pm south bound
>> on mon-fri
>> 3. same roadway is two otherwise
>>
>> A separate type of rule might be:
>>
>> 4. there was an accident at 11am causing reduced
>> speeds for
>> the next
>> 30 mins
>>
>> we could define a table like:
>>
>> time_dependency
>> rule_id:seq:day_of_week:start_
>> time:end_time:from_cost:to_
>> cost:expiry_date
>> 1:1:2,3,4,5,6:0700:1000:30:-1:
>> 1:2:2,3,4,5,6:1600:1900:-1:30:
>> 1:3:1,7:0000:2400:45:45:
>> 2:1:0:1100:1130:20:20:20110729
>>
>> I hope that is clear.
>> sun-sat === 1-7 for day_of_week
>> cost -1 means you can travel that direction
>>
>> then in the edges table, you can link the edge to
>> time_dependency
>> table via the rule_id
>>
>> This is not perfect because, from_cost and to_cost
>> imply a
>> direction
>> of digitization of the segment which might be
>> reversed on some
>> segments. You might have multiple rules associated
>> with a single
>> segment. I'm thinking of a "rule" as a single
>> description of
>> some
>> conditions that might get reused by multiple edges
>> in the
>> graph. So
>> we might want a linking table like:
>>
>> edge_id:rule_id
>>
>> that will support multiple rules to be associated.
>>
>> Then the stored procedure could be written to query
>> these tables
>> based on start time, elapsed time from start, and
>> edge_id
>> that could
>> return the cost of a single edge. Or you would need
>> to write a
>> stored procedure to transform these tables into
>> whatever
>> structure
>> you need for the routing.
>>
>>
>> Hi Steve,
>>
>> Thanks for coming up with the example. It fits well in
>> order to
>> describe
>> our current plan. So, as proposed we will assume our
>> data is in
>> format
>> (instead of creating ids and seq separately im keeping
>> gid as the id
>> like in ways table):
>>
>> gid: day_of_week: start_time: end_time: from_cost: to_cost:
>> expiry_date
>>
>> Now, the application developer dealing with the data
>> will write
>> plsql
>> function which will query the above table and return set
>> of rows:
>>
>> gid, start_time, end_time, travel_time as expected by
>> the tdsp
>> query.
>>
>> Now, if the data format is such that there are rules
>> associated to
>> specific edges, and there are linking tables etc as you
>> mentioned, then
>> it would be the responsibility of the application
>> developer to
>> come up
>> with join queries etc that will be needed in the plsql
>> wrapper
>> function.
>> We will just expect data in the above format. And we can
>> also
>> keep track
>> of cyclic nature as discussed earlier. There can be
>> many other
>> formats
>> and many other linking table possibilities, but we
>> should not be
>> responsible for those. That was the reason of deciding
>> the input
>> data
>> format during the pre-coding period.
>>
>> I hope I am understanding your point and answering
>> relevantly.
>> Thanks
>> again for coming up with the users perspective. This is
>> really
>> helping
>> me to understand the real-world scenario which I had not
>> thought
>> about
>> particularly. Any other comments?
>>
>>
>> I think you have it now and I understand it also. The
>> application
>> developer can have whatever structure they need to represent
>> the
>> data they have and link it to the edges, and then they have
>> to write
>> a procedure that will transform that into the record set you
>> defined
>> above that the engine needs.
>>
>> In the documentation, you can copy my example and/or
>> simplify it a
>> little. It is fine to discuss points that they need to
>> consider like
>> direction of digitization or how to handle transient one way
>> streets, because these are things that people might need to
>> deal
>> with and it gives them an idea of how to deal with other data
>> problems that they might be confronted with.
>>
>>
>> Sure! I will do that in last two weeks I think. If you can list
>> such
>> points that we should document, I can come up with details of
>> mapping
>> each scenario to the required format.
>>
>> It would be nice if we could get some real time dependent
>> data and
>> work through building a sample application the uses it. Maybe
>> we
>> will have time after GSoC to do that.
>>
>>
>> I have been trying to get real world time dependent data since a
>> long
>> time now. Still I have not found a good data source. It would be
>> great
>> if we can acquire that, and yes, we can do that after GSoc too!
>>
>>
>> Thanks,
>> -Steve
>>
>> I just trying to put this into a concrete example so
>> we can work
>> through how this works and to help understand how we
>> need to
>> document it so people can use it in a real world
>> example.
>>
>> Thoughts?
>>
>> -Steve
>>
>> May be I'm missing something because I have
>> not had
>> a full
>> cup of
>> tea yet :)
>>
>> Thoughts,
>> -Steve
>>
>>
>> On 7/29/2011 8:39 AM, Jay Mahadeokar wrote:
>>
>>
>>
>> On Fri, Jul 29, 2011 at 5:50 PM, Stephen
>> Woodbridge
>> <woodbri at swoodbridge.com <mailto:woodbri at swoodbridge.**com<woodbri at swoodbridge.com>
>> >
>> <mailto:woodbri at swoodbridge. com <mailto:woodbri at swoodbridge.**com<woodbri at swoodbridge.com>
>> >>
>> <mailto:woodbri at swoodbridge <mailto:woodbri at swoodbridge>. com
>> <mailto:woodbri at swoodbridge. com <mailto:woodbri at swoodbridge.**com<woodbri at swoodbridge.com>
>> >>>
>> <mailto:woodbri at swoodbridge <mailto:woodbri at swoodbridge>
>> <mailto:woodbri at swoodbridge <mailto:woodbri at swoodbridge>>. com
>> <mailto:woodbri at swoodbridge <mailto:woodbri at swoodbridge>. com
>> <mailto:woodbri at swoodbridge. com <mailto:woodbri at swoodbridge.**com<woodbri at swoodbridge.com>
>> >>>>
>> <mailto:woodbri at swoodbridge <mailto:woodbri at swoodbridge>
>> <mailto:woodbri at swoodbridge <mailto:woodbri at swoodbridge>>
>> <mailto:woodbri at swoodbridge <mailto:woodbri at swoodbridge>
>> <mailto:woodbri at swoodbridge <mailto:woodbri at swoodbridge>>>**. com
>>
>>
>> <mailto:woodbri at swoodbridge <mailto:woodbri at swoodbridge>
>> <mailto:woodbri at swoodbridge <mailto:woodbri at swoodbridge>>. com
>> <mailto:woodbri at swoodbridge <mailto:woodbri at swoodbridge>. com
>> <mailto:woodbri at swoodbridge. com
>> <mailto:woodbri at swoodbridge.**com <woodbri at swoodbridge.com>>>>>>>
>> wrote:
>>
>> On 7/29/2011 5:49 AM, Jay Mahadeokar
>> wrote:
>>
>> Hi all,
>>
>> I was looking at ways to
>> optimise the tdsp
>> implementation.
>>
>> Currently, if the time-interval is
>> suppose 24, (24
>> hours) and
>> the travel
>> time expected is around 60
>> hours, then the
>> tdsp-wrapper
>> query
>> retrieves
>> total 60 intervals and the data is
>> repeated for
>> the 24 hour
>> intervals.
>> (I hope what i am trying to say
>> is clear).
>>
>> This is unnecessary space
>> wastage. So, I
>> edited the
>> weight_map
>> class to
>> accommodate a variables
>> is_cyclic and the
>> cycle_interval, which will
>> keep track of the cyclic nature
>> of the
>> data. So,
>> actually only
>> the 24
>> hour data will be fed to weight
>> map and
>> then it
>> will be
>> reused
>> in cycles.
>>
>> Because of the good design, I
>> guess this has
>> scaled up
>> really
>> nice and
>> there is no need to modify
>> actual core tdsp.
>> Just the
>> weight_map's
>> get_travel_time() function and the
>> wrapper plsql
>> function needed
>> to be
>> altered.
>>
>>
>> This makes perfect sense and sounds
>> like a great
>> optimization.
>>
>>
>> Now, in the main query, I am
>> thinking of
>> adding
>> one more
>> boolean
>> variable - is_cyclic which will be
>> enabled if
>> the data
>> is cyclic in
>> nature. Should we keep this? Or
>> we should
>> assume that
>> the data
>> will be
>> always cyclic.
>> Any other views on the same are
>> welcome.
>>
>>
>> I guess my questions are:
>> How does this get set?
>> What are the advantages of setting it?
>> Is there a way to dynamically check
>> if it is
>> cyclic
>> at the
>> start and
>> avoid having the app builder set it up?
>>
>>
>> Advantages of setting it:
>> If the data is not cyclic, then the
>> whole data
>> will be
>> expected to
>> reside in database itself. So, if the upper
>> bound of the
>> time is
>> reached
>> then by default, we can assume that the
>> travel_time
>> corresponding to the
>> last existing time window should be used
>> for
>> such cases.
>>
>> If data is cyclic, then we need to start
>> all
>> over again.
>> These
>> are the
>> cases, what I could think of.
>>
>> How does this get set:
>> The tdsp query must have an additional
>> parameter -
>> is_cyclic which
>> should be used to set the is_cyclic flag
>> of the
>> weight_map.
>>
>> Now, if data is cyclic, then the plsql
>> function that
>> retrieves
>> the data
>> should be written in such a way that if
>> interval is
>> suppose 0 -
>> 24, and
>> start time is 21, then 21->0 and
>> 20->23. After
>> that the
>> cyclic flag
>> will indicate us to loop again. The plsql
>> function will be
>> generally
>> written by app developer and he should
>> take care of
>> this. (I have
>> already implemented this in my code.)
>>
>> If data is cyclic, he should pass the flag
>> appropriately. If
>> is_cyclic
>> is false, we can assume that if
>> intervals are
>> exhausted,
>> we will
>> keep
>> using time corresponding to last interval.
>>
>> Is this reasonable?
>>
>> I suppose it is not a big deal for
>> the app
>> builder
>> to set
>> this flag
>> since he will know if the data loaded
>> is
>> cyclic in
>> nature. I
>> assume
>> you are only looking for some
>> indication if
>> there
>> are cyclic
>> entries
>> in the data and not if this specific
>> query
>> has a cyclic
>> nature which
>> would be impossible to know without
>> analyzing the
>> start time
>> and max
>> time window, in which case we should
>> figure
>> that out
>> automatically
>> because the user making the request
>> is not
>> likely to
>> know
>> anything
>> more than start, end and start time
>> or end time.
>>
>> -Steve
>>
>> --
>> Regards,
>> -Jay Mahadeokar
>>
>> ______________________________ _________________
>> pgrouting-dev mailing list
>> pgrouting-dev at lists.osgeo.org
>> <mailto:pgrouting-dev at lists.**osgeo.org<pgrouting-dev at lists.osgeo.org>
>> >
>> <mailto:pgrouting-dev at lists. osgeo.org
>> <mailto:pgrouting-dev at lists.**osgeo.org<pgrouting-dev at lists.osgeo.org>
>> >>
>>
>> http://lists.osgeo.org/ mailman/listinfo/pgrouting-dev
>> <http://lists.osgeo.org/ mailman/listinfo/pgrouting-dev
>> <http://lists.osgeo.org/**mailman/listinfo/pgrouting-dev<http://lists.osgeo.org/mailman/listinfo/pgrouting-dev>
>> **> >
>>
>>
>>
>>
>> --
>> Regards,
>> -Jay Mahadeokar
>>
>>
>>
>> ______________________________ _________________
>> pgrouting-dev mailing list
>> pgrouting-dev at lists.osgeo.org <mailto:pgrouting-dev at lists.**
>> osgeo.org <pgrouting-dev at lists.osgeo.org>>
>> http://lists.osgeo.org/ mailman/listinfo/pgrouting-dev
>> <http://lists.osgeo.org/**mailman/listinfo/pgrouting-dev<http://lists.osgeo.org/mailman/listinfo/pgrouting-dev>
>> **>
>>
>>
>> ______________________________ _________________
>> pgrouting-dev mailing list
>> pgrouting-dev at lists.osgeo.org <mailto:pgrouting-dev at lists.**osgeo.org<pgrouting-dev at lists.osgeo.org>
>> >
>> http://lists.osgeo.org/ mailman/listinfo/pgrouting-dev
>> <http://lists.osgeo.org/**mailman/listinfo/pgrouting-dev<http://lists.osgeo.org/mailman/listinfo/pgrouting-dev>
>> **>
>>
>>
>>
>>
>> --
>> Regards,
>> -Jay Mahadeokar
>>
>>
>>
>> ______________________________**_________________
>> pgrouting-dev mailing list
>> pgrouting-dev at lists.osgeo.org
>> http://lists.osgeo.org/**mailman/listinfo/pgrouting-dev<http://lists.osgeo.org/mailman/listinfo/pgrouting-dev>
>>
>
> ______________________________**_________________
> pgrouting-dev mailing list
> pgrouting-dev at lists.osgeo.org
> http://lists.osgeo.org/**mailman/listinfo/pgrouting-dev<http://lists.osgeo.org/mailman/listinfo/pgrouting-dev>
>
--
Regards,
-Jay Mahadeokar
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.osgeo.org/pipermail/pgrouting-dev/attachments/20110803/81c4d324/attachment-0001.html
More information about the pgrouting-dev
mailing list