[pgrouting-dev] Time dependent data and input format
Jay Mahadeokar
jai.mahadeokar at gmail.com
Fri Jul 29 13:30:37 EDT 2011
On Fri, Jul 29, 2011 at 10:50 PM, Stephen Woodbridge <
woodbri at swoodbridge.com> wrote:
> On 7/29/2011 1:03 PM, Jay Mahadeokar wrote:
>
>>
>>
>> On Fri, Jul 29, 2011 at 9:54 PM, Stephen Woodbridge
>> <woodbri at swoodbridge.com <mailto:woodbri at swoodbridge.**com<woodbri at swoodbridge.com>>>
>> wrote:
>>
>> On 7/29/2011 11:49 AM, Jay Mahadeokar wrote:
>>
>> Hi Steve,
>>
>> On Fri, Jul 29, 2011 at 6:26 PM, Stephen Woodbridge
>> <woodbri at swoodbridge.com <mailto:woodbri at swoodbridge.**com<woodbri at swoodbridge.com>
>> >
>> <mailto:woodbri at swoodbridge. com
>> <mailto:woodbri at swoodbridge.**com <woodbri at swoodbridge.com>>>>
>> wrote:
>>
>> Jay,
>>
>> This make sense. What I'm trying to understand and get to is
>> how do
>> we make this easy to setup and use. It seems like we are
>> missing
>> something somewhere. For example, the cyclical range of a
>> single
>> entry can be different from other entries, think week days
>> versus
>> weekends. It should be possible to have some cyclic entries
>> and some
>> not cyclic. Based on this it seems that the cyclic flag
>> needs to be
>> set on each record along with the period of the cycle. Would
>> you agree?
>>
>>
>> The main motivation behind all this brain storming is to keep
>> the core
>> algorithm and the postgres c function implementation,
>> independent of the
>> nature of data, right? So, we need to somehow abstract the data
>> dependent properties from the core functionality.
>>
>> That was the reason why we came up with the plsql wrapper functions
>> idea. For optimising again, we need to communicate the cyclic
>> properties
>> to core algorithm, but the interface should be independent of
>> nature of
>> data.
>>
>> If this is the case, then a quick scan of the data would
>> tell us if
>> there are cyclic entries, in which case a simple function to
>> scan
>> the data needed and see if any entry is cyclic could return
>> that
>> flag. But if every entry is marked as cyclic or not then you
>> might
>> not need that flag at all.
>>
>>
>> What I feel is: if we scan the data and keep the logic of
>> determining
>> the cyclic nature within the postgres C function, then the above
>> aim
>> will somehow get lost and it will again become a data format
>> specific
>> function.
>>
>> So, I think it will be better if the application developers deal
>> with
>> the data themselves so that they can achieve the optimisation.
>> Example -
>> for weekday, weekend, the plsql function will provide specific
>> values
>> for each day and mark data as non cyclic if there is transition
>> from
>> weekday to weekend and vice versa. (A journey more than few days is
>> highly unlikely, so anyways it wont repeat the data too much,
>> though it
>> will not be optimal)
>>
>> If the journey is within weekday or within weekend, then we can
>> mark
>> data as cyclic, and just retrieve 24 hour data.
>>
>> This is specific to example data you specified. I believe such
>> improvisations will be applicable for most data formats. Thoughts?
>>
>> Again, this is not the most optimal implementation, but still it
>> keeps
>> the core implementation independent of data format. Any other
>> ideas?
>>
>>
>> Jay,
>>
>> I think you are correct in this. This will require some real world
>> examples of data and how to work with it. The abstraction makes
>> sense but it is hard to map it to reality hence our discussions.
>>
>> So if we propose a hypothetical example like:
>>
>> 1. roadway which it one way between 7-10am north bound on mon-fri
>> 2. same roadway is one way between 4-7pm south bound on mon-fri
>> 3. same roadway is two otherwise
>>
>> A separate type of rule might be:
>>
>> 4. there was an accident at 11am causing reduced speeds for the next
>> 30 mins
>>
>> we could define a table like:
>>
>> time_dependency
>> rule_id:seq:day_of_week:start_ time:end_time:from_cost:to_
>> cost:expiry_date
>> 1:1:2,3,4,5,6:0700:1000:30:-1:
>> 1:2:2,3,4,5,6:1600:1900:-1:30:
>> 1:3:1,7:0000:2400:45:45:
>> 2:1:0:1100:1130:20:20:20110729
>>
>> I hope that is clear.
>> sun-sat === 1-7 for day_of_week
>> cost -1 means you can travel that direction
>>
>> then in the edges table, you can link the edge to time_dependency
>> table via the rule_id
>>
>> This is not perfect because, from_cost and to_cost imply a direction
>> of digitization of the segment which might be reversed on some
>> segments. You might have multiple rules associated with a single
>> segment. I'm thinking of a "rule" as a single description of some
>> conditions that might get reused by multiple edges in the graph. So
>> we might want a linking table like:
>>
>> edge_id:rule_id
>>
>> that will support multiple rules to be associated.
>>
>> Then the stored procedure could be written to query these tables
>> based on start time, elapsed time from start, and edge_id that could
>> return the cost of a single edge. Or you would need to write a
>> stored procedure to transform these tables into whatever structure
>> you need for the routing.
>>
>>
>> Hi Steve,
>>
>> Thanks for coming up with the example. It fits well in order to describe
>> our current plan. So, as proposed we will assume our data is in format
>> (instead of creating ids and seq separately im keeping gid as the id
>> like in ways table):
>>
>> gid: day_of_week: start_time: end_time: from_cost: to_cost: expiry_date
>>
>> Now, the application developer dealing with the data will write plsql
>> function which will query the above table and return set of rows:
>>
>> gid, start_time, end_time, travel_time as expected by the tdsp query.
>>
>> Now, if the data format is such that there are rules associated to
>> specific edges, and there are linking tables etc as you mentioned, then
>> it would be the responsibility of the application developer to come up
>> with join queries etc that will be needed in the plsql wrapper function.
>> We will just expect data in the above format. And we can also keep track
>> of cyclic nature as discussed earlier. There can be many other formats
>> and many other linking table possibilities, but we should not be
>> responsible for those. That was the reason of deciding the input data
>> format during the pre-coding period.
>>
>> I hope I am understanding your point and answering relevantly. Thanks
>> again for coming up with the users perspective. This is really helping
>> me to understand the real-world scenario which I had not thought about
>> particularly. Any other comments?
>>
>
> I think you have it now and I understand it also. The application developer
> can have whatever structure they need to represent the data they have and
> link it to the edges, and then they have to write a procedure that will
> transform that into the record set you defined above that the engine needs.
>
> In the documentation, you can copy my example and/or simplify it a little.
> It is fine to discuss points that they need to consider like direction of
> digitization or how to handle transient one way streets, because these are
> things that people might need to deal with and it gives them an idea of how
> to deal with other data problems that they might be confronted with.
>
>
Sure! I will do that in last two weeks I think. If you can list such points
that we should document, I can come up with details of mapping each scenario
to the required format.
> It would be nice if we could get some real time dependent data and work
> through building a sample application the uses it. Maybe we will have time
> after GSoC to do that.
>
I have been trying to get real world time dependent data since a long time
now. Still I have not found a good data source. It would be great if we can
acquire that, and yes, we can do that after GSoc too!
>
> Thanks,
> -Steve
>
> I just trying to put this into a concrete example so we can work
>> through how this works and to help understand how we need to
>> document it so people can use it in a real world example.
>>
>> Thoughts?
>>
>> -Steve
>>
>> May be I'm missing something because I have not had a full
>> cup of
>> tea yet :)
>>
>> Thoughts,
>> -Steve
>>
>>
>> On 7/29/2011 8:39 AM, Jay Mahadeokar wrote:
>>
>>
>>
>> On Fri, Jul 29, 2011 at 5:50 PM, Stephen Woodbridge
>> <woodbri at swoodbridge.com <mailto:woodbri at swoodbridge.**com<woodbri at swoodbridge.com>
>> >
>> <mailto:woodbri at swoodbridge. com <mailto:woodbri at swoodbridge.**com<woodbri at swoodbridge.com>
>> >>
>> <mailto:woodbri at swoodbridge <mailto:woodbri at swoodbridge>. com
>>
>>
>> <mailto:woodbri at swoodbridge. com
>> <mailto:woodbri at swoodbridge.**com <woodbri at swoodbridge.com>>>>>
>> wrote:
>>
>> On 7/29/2011 5:49 AM, Jay Mahadeokar wrote:
>>
>> Hi all,
>>
>> I was looking at ways to optimise the tdsp
>> implementation.
>>
>> Currently, if the time-interval is suppose 24, (24
>> hours) and
>> the travel
>> time expected is around 60 hours, then the
>> tdsp-wrapper
>> query
>> retrieves
>> total 60 intervals and the data is repeated for
>> the 24 hour
>> intervals.
>> (I hope what i am trying to say is clear).
>>
>> This is unnecessary space wastage. So, I edited the
>> weight_map
>> class to
>> accommodate a variables is_cyclic and the
>> cycle_interval, which will
>> keep track of the cyclic nature of the data. So,
>> actually only
>> the 24
>> hour data will be fed to weight map and then it
>> will be
>> reused
>> in cycles.
>>
>> Because of the good design, I guess this has
>> scaled up
>> really
>> nice and
>> there is no need to modify actual core tdsp.
>> Just the
>> weight_map's
>> get_travel_time() function and the wrapper plsql
>> function needed
>> to be
>> altered.
>>
>>
>> This makes perfect sense and sounds like a great
>> optimization.
>>
>>
>> Now, in the main query, I am thinking of adding
>> one more
>> boolean
>> variable - is_cyclic which will be enabled if
>> the data
>> is cyclic in
>> nature. Should we keep this? Or we should
>> assume that
>> the data
>> will be
>> always cyclic.
>> Any other views on the same are welcome.
>>
>>
>> I guess my questions are:
>> How does this get set?
>> What are the advantages of setting it?
>> Is there a way to dynamically check if it is cyclic
>> at the
>> start and
>> avoid having the app builder set it up?
>>
>>
>> Advantages of setting it:
>> If the data is not cyclic, then the whole data will be
>> expected to
>> reside in database itself. So, if the upper bound of the
>> time is
>> reached
>> then by default, we can assume that the travel_time
>> corresponding to the
>> last existing time window should be used for such cases.
>>
>> If data is cyclic, then we need to start all over again.
>> These
>> are the
>> cases, what I could think of.
>>
>> How does this get set:
>> The tdsp query must have an additional parameter -
>> is_cyclic which
>> should be used to set the is_cyclic flag of the weight_map.
>>
>> Now, if data is cyclic, then the plsql function that
>> retrieves
>> the data
>> should be written in such a way that if interval is
>> suppose 0 -
>> 24, and
>> start time is 21, then 21->0 and 20->23. After that the
>> cyclic flag
>> will indicate us to loop again. The plsql function will be
>> generally
>> written by app developer and he should take care of
>> this. (I have
>> already implemented this in my code.)
>>
>> If data is cyclic, he should pass the flag appropriately.
>> If
>> is_cyclic
>> is false, we can assume that if intervals are exhausted,
>> we will
>> keep
>> using time corresponding to last interval.
>>
>> Is this reasonable?
>>
>> I suppose it is not a big deal for the app builder
>> to set
>> this flag
>> since he will know if the data loaded is cyclic in
>> nature. I
>> assume
>> you are only looking for some indication if there
>> are cyclic
>> entries
>> in the data and not if this specific query has a cyclic
>> nature which
>> would be impossible to know without analyzing the
>> start time
>> and max
>> time window, in which case we should figure that out
>> automatically
>> because the user making the request is not likely to
>> know
>> anything
>> more than start, end and start time or end time.
>>
>> -Steve
>>
>> --
>> Regards,
>> -Jay Mahadeokar
>>
> ______________________________**_________________
> pgrouting-dev mailing list
> pgrouting-dev at lists.osgeo.org
> http://lists.osgeo.org/**mailman/listinfo/pgrouting-dev<http://lists.osgeo.org/mailman/listinfo/pgrouting-dev>
>
--
Regards,
-Jay Mahadeokar
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.osgeo.org/pipermail/pgrouting-dev/attachments/20110729/c7e58d75/attachment-0001.html
More information about the pgrouting-dev
mailing list