[postgis-users] Non-linear time cost; please suggest a better way to structure the query
Florian Nadler
florian.nadler at cybertec.at
Sat Feb 26 03:56:56 PST 2022
Hi Rory,
I recommend starting with the latest MobilityDB extension (the Docker image
seems to be outdated) and importing and partitioning the BerlinMOD workshop
data, as described in
https://docs.mobilitydb.com/MobilityDB-BerlinMOD/master/mobilitydb-berlinmod.pdf.
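In case it helps: once the MobilityDB packages are installed on the
server, enabling the extension in an existing PostGIS database should be
a single statement (a sketch, assuming a recent MobilityDB release):
CREATE EXTENSION MobilityDB CASCADE;  -- CASCADE also creates the PostGIS dependency if missing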
The query below should be a good starting point for your analysis: for a
given period, it outputs the average time cars remain in each region.
SELECT carid,
       regionid,
       avg(duration(
             atGeometry(
               atPeriod(T.Trip, '[2007-05-28 00:00:00+00, 2007-05-29 00:00:00+00]'::period),
               R.Geom)::tgeompoint))
FROM Trips T,
     Regions R
WHERE T.Trip && stbox(R.Geom, '[2007-05-28 00:00:00+00, 2007-05-29 00:00:00+00]'::period)
  AND st_intersects(
        trajectory(atPeriod(T.Trip, '[2007-05-28 00:00:00+00, 2007-05-29 00:00:00+00]'::period)),
        R.Geom)
GROUP BY carid, regionid
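The && predicate against the stbox can be supported by a spatiotemporal
index on the trips; if you do not have one yet, something along these
lines should help (a sketch, the index name is arbitrary):
-- GiST index on the temporal column so the && stbox filter can use it
CREATE INDEX Trips_Trip_gist_idx ON Trips USING gist (Trip);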
Just let me know if this works out for you.
Florian
On 24.02.2022 at 08:53, Rory Meyer wrote:
> I haven't looked at MobilityDB (although I will now). In addition to
> PostGIS I'm using TimeScaleDB to build "continuous aggregates" of the
> GPS data in order to get things like the latest position per hour, the
> average speed per day, etc. It's pretty handy for that, but it can't
> handle more complex aggregates that involve window functions, DISTINCT,
> trajectory creation and so on.
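>
> A continuous aggregate of that kind looks roughly like this (a sketch
> with made-up table/column names, assuming a TimescaleDB version where
> first()/last() are allowed in continuous aggregates):
>
> '''
> -- hourly rollup per transmitter: last reported position and average speed
> CREATE MATERIALIZED VIEW gps_hourly
> WITH (timescaledb.continuous) AS
> SELECT gps_id,
>        time_bucket('1 hour', event_time) AS bucket,
>        last(geom, event_time) AS last_pos,
>        avg(speed) AS avg_speed
> FROM gps
> GROUP BY gps_id, bucket;
> '''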
>
> I suppose the main part of my query is the portion that distributes the
> difference between the lead/lag and current event_times (the time
> delta) over the grid cells that the line built from the lead/lag points
> passes over:
> sum(((st_length(st_intersection(traj.traj, grid.geom)) *
> traj.time_delta) / traj.traj_dist))
> Does MobilityDB have a function that could help with this?
>
> I tried to stay away from using the trajectory data type since it is
> only (x,y,t) and what I really need is (x,y,t,m1,m2,m3....) so that I
> can distribute the speed/bearing/dataX over the hexagons, weighted by
> the time associated with each vertex.
>
> Regards,
> Rory
>
> ------------------------------------------------------------------------
> *From:* Florian Nadler <florian.nadler at cybertec.at>
> *Sent:* 24 February 2022 07:45
> *To:* PostGIS Users Discussion <postgis-users at lists.osgeo.org>; Rory
> Meyer <rory.meyer at VLIZ.be>
> *Subject:* Re: [postgis-users] Non-linear time cost; please suggest a
> better way to structure the query
>
> Hi,
>
>
> Apart from Paul's advice, have you considered using MobilityDB for this
> kind of spatial question?
>
>
> This also means creating trajectories out of GPS points, but it should
> simplify query design and reduce processing time, as the extension is
> built for exactly this kind of query.
>
>
> Check out
> https://docs.mobilitydb.com/MobilityDB-BerlinMOD/master/mobilitydb-berlinmod.pdf,
> chapter 3.3, which deals with quite similar analyses.
>
>
> Florian
>
>
> On 23.02.2022 at 15:14, Rory Meyer wrote:
>> Afternoon all,
>>
>> I've got a database full of GPS points (along with associated data
>> like speed, bearing, time, GPS_ID, class etc) and I'm trying to do
>> complex aggregations with the data. I'm trying to build up a
>> "heatmap" of the data by first creating a grid of polygons
>> using ST_HexagonGrid and then using a window function to overlay the
>> lines (created from a lead/lag window of each GPS point and the next
>> one from the same GPS_ID) over the grid. I'd like to get the
>> number of seconds that GPS-carrying vehicles spend in each hex cell,
>> grouped by class, speed, date, direction, etc. The end goal would
>> be to query a lon/lat and get a bunch of aggregated data for
>> different classes, speed/bearing distributions.
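>>
>> For reference, the grid itself is built with ST_HexagonGrid along these
>> lines (a sketch; edge size, extent and SRID are placeholders):
>>
>> '''
>> -- hexagonal grid covering the area of interest
>> CREATE TABLE my_hex_grid AS
>> SELECT row_number() OVER () AS gid,
>>        hex.geom
>> FROM ST_HexagonGrid(
>>        500,                                       -- edge length in SRID units
>>        ST_MakeEnvelope(440000, 5660000, 520000, 5720000, 32631)
>>      ) AS hex;
>> '''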
>>
>> Here's a simplified look at the SQL (sorry, it's not really simple...):
>>
>> '''
>> SELECT
>>     grid.gid,
>>     grid.geom,
>>     avg(traj.bearing) AS avg_bearing,
>>     avg(traj.time_delta) AS avg_time_delta,
>>     sum((st_length(st_intersection(traj.traj, grid.geom)) *
>>          traj.time_delta) / traj.traj_dist) AS cum_time_in_grid
>> FROM my_hex_grid AS grid
>> LEFT JOIN (
>>     SELECT
>>         subquery.gps_id,
>>         subquery.bearing,
>>         subquery.time_delta,
>>         st_makeline(subquery.geom, subquery.geom2) AS traj,
>>         st_distance(subquery.geom, subquery.geom2) AS traj_dist
>>     FROM (
>>         SELECT
>>             gps.gps_id,
>>             date_part('epoch', lead(gps.event_time) OVER time_order
>>                                - gps.event_time) AS time_delta,
>>             gps.geom,
>>             gps.bearing,
>>             lead(gps.geom) OVER time_order AS geom2
>>         FROM gps
>>         WHERE gps.event_time >= '<Start Time>'
>>           AND gps.event_time <= '<End Time>'
>>         WINDOW time_order AS (PARTITION BY gps.gps_id ORDER BY gps.event_time)
>>     ) AS subquery
>> ) AS traj
>>     ON st_intersects(traj.traj, grid.geom)
>> GROUP BY grid.gid, grid.geom
>> '''
>>
>> My issue is that the time the query takes to complete grows
>> non-linearly. If <Start Time> to <End Time> spans a couple of hours,
>> it takes a couple of seconds to run. If it spans a day, it takes a
>> couple of minutes. If it spans a week, it takes a couple of hours.
>>
>> I'd like to run this for over a year of data but that won't be
>> feasible at this rate.
>>
>> Is there some way to avoid this non-linear increase in time or would
>> it be best to just write some Python code to loop through smaller
>> chunks of data and write the results somewhere?
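>>
>> A chunked version of that would look roughly like this (a sketch; the
>> daily results table and the date range are made up, and segments that
>> cross a chunk boundary are simply dropped here):
>>
>> '''
>> -- accumulate the heavy aggregate one day at a time into a results table
>> DO $$
>> DECLARE
>>     d date;
>> BEGIN
>>     FOR d IN
>>         SELECT dd::date
>>         FROM generate_series('2021-01-01'::date, '2021-12-31'::date,
>>                              interval '1 day') AS dd
>>     LOOP
>>         INSERT INTO hex_heatmap_daily (day, gid, cum_time_in_grid)
>>         SELECT d, grid.gid,
>>                sum(st_length(st_intersection(traj.traj, grid.geom)) *
>>                    traj.time_delta / traj.traj_dist)
>>         FROM my_hex_grid AS grid
>>         JOIN (
>>             SELECT st_makeline(geom, lead(geom) OVER w) AS traj,
>>                    st_distance(geom, lead(geom) OVER w) AS traj_dist,
>>                    date_part('epoch',
>>                              lead(event_time) OVER w - event_time) AS time_delta
>>             FROM gps
>>             WHERE event_time >= d AND event_time < d + 1
>>             WINDOW w AS (PARTITION BY gps_id ORDER BY event_time)
>>         ) AS traj ON st_intersects(traj.traj, grid.geom)
>>         WHERE traj.traj_dist > 0
>>         GROUP BY grid.gid;
>>     END LOOP;
>> END $$;
>> '''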
>>
>> Regards,
>> Rory
>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at lists.osgeo.org
>> https://lists.osgeo.org/mailman/listinfo/postgis-users
> --
> CYBERTEC PostgreSQL International GmbH
> Römerstraße 19, A-2752 Wöllersdorf
> Web: https://www.cybertec-postgresql.com
--
CYBERTEC PostgreSQL International GmbH
Römerstraße 19, A-2752 Wöllersdorf
Web: https://www.cybertec-postgresql.com