<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
Afternoon all,</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
I've got a database full of GPS points (along with associated data like speed, bearing, time, GPS_ID, class etc) and I'm trying to do complex aggregations with the data. I'm trying to build up a "heatmap" of the data by first creating a grid of polygons using ST_HexagonGrid
and then using a window function to overlay the lines (created from a lead/lag window of each GPS point and the next one from the same GPS_ID) over the grid. I'd like to get the the number of seconds that gps carrying vehicles spend in each hex cell, grouped
by class, speed, date, direction etc etc. The end goal would be to query a lon/lat and get a bunch of aggregated data for different classes, speed/bearing distributions. </div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
Here's a simplified look at the SQL (sorry, it's not really simple...):</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
'''</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
SELECT
<div> grid.gid,</div>
<div> grid.geom,</div>
<div> avg(traj.bearing, 511.0) AS avg_bearing,</div>
<div> avg(traj.time_delta) AS avg_time_delta,</div>
<div> sum(((st_length(st_intersection(traj.traj, grid.geom)) * traj.time_delta) / traj.traj_dist)) AS cum_time_in_grid</div>
<div>FROM (</div>
<div>(my_hex_grid AS grid LEFT JOIN ( SELECT</div>
<div> subquery.gps_id,</div>
<div><span style="color: rgb(0, 0, 0); background-color: rgb(255, 255, 255); display: inline !important;"> </span>subquery.event_date,</div>
<div><span style="color: rgb(0, 0, 0); background-color: rgb(255, 255, 255); display: inline !important;"> </span>subquery.bearing,
</div>
<div><span style="color: rgb(0, 0, 0); background-color: rgb(255, 255, 255); display: inline !important;"> </span>subquery.time_delta,</div>
<div><span style="color: rgb(0, 0, 0); background-color: rgb(255, 255, 255); display: inline !important;"> </span>st_makeline(subquery.pos, subquery.pos2) AS traj,</div>
<div><span style="color: rgb(0, 0, 0); background-color: rgb(255, 255, 255); display: inline !important;"> </span>st_distance(subquery.pos, subquery.pos2) AS traj_dist</div>
<div><span style="color: rgb(0, 0, 0); background-color: rgb(255, 255, 255); display: inline !important;"> </span>FROM ( </div>
<div><span style="color: rgb(0, 0, 0); background-color: rgb(255, 255, 255); display: inline !important;"> </span>SELECT </div>
<div><span style="color: rgb(0, 0, 0); background-color: rgb(255, 255, 255); display: inline !important;"> </span>gps.mmsi,</div>
<div><span style="color: rgb(0, 0, 0); background-color: rgb(255, 255, 255); display: inline !important;"> </span>date_part('epoch'::text, (lead(gps.event_time) OVER time_order - gps.event_time)) AS
<span style="color: rgb(0, 0, 0); background-color: rgb(255, 255, 255); display: inline !important;">
</span>time_delta,</div>
<div><span style="color: rgb(0, 0, 0); background-color: rgb(255, 255, 255); display: inline !important;"> </span>gps.geom,</div>
<div><span style="color: rgb(0, 0, 0); background-color: rgb(255, 255, 255); display: inline !important;"> </span>gps.bearing,</div>
<div><span style="color: rgb(0, 0, 0); background-color: rgb(255, 255, 255); display: inline !important;"> </span>lead(gps.geom) OVER time_order AS geom2</div>
<div><span style="color: rgb(0, 0, 0); background-color: rgb(255, 255, 255); display: inline !important;"> </span> FROM gps</div>
<div><span style="color: rgb(0, 0, 0); background-color: rgb(255, 255, 255); display: inline !important;"> </span>WHERE ((gps.event_time >= '<Start Time>') AND (gps.event_time <= '<End Time>'))</div>
<div><span style="color: rgb(0, 0, 0); background-color: rgb(255, 255, 255); display: inline !important;"> </span>WINDOW time_order AS (PARTITION BY gps.gps_id ORDER BY gps.event_time)) as subquery</div>
<div> ON (st_intersects(gps.traj, grid.geom)))</div>
GROUP BY grid.gid, grid.geom</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
'''</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
My issue is that I've got a non-linear increase in time that the query takes to complete. If <Start Time> to <End Time> is a couple of hours then it's takes a couple of seconds to run. If it's for a day, it takes a couple minutes to run. If it's for a week
it takes a couple of hours. </div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
I'd like to run this for over a year of data but that won't be feasible at this rate. </div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
Is there some way to avoid this non-linear increase in time or would it be best to just write some python code to loop through smaller chunks of data and write the results somewhere?</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
Regards,</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
Rory</div>
</body>
</html>