[postgis-users] Creating trajectory/lines from millions of points[PostGIS]

toni hernández toni at sigte.udg.edu
Tue Dec 2 02:23:57 PST 2014


I am going to change the query to CTE structure to make it more 
readable. To know more about CTE take a look at 
http://www.postgresql.org/docs/8.4/static/queries-with.html

The CTE query is then like:

with
--first part
status_flag as(
select vehicle_id, geom, time, taxi_is_occupied, case when 
(taxi_is_occupied <> lag(taxi_is_occupied,1) over(order by time)) then 1 
else 0 end as taxi_status_change from gpx
)
,
--second part
track_group as (
select vehicle_id,geom, time, taxi_is_occupied, sum(taxi_status_change) 
over (order by time) as track_num from status_flag
)

--third part
select vehicle_id, st_makeline(geom), track_num, taxi_is_occupied from 
track_group
group by track_num,taxi_is_occupied, vehicle_id order by track_num



Yes. With the first part of the query (status_flag) you get the same 
amount of raws but with an extra column "taxi_status_change".
This extra column detects when a change in the taxi_is_occupied value 
occurs.


The second part (track_group), keeps the extra column but changes its 
value. The new value makes it possible to group (in the third part of 
the query) all the consecutive points with the same taxi_is_occupied value.



On 02/12/2014 08:59, Oliver Burgfeld wrote:
> Hi and thanks, that's exactly what I was looking for.
>
> But nevertheless it seems not to work as it's supposed to, I think.
>
> If I just run the first query to detect when the status changes, I get 
> a really strange result.
> I have at least the same amount of rows as in my original table.
> But the status does not change at every point of course, so the result 
> has to be lower. As I have only very basic knowledge of SQL, I really 
> don't know how to handle this strange behaviour.
>
> Or did I miss a thing here?
>
>
> Am Montag, 1. Dezember 2014 10:50:09 UTC+1 schrieb toni hernández:
>
>     Hi,
>
>     Maybe this is already solved but anyway....
>
>     As the "taxi_is_occupied" field  is boolean (or binary), if you
>     group by this field you will get only two multilinestrings for
>     each vehicle. One multilinestring when "taxi_is_occupied" is true,
>     and one multilinestring when is false.
>
>     If you want to get as many geometries as clients a taxi has done, 
>     then you can use the window functions "LAG". With this function
>     you can detect when the taxi changes from free to occupied and
>     viceversa
>
>     This sentence detects when taxi changes status:
>     select geom, time, status, case when (status <> lag(status,1)
>     over(order by time)) then 1 else 0 end as canvi from table
>
>     From there you can count the number of times the
>     "taxi_is_occupied" has changed.
>
>     select geom, time, status, sum(canvi) over (order by time) as
>     track_num from
>     (
>     select geom, time, status, case when (status <> lag(status,1)
>     over(order by time)) then 1 else 0 end as canvi from table
>     ) as foo
>
>
>     And finally , you can use the previous SELECT statement to create
>     all tracks for all taxis
>
>     select vehicleid, st_makeline(geom), track_num, status
>     from
>     (
>     select vehicleid,geom, time, status, sum(canvi) over (order by
>     time) as track_num from
>     (
>     select vehicleid, geom, time, status, case when (status <>
>     lag(status,1) over(order by time)) then 1 else 0 end as canvi from
>     table
>     ) as foo
>     ) as fooo
>     group by track_num, status, vehicleid
>     order by track_num
>
>
>
>     On 25/11/2014 19:16, Roxanne Reid-Bennett wrote:
>
>         On 11/25/2014 11:48 AM, Oliver Burgfeld wrote:
>
>             I also tried that and it works but it does not give me
>             those two columns in my new table. There are only id and
>             status inside.
>
>         From the quer below change
>         SELECT id, status, (ST_Dump(mylines)).geom
>         to
>
>         SELECT id, status, (ST_Dump(mylines)).geom, time_start, time_end
>
>         Something to keep an eye out for... Depending upon your
>         version of PostGIS and the underlying libraries, because we
>         were working with an older version of the underlying
>         libraries, I don't know if this is still a potential issue. 
>         We ran into issues with "stacked" points (2 GPS points with
>         the same coordinates), and GPS drift causing issues with the
>         linestrings being pretty goofy (jagged points in the
>         linestring that were clearly NOT what the vehicle did). We
>         wrote cleanup scripts to take care of those issues (that are
>         still in place).  I've not taken time to revisit the library
>         routines to see if they now handle those conditions cleanly.
>
>         Roxanne
>
>
>             Am Dienstag, 25. November 2014 17:39:21 UTC+1 schrieb
>             Brent Wood:
>
>                 as in my previous reply, I figured that would be useful...
>                 WITH multis AS (
>                 SELECT id, status,*min(timestamp) as time_start,
>                 max(timestamp) as time_end, *ST_MakeLine( point_geom
>                 ORDER BY timestamp) AS mylines
>                 FROM your_table
>                 GROUP BY id, status
>                 )
>                 SELECT id, status, (ST_Dump(mylines)).geom
>                 FROM multisBrent Wood
>                 ------------------------------------------------------------------------
>                 *From:* Oliver Burgfeld <oliver.... at gmail.com>
>                 *To:* postgi... at googlegroups.com
>                 *Cc:* pcr... at pcreso.com; postgi... at lists.osgeo.org;
>                 postgi... at lists.osgeo.org; remi... at gmail.com
>                 *Sent:* Wednesday, November 26, 2014 5:10 AM
>                 *Subject:* Re: [postgis-users] Creating
>                 trajectory/lines from millions of points[PostGIS]
>
>                 Thank you and all the others who were answering :)
>
>                 I tried that and it seems that its working.
>                 Nevertheless I only tried it with a small part of my
>                 data (round about 1 million rows out of ~500 million)
>                 but if it's working now, it should also work with the
>                 whole dataset.
>
>                 Is there a way to also include the time_field into the
>                 result? I created a new table with this statement
>                 given but there are only two columns (vehicleid and
>                 status) included.
>                 I know thats logical because I only included those two
>                 into my select clause but it would be great to not
>                 only order by time but also have a time column in my
>                 table.
>
>                 For example:
>
>                 vehicleid | status | time_start | time_end
>
>
>                 I hope its understandable and not to mixed up...
>
>                 Thanks!
>
>
>                 Am Dienstag, 25. November 2014 16:06:33 UTC+1 schrieb
>                 Rémi Cura:
>
>                     Hey, a small correction :
>                     ST_MakeLine is already an aggregate, and you may
>                     want to enforce the order inside the aggregate
>                     (see at the end).
>                     Another interesting point is the possiblity to pu
>                     somehting in the M value of each point of the
>                     line, for instance the time.
>                     This comes very handy when you want to extrat
>                     parts of the lines.
>
>
>                     So for instance for the first proposition :
>
>                     WITH multis AS (
>                     SELECT id, status,*ST_MakeLine( point_**geom ORDER
>                     BY time_field) *AS mylines
>                     FROM your_table
>                     GROUP BY id, status
>                     )
>                     SELECT id, status, (ST_Dump(mylines)).geom
>                     FROM multis
>
>                     Cheers,
>                     Rémi-c
>
>
>                     2014-11-25 9:53 GMT+01:00 Brent Wood
>                     <pcr... at pcreso.com>:
>
>                         or automatically get the start & end times for
>                         each trackline in the record like this:
>
>                         WITH multis AS (
>                         SELECT id, min(time_field) AS time_start,
>                         max(time_field) as time_end, status,
>                         ST_MakeLine(array_agg(point_ geom )) AS mylines
>                         FROM your_table
>                         GROUP BY id, status
>                         ORDER BY time_field
>                         )
>                         SELECT id, status, (ST_Dump(mylines)).geom
>                         FROM multis;
>
>
>                         Cheers,
>                            Brent Wood
>                         ------------------------------------------------------------------------
>                         *From:* Hugues François <hugues.... at irstea.fr>
>                         *To:* PostGIS Users Discussion
>                         <postgi... at lists.osgeo.org >
>                         *Sent:* Tuesday, November 25, 2014 8:13 PM
>                         *Subject:* Re: [postgis-users] Creating
>                         trajectory/lines from millions of points[PostGIS]
>
>                         Hello,
>                         In your case I would have try to make
>                         multilines for each taxi and each status (i.e.
>                         two multi by taxi) and then dump them into
>                         simple linestrings. All in a query that may
>                         look like this assuming you have a taxi id field:
>                         WITH multis AS (
>                         SELECT id, status,
>                         ST_MakeLine(array_agg(point_ geom )) AS mylines
>                         FROM your_table
>                         GROUP BY id, status
>                         ORDER BY time_field
>                         )
>                         SELECT id, status, (ST_Dump(mylines)).geom
>                         FROM multis
>                         You may want to add a time reference to your
>                         lines. To do this, you can add an extraction
>                         from your timestamp field (e.g. day or month)
>                         and add it into the WITH and to the group by
>                         clause.
>                         Hugues.
>                         *De :*postgis-us... at lists. osgeo.org
>                         [mailto:postgis-us...@ lists.osgeo.org] *De la
>                         part de* Oliver Burgfeld
>                         *Envoyé :* mardi 25 novembre 2014 07:09
>                         *À :* postgi... at lists.osgeo.org
>                         *Objet :* [postgis-users] Creating
>                         trajectory/lines from millions of points[PostGIS]
>
>
>                         Hi,
>                         I have millions of points in a PostGIS
>                         database containing taxi gps tracks. Now I
>                         want to create lines from these points by
>                         vehicleid and ordered by timestamp. But, and
>                         that's my problem right now, at first I want
>                         to include every column of my point table into
>                         the "line table" and I also need to intersect
>                         those lines at specific points.
>                         I have one column representing the
>                         "taxi_is_occupied" status with 0 or 1.
>                         What I want now is to create lines which are
>                         divided every time this status changes. In the
>                         end I need lines which show the path of every
>                         taxi over time, divided every time the status
>                         of the car changes so that I can query all
>                         lines where the taxi is occupied, for example.
>                         What do I have to use therefore? I know that
>                         there is the ST_MakeLines tool existing in
>                         PostGIS, but as I am a new PostGIS
>
>     ...
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20141202/fa3fe00c/attachment.html>


More information about the postgis-users mailing list