[postgis-users] PL/PGSQL for Data Smoothing

David William Bitner david.bitner at gmail.com
Mon Mar 16 11:32:06 PDT 2009

Hey all,

I've been implementing a couple PL/PGSQL routines to help clean up some of
our data.  The data I am cleaning up is coming from a multilateration system
for aircraft flight tracks.  The nature of this type of system is that there
are a number of reflections and other artifacts that make their way into the
data, causing very jagged tracks.  As this particular data source produces
one data point per second, getting the data down to a manageable size is also
a concern.  Below are a couple of PL/PGSQL functions that I use in series that
seem to be a good compromise for removing outliers, smoothing the data, and
generalizing. I thought that this may either be of interest to others or
that someone else may have some ideas for refinement or performance
improvements.  It's still a work in progress, but I plan on documenting on
the wiki once I get things a bit further.

Outlier removal:
Due to the nature of the data, I know an aircraft cannot make an extremely
sharp turn within the one second data refresh time. When I get acute angles
showing up in the data I have found that it is a fairly safe bet that the
point at that acute angle is likely an outlier.  This function removes those
points, given a tolerance in degrees on the difference between the azimuths
of the two segments that meet at the point (I can draw a picture if anyone is
really interested).

CREATE OR REPLACE FUNCTION scrub_angle(geometry, integer)
  RETURNS geometry AS
$BODY$
select makeline(p2) from (
select n, p1, p2, p3 from (
select
 generate_series(1,ST_NUMPOINTS($1)) n,
 ST_POINTN($1,generate_series(1,ST_NUMPOINTS($1))-1) p1,
 ST_POINTN($1,generate_series(1,ST_NUMPOINTS($1))) p2,
 ST_POINTN($1,generate_series(1,ST_NUMPOINTS($1))+1) p3
) as bar
where abs(round(degrees(ST_AZIMUTH(p1,p2)-ST_AZIMUTH(p2,p3))))::int % 180 < $2
   or p1 is null or p3 is null
order by n asc) as baz;
$BODY$
  LANGUAGE 'sql' VOLATILE
  COST 100;
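For anyone who wants to play with the angle test outside the database, here
is a small Python restatement of the where-clause logic.  The function and
variable names are just for illustration; this is a sketch of the idea, not
the function itself:

```python
import math

def azimuth(a, b):
    """Bearing from a to b in degrees, measured like ST_Azimuth
    (clockwise from north)."""
    return math.degrees(math.atan2(b[0] - a[0], b[1] - a[1]))

def scrub_angle(points, tol_deg):
    """Keep only vertices whose incoming/outgoing azimuth difference,
    rounded and taken mod 180, stays under tol_deg, mirroring the SQL
    where-clause; endpoints are always kept (the null-neighbor case).
    points is a list of (x, y) tuples."""
    kept = [points[0]]
    for p1, p2, p3 in zip(points, points[1:], points[2:]):
        diff = abs(round(azimuth(p1, p2) - azimuth(p2, p3))) % 180
        if diff < tol_deg:
            kept.append(p2)
    kept.append(points[-1])
    return kept
```

With a tolerance of 50, a one-point spike like (3, 1) in an otherwise
east-west track is dropped while its neighbors survive.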

Data Smoothing:
This is a variation (at least in idea) of the McMaster Distance Weighted
Smoothing Algorithm.  Due to the nature of the data having M values
throughout, it is weighted by time difference rather than distance.  It also
places a cap (hard-coded to 200 right now) on the distance that any point
will be moved.

-- Note: the CREATE line and the makepoint() arguments were scrubbed from the
-- archived message; the name and the weighting expression below (inverse
-- time-difference weights on the four neighbors) are a reconstruction.
CREATE OR REPLACE FUNCTION smooth(geometry)
  RETURNS geometry AS
$BODY$SELECT makeline(pt) from (
select n, case when (n between 2 and t-2) and m(p1)<m(p2) and m(p2)<m(p3) and
m(p3)<m(p4) and m(p4)<m(p5) then setsrid(makepointm(
 x(p3)+greatest(-200.0,least(200.0,
  (x(p1)*w1+x(p2)*w2+x(p4)*w4+x(p5)*w5)/(w1+w2+w4+w5)-x(p3))),
 y(p3)+greatest(-200.0,least(200.0,
  (y(p1)*w1+y(p2)*w2+y(p4)*w4+y(p5)*w5)/(w1+w2+w4+w5)-y(p3))),
 m(p3)),26915) else setsrid(p3,26915) end as pt
from (
select n, t, p1, p2, p3, p4, p5,
 1/(m(p3)-m(p1)) w1, 1/(m(p3)-m(p2)) w2,
 1/(m(p4)-m(p3)) w4, 1/(m(p5)-m(p3)) w5
from (
select
 generate_series(1,ST_NUMPOINTS($1)) n,
 st_numpoints($1) t,
 ST_POINTN($1,generate_series(1,ST_NUMPOINTS($1))-2) p1,
 ST_POINTN($1,generate_series(1,ST_NUMPOINTS($1))-1) p2,
 ST_POINTN($1,generate_series(1,ST_NUMPOINTS($1))) p3,
 ST_POINTN($1,generate_series(1,ST_NUMPOINTS($1))+1) p4,
 ST_POINTN($1,generate_series(1,ST_NUMPOINTS($1))+2) p5
) as raw) as bar
order by n asc) as baz; $BODY$
  LANGUAGE 'sql' VOLATILE
  COST 100;
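The smoothing step can also be sketched outside PostGIS.  Since the
weighted-average arguments did not survive the archive intact, the weighting
below (inverse time difference, per-axis clamp) is an assumption rather than
a copy of the original, and the names are invented:

```python
def smooth(track, cap=200.0):
    """Pull each interior vertex toward a time-weighted mean of its four
    nearest neighbors (two on each side), clamping how far it may move.

    track is a list of (x, y, m) tuples with strictly increasing m (time).
    The inverse time-difference weighting and the per-axis clamp are
    assumptions standing in for the expression in the SQL function."""
    out = []
    n = len(track)
    for i, (x, y, m) in enumerate(track):
        # Only points with two neighbors on each side get smoothed,
        # mirroring the "n between 2 and t-2" / null-neighbor guard.
        if 2 <= i <= n - 3:
            nbrs = [track[i - 2], track[i - 1], track[i + 1], track[i + 2]]
            ws = [1.0 / abs(pm - m) for _, _, pm in nbrs]
            wx = sum(w * px for w, (px, _, _) in zip(ws, nbrs)) / sum(ws)
            wy = sum(w * py for w, (_, py, _) in zip(ws, nbrs)) / sum(ws)
            dx = max(-cap, min(cap, wx - x))  # cap the move per axis
            dy = max(-cap, min(cap, wy - y))
            out.append((x + dx, y + dy, m))
        else:
            out.append((x, y, m))
    return out
```

The cap keeps a wildly reflected point from dragging the track with it even
when it survives the outlier pass.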

With the type of data that I am using I am finding that putting everything
together in outlier remove->smooth->simplify leaves me with very
aesthetically "correct" looking tracks while also maintaining actual
accuracy as measured against a GPS-derived dataset that I have available for
comparison for a limited set of the data.  Right now, I am using this in a
nested fashion like: select ... from table ....  The nested scrub_angle calls
help to add a bit of recursion
for when the removal of one point leaves another sharp angle.  So far in my
testing this allows for a six-fold reduction in the number of points and
cleaner looking tracks while not degrading the RMS value when comparing both
the cleaned data and the original data to the reference data where
available.  Hope this is of interest to somebody.
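The nested-call trick generalizes to running any single cleaning pass until
it stops changing the input.  A minimal, hypothetical sketch (the names are
invented, and drop_peaks is just a toy stand-in for a pass like scrub_angle):

```python
def iterate_to_fixpoint(clean_pass, points, max_passes=10):
    """Apply a single cleaning pass repeatedly until it stops changing
    the input (or max_passes is hit) -- the same effect the nested
    scrub_angle calls approximate with a fixed depth."""
    for _ in range(max_passes):
        cleaned = clean_pass(points)
        if cleaned == points:
            break
        points = cleaned
    return points

def drop_peaks(vals):
    """Toy single pass: drop interior values strictly greater than both
    neighbors.  Removing one peak can expose another, so several passes
    may be needed -- just as removing one sharp-angle point can leave
    a new sharp angle behind."""
    return [v for i, v in enumerate(vals)
            if i == 0 or i == len(vals) - 1
            or not (vals[i - 1] < v > vals[i + 1])]
```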


