<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 14 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
{font-family:"Comic Sans MS";
panose-1:3 15 7 2 3 3 2 2 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal-reply;
font-family:"Comic Sans MS";
color:windowtext;
font-weight:normal;
font-style:normal;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri","sans-serif";}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:8.0pt;font-family:"Comic Sans MS"">Hi Brent,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:8.0pt;font-family:"Comic Sans MS""><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:8.0pt;font-family:"Comic Sans MS"">Very interesting. I’m getting ready to open source my product as a configuration set of MapServer, GeoMoose OpenLayers and Postgres/PostGIS, maybe some room for cross pollination
here.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:8.0pt;font-family:"Comic Sans MS""><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:8.0pt;font-family:"Comic Sans MS"">We’ve just started collecting data (compared to your description) about 8 months ago. Our installation is slightly different in that it handles up to 300+ vehicle tracks at a time,
with average daily numbers at 120-150 trails. We’re averaging 1 million new records a month, with snow emergencies increasing that number to 2 to 3 time. We’re just short of 19 million records which are collected about every 15-20 seconds. The reporting
is staggered from the field to reduce overloading. We have a few vehicles running at 1 & 3 sec reporting times, which are a whole other deal. This may be more the norm moving forward.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:8.0pt;font-family:"Comic Sans MS""><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:8.0pt;font-family:"Comic Sans MS"">We also have a variety of sensors integrated into each device that is transmitted back, so the table structure gets a bit more complicated, but still, we think, in a flexible and
manageable manner.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:8.0pt;font-family:"Comic Sans MS""><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:8.0pt;font-family:"Comic Sans MS"">I haven’t actually gotten to the point of separating anything out yet. Everything is running off the raw table, but I know I’ll need to figure something out eventually.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:8.0pt;font-family:"Comic Sans MS""><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:8.0pt;font-family:"Comic Sans MS"">We have very similar usage needs as well, most all queries will be time related which so far has been fine until more than a few days are queried from (100 records or so). Our
desired response times are in the 3-4 sec range and built around the idea of doing on the fly reporting via the web. I understand that I may need to set up some predefined long running functions at some point, but I’ve avoided it so far.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:8.0pt;font-family:"Comic Sans MS""><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:8.0pt;font-family:"Comic Sans MS"">I’ve pondered all sorts of approaches, one that’s probably overkill but very automatable would be to use a table domain structure per vehicle, separate tables for collecting per
vehicle. This allows for all sorts of control and granularity down to the individual asset, but seems like overkill on the surface. It start to get into some interesting items with respect to adding new assets automatically as well.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:8.0pt;font-family:"Comic Sans MS""><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:8.0pt;font-family:"Comic Sans MS"">Anyway enough about us. how might we make our systems complement each other?<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:8.0pt;font-family:"Comic Sans MS""><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:8.0pt;font-family:"Comic Sans MS"">Bobb<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:8.0pt;font-family:"Comic Sans MS""><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:8.0pt;font-family:"Comic Sans MS""><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:8.0pt;font-family:"Comic Sans MS""><o:p> </o:p></span></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt">
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> pcreso@pcreso.com [mailto:pcreso@pcreso.com]
<br>
<b>Sent:</b> Friday, March 01, 2013 6:06 PM<br>
<b>To:</b> PostGIS Users Discussion<br>
<b>Cc:</b> Basques, Bob (CI-StPaul)<br>
<b>Subject:</b> Re: [postgis-users] design problem<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<table class="MsoNormalTable" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td valign="top" style="padding:0in 0in 0in 0in">
<p class="MsoNormal">Hi Bob,<br>
<br>
This may be of interest.<br>
<br>
We do exactly this with some 350,000,000 million (& growing) GPS records for instrument readings from the research vessel Tangaroa. Every reading has both a time & a Postgis position value. For ease of plotting, the points for each day are also aggregated into
a linestring, the daily vessel track.<br>
<br>
Each year is a new table, which is a partition in the parent. As users are normally querying data within a specified interval, the historic partitions also use a clustered index on timestamp, so that blocks read from disk are likely to include multiple required
records in each read, further improving performance. The current year's partition does not have a clustered index, so inserts do not require the index to be rebuilt.<br>
<br>
We have added a further field to ease user access. Shorter period queries are usually for readings at a fine resolution (1-10 minutes), long period (20 years say) may want daily or hourly readings.<br>
<br>
It has worked very well for us for several years now, & we have a GUI sitting on top of it with mapserver enabling a map based view of the vessel track, optionally coloured by the selected reading. The data is also downloadable at the users selected interval
over the specified period. This is not formally released as Open Source, but we have no problems providing anyone the source if they are interested. It was developed by an external contractor, so support is also available if useful. Ditto the db documentation.<br>
<br>
The underlying raw data are in netCDF tables generated by the IFREMER TECHSAS application. The Postgis summary tables, which are adequate for almost all user purposes, are at 1 minute granularity. Records with a timestamp at 1 minute (eg: 12:01) have a timer
value of 1, 2 minutes (12:02; 12:04; 12:06; 12:08) are 2, 5 minutes (12:05) are 4, 10 minutes (12:10; 12:20) are 8, 15 minutes (12:15; 12:45) are 16, 30 minutes are 32, 1 hr (01:00; 03:00; ...) are 64, 2 hr (02:00; 04:00; ...) are 128, 12 hr (12:00) are 256,
midnight (00:00) are 512 (or something like this approach).<br>
<br>
A user can then get 12 hourly records simply by selecting where (timer = 256 or timer = 512) for a very fast result. A five minute result by where (timer >= 4).
<br>
<br>
10 minutes is a bit trickier, because users wanted a 15 minute option, but not hard: where (timer >=8 and timer != 16).<br>
<br>
This was developed to meet our user needs, & may or may not be useful for your purposes...
<br>
<br>
HTH,<br>
<br>
Brent Wood<br>
<br>
--- On <b>Sat, 3/2/13, Basques, Bob (CI-StPaul) <i><<a href="mailto:bob.basques@ci.stpaul.mn.us">bob.basques@ci.stpaul.mn.us</a>></i></b> wrote:<o:p></o:p></p>
<p class="MsoNormal" style="margin-bottom:12.0pt"><br>
From: Basques, Bob (CI-StPaul) <<a href="mailto:bob.basques@ci.stpaul.mn.us">bob.basques@ci.stpaul.mn.us</a>><br>
Subject: Re: [postgis-users] design problem<br>
To: "PostGIS Users Discussion" <<a href="mailto:postgis-users@lists.osgeo.org">postgis-users@lists.osgeo.org</a>><br>
Date: Saturday, March 2, 2013, 11:31 AM<o:p></o:p></p>
<div>
<p class="MsoNormal">Steve,<br>
<br>
Could this process of inherited tables be used to roll up GPS data by time increments for example? Maybe roll up the data by day, week or month? I need to figure out a way to handle queries potentially against millions of records for reporting purposes.<br>
<br>
Bobb<br>
<br>
<br>
<br>
>> -----Original Message-----<br>
>> From: <a href="/mc/compose?to=postgis-users-bounces@lists.osgeo.org">postgis-users-bounces@lists.osgeo.org</a> [mailto:postgis-<br>
>> <a href="/mc/compose?to=users-bounces@lists.osgeo.org">users-bounces@lists.osgeo.org</a>] On Behalf Of Stephen Woodbridge<br>
>> Sent: Friday, March 01, 2013 4:23 PM<br>
>> To: <a href="/mc/compose?to=postgis-users@lists.osgeo.org">postgis-users@lists.osgeo.org</a><br>
>> Subject: Re: [postgis-users] design problem<br>
>> <br>
>> On 3/1/2013 4:44 PM, Andy Colson wrote:<br>
>> > On 3/1/2013 3:11 PM, Denise Janson wrote:<br>
>> >> Hi,<br>
>> >><br>
>> >> I have an application that is going to receive lots of<br>
>> georeferenced<br>
>> >> files every day. Each file has information of several points.<br>
>> >> Probably in few years my application will have a Tera of<br>
>> points<br>
>> >> information stored.<br>
>> >><br>
>> >> I think I can do this design in two ways:<br>
>> >><br>
>> >> 1. Two tables, one of “uploaded_files”, and another of<br>
>> “points” (one<br>
>> >> uploadedFile to N points). And I'll have to partition the<br>
>> points<br>
>> >> table, maybe by month … 2. Or I can create one table per file,<br>
>> having<br>
>> >> thousands of tables in few years.<br>
>> >><br>
>> >> Which case is better for my application? Is there any better<br>
>> way to<br>
>> >> do this?<br>
>> >><br>
>> ><br>
>> > If performance is a concern, and the file's are of any<br>
>> meaningful<br>
>> > size, you might consider leaving them on the filesystem and<br>
>> have the<br>
>> > table point to it (full path name sort of thing).<br>
>> ><br>
>> > Storing the file in PG is possible, and its nice because<br>
>> everything is<br>
>> > kept together, but if you have to get to and read the files<br>
>> fast, then<br>
>> > leave them on the filesystem.<br>
>> ><br>
>> > The lots of tables approach is problematic if you ever want to<br>
>> write<br>
>> > queries that look back in time. Its much harder to say, give<br>
>> me every<br>
>> > record from the beginning of time at this point.<br>
>> ><br>
>> > With a good index, PG wont have a problem with a single table<br>
>> > containing billions of rows. Just try to avoid doing bulk<br>
>> operations<br>
>> > (like update and delete) on the entire table.<br>
>> ><br>
>> ><br>
>> > > uploadedFile to N points). And I'll have to partition the<br>
>> points<br>
>> > table,<br>
>> ><br>
>> ><br>
>> > Why will you have to partition it?<br>
>> <br>
>> you might want to consider using inherited tables. This you can<br>
>> have something like:<br>
>> <br>
>> master_table<br>
>> - table1 inherits from master_table<br>
>> - table2 inherits from master_table<br>
>> - etc<br>
>> <br>
>> This has the advantange that you can set constraints on the sub-<br>
>> tables like date_from, data_to or other constraints that you<br>
>> might need in your queries.<br>
>> <br>
>> Then when you make you query on the master_table if will<br>
>> eliminate all the tables that fail the constraint test and this<br>
>> is very fast. Also if you ever need to make adhoc queries on the<br>
>> master_table you still have a structure that supports that.<br>
>> <br>
>> There might ne some issues with inheriting 10 of 1000s of tables.<br>
>> <br>
>> The real answer to your design can only be answered by<br>
>> understanding what your queries are going to look like with<br>
>> respect to all this data.<br>
>> <br>
>> -Steve<br>
>> <br>
>> _______________________________________________<br>
>> postgis-users mailing list<br>
>> <a href="/mc/compose?to=postgis-users@lists.osgeo.org">postgis-users@lists.osgeo.org</a><br>
>> <a href="http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users" target="_blank">
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users </a><br>
_______________________________________________<br>
postgis-users mailing list<br>
<a href="/mc/compose?to=postgis-users@lists.osgeo.org">postgis-users@lists.osgeo.org</a><br>
<a href="http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users" target="_blank">http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users</a><o:p></o:p></p>
</div>
</td>
</tr>
</tbody>
</table>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Calibri","sans-serif""><o:p> </o:p></span></p>
</div>
</div>
</body>
</html>