<html>

<head>

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=US-ASCII">

<meta name=Generator content="Microsoft Word 11 (filtered)">

<style>

<!--

 /* Font Definitions */

 @font-face

        {font-family:Tahoma;

        panose-1:2 11 6 4 3 5 4 4 2 4;}

 /* Style Definitions */

 p.MsoNormal, li.MsoNormal, div.MsoNormal

        {margin:0in;

        margin-bottom:.0001pt;

        font-size:12.0pt;

        font-family:"Times New Roman";}

a:link, span.MsoHyperlink

        {color:blue;

        text-decoration:underline;}

a:visited, span.MsoHyperlinkFollowed

        {color:purple;

        text-decoration:underline;}

span.EmailStyle17

        {font-family:Arial;

        color:navy;}

@page Section1

        {size:8.5in 11.0in;

        margin:1.0in 1.25in 1.0in 1.25in;}

div.Section1

        {page:Section1;}

-->

</style>

</head>

<body bgcolor=white lang=EN-US link=blue vlink=purple>

<div class=Section1>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'>Hi, all:</span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'> </span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'>I think I would be tarred and feathered if

I didn’t chime in.  We deal with near real-time ocean observations,

but it is the remote sensing and model forecast data products that seem to

align most closely with this thread.  I have to admit that I only skimmed the

emails, and I’m not sure whether the data you’re attempting to store is

to be used for visualization or data mining, but maybe our application of

PostGIS can apply to both.</span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'> </span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'>Let’s take the model forecast

product as an example.  We have at least two hurdles to overcome: 

our model data is about 5 days’ worth of water level, currents, winds, and air

pressure for the SE US Atlantic.  It is refreshed daily and comes to over 10 million

points of data.  It all starts out as netCDF, but leaving it in that form

to produce images, animations, and time series would be a killer.  We’re

entering year 3 of our efforts, and I think we’ve got a reasonable solution

in place.</span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'> </span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'>Since our model data is hourly, I break

the data into one table per hour and index each table by space.  This

works well from an interface point of view since a user can only look at a

one-hour snapshot of all our data at a time.  It’s a little more of a problem to

look at something like time series since that requires me to JOIN the tables to

produce a linearly aggregated product.  But it’s still

efficient.  At least much more efficient than my first try at keeping all

the data in one table, indexing it by space and time, and then CLUSTER-ing

it.  That was a mess and didn’t help much in the end.</span></font></p>
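
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>As a rough sketch of the table-per-hour
layout described above, the following Python builds the SQL for one hourly
snapshot table with a spatial index, plus a time-series query across several
hourly tables.  The table naming scheme, the column names, the SRID (4326) and
the search radius are all hypothetical, and the SQL uses present-day PostGIS
function names rather than whatever the actual setup uses.  The sketch also
stitches the hourly tables together with UNION ALL, which is an assumption
about how they might be combined, not necessarily the JOIN mentioned above.</span></font></p>

<pre>
```python
from datetime import datetime, timedelta

SRID = 4326  # assumed: lon/lat WGS 84; the real SRID is not given in the message


def table_name(product, hour):
    """Name of the snapshot table for one product and one hour (hypothetical scheme)."""
    return f"{product}_{hour:%Y%m%d_%H}"


def create_statements(product, hour):
    """SQL to create one hourly snapshot table and its spatial (GiST) index."""
    name = table_name(product, hour)
    return [
        f"CREATE TABLE {name} ("
        f"obs_time timestamp NOT NULL, value double precision, "
        f"geom geometry(Point, {SRID}));",
        f"CREATE INDEX {name}_geom_idx ON {name} USING GIST (geom);",
    ]


def time_series_query(product, start, hours, lon, lat, radius_deg):
    """One SELECT per hourly table near (lon, lat), combined with UNION ALL."""
    point = f"ST_SetSRID(ST_MakePoint({lon}, {lat}), {SRID})"
    selects = []
    for i in range(hours):
        name = table_name(product, start + timedelta(hours=i))
        selects.append(
            f"SELECT obs_time, value FROM {name} "
            f"WHERE ST_DWithin(geom, {point}, {radius_deg})"
        )
    # The trailing ORDER BY applies to the combined result of the whole union.
    return "\nUNION ALL\n".join(selects) + "\nORDER BY obs_time;"


if __name__ == "__main__":
    first_hour = datetime(2004, 11, 15, 0)
    for statement in create_statements("model_water_level", first_hour):
        print(statement)
    print(time_series_query("model_water_level", first_hour, 3, -79.0, 32.5, 0.05))
```
</pre>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>Each hourly table only needs a spatial
index, since the time dimension is already encoded in the table name; a time
series then touches one small indexed table per hour.</span></font></p>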

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'> </span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'>In addition to breaking out the products

by time, I create different aggregations of the data by granularity. 

Keeping in mind that we are primarily driven by producing snappy visualizations,

I decided that I’d break up the data further into 5 different zoom

levels, i.e. levels of granularity.  If the user is looking at the entire SE US, there is no need to have the source data finely granular – simply hit the

tables that contain data appropriate for that extent.</span></font></p>
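
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>To illustrate the zoom-level idea, here is
a minimal sketch of picking which table to hit from the width of the requested
map extent.  The five levels come from the description above; the breakpoints,
the level numbering (1 = coarsest, 5 = finest) and the table naming are
assumptions made up for the example.</span></font></p>

<pre>
```python
import bisect

# Hypothetical breakpoints: map widths (degrees of longitude) at which the
# viewer steps to the next coarser level. Five levels, as in the message;
# level 1 is assumed to be the coarsest and level 5 the finest.
WIDTH_BREAKS_DEG = [2.0, 5.0, 10.0, 20.0]
FINEST_LEVEL = len(WIDTH_BREAKS_DEG) + 1


def zoom_level(min_lon, max_lon):
    """Return the zoom level (1 = coarsest, 5 = finest) for a map extent."""
    width = abs(max_lon - min_lon)
    # bisect_right counts how many breakpoints the width has reached.
    return FINEST_LEVEL - bisect.bisect_right(WIDTH_BREAKS_DEG, width)


def table_for(product, hour_label, min_lon, max_lon):
    """Name of the table to query for this product, hour, and extent."""
    return f"{product}_{hour_label}_z{zoom_level(min_lon, max_lon)}"


if __name__ == "__main__":
    # Whole SE US view: coarsest table. Small coastal window: finest table.
    print(table_for("quikscat_wind", "20041115_12", -90.0, -65.0))
    print(table_for("quikscat_wind", "20041115_12", -80.0, -79.5))
```
</pre>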

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'> </span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'>To recap:  We have to consider the

fact that our data is updated near real-time, so index creation has to be consistent

and relatively painless.  I’ve established a round-robining scheme

such that while machine A is being updated, machine B (which has a copy of

machine A’s data) accepts DB queries.  Then when machine A is done,

queries are returned to A’s DB, and then B’s DB is updated. 

The tables are created as hourly snapshots of the data.  And then they are

broken down into 5 zoom levels (at least the remotely sensed data follows this

pattern).</span></font></p>
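
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>The round-robin update can be sketched
roughly as below.  The host names and the two callbacks (one that loads new data
into a host, one that repoints incoming queries) are hypothetical stand-ins; the
only thing taken from the description is the order of the four steps.</span></font></p>

<pre>
```python
def refresh_cycle(load, route_queries_to, host_a="db-a", host_b="db-b"):
    """Run one update cycle; the host being loaded is never the one serving queries."""
    route_queries_to(host_b)  # B holds a copy of A's current data
    load(host_a)              # A is refreshed while B answers queries
    route_queries_to(host_a)  # A now has the new data and takes over again
    load(host_b)              # B catches up, ready for the next cycle


def dry_run():
    """Record the order of operations without touching any real database."""
    log = []
    refresh_cycle(
        load=lambda host: log.append(("load", host)),
        route_queries_to=lambda host: log.append(("route", host)),
    )
    return log


if __name__ == "__main__":
    for step in dry_run():
        print(step)
```
</pre>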

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'> </span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'>To get a taste of what I’m talking

about, look here:</span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'><a

href="http://nautilus.baruch.sc.edu/seacoos_misc/show_sea_coos_obs_time_ranges.php">http://nautilus.baruch.sc.edu/seacoos_misc/show_sea_coos_obs_time_ranges.php</a>.</span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'> </span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'>The MODEL data layers (last two on that

page) and the QuikSCAT wind follow the pattern I describe above.</span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'> </span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'>Charlton</span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'> </span></font></p>


<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'>Charlton Purvis</span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'>(803) 777-8858 : voice</span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'>(803) 777-3935 : fax</span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'>cpurvis@sc.edu</span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'> </span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'>Baruch Institute</span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

  10.0pt;font-family:Arial;color:navy'>University</span></font><font size=2

 color=navy face=Arial><span style='font-size:10.0pt;font-family:Arial;

 color:navy'> of South Carolina</span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

  10.0pt;font-family:Arial;color:navy'>Columbia</span></font><font size=2

 color=navy face=Arial><span style='font-size:10.0pt;font-family:Arial;

 color:navy'>, SC 29208</span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'> </span></font></p>

<div style='border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt'>

<div>

<div class=MsoNormal align=center style='text-align:center'><font size=3

face="Times New Roman"><span style='font-size:12.0pt'>

<hr size=2 width="100%" align=center tabindex=-1>

</span></font></div>

<p class=MsoNormal><b><font size=2 face=Tahoma><span style='font-size:10.0pt;

font-family:Tahoma;font-weight:bold'>From:</span></font></b><font size=2

face=Tahoma><span style='font-size:10.0pt;font-family:Tahoma'> Yves Moisan

[mailto:ymoisan@groupesm.com] <br>

<b><span style='font-weight:bold'>Sent:</span></b> Monday, November 15, 2004

8:43 AM<br>

<b><span style='font-weight:bold'>To:</span></b>

postgis-users@postgis.refractions.net<br>

<b><span style='font-weight:bold'>Subject:</span></b> [postgis-users] Re:

Massive Lidar Dataset Datatype Suggestions?</span></font></p>

</div>

<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:

12.0pt'> </span></font></p>

<div>

<div>

<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:

12.0pt'> Hi, <br>

<br>

I am also still pondering how the heck I will be storing potentially large

amounts of water quality [point] data.  Integrating on space as Paul

suggests is interesting, but other integration schemes could be useful, one

being integration of the data "by object" (e.g. sensor, station ...).

<br>

<br>

In the example I am thinking of, a bunch of point data could be boxed both by

time and sensor in the form of a single netCDF file (integration on

object=sensor) for an arbitrary time bin (e.g. a day, a week ...). <br>

<br>

I am still very hesitant as to which path is best.  Wouldn't a netCDF file

also allow me to store all the relevant metadata, which I could make sure meets

some standard (e.g. FGDC-CSDGM), instead of potentially having to put that

metadata in PostgreSQL or an XML database?  Would the spatial querying

machinery be efficient if the data were stored in netCDF files, e.g. could I

still use just the coordinates of my data points in PostGIS with a 3rd field

being some sort of pointer to a BLOB in the form of a netCDF file?  I

think if it is just for spatial queries, such a setup would be fine.  But

what if I wanted to further parametrize my queries by some attribute data (e.g.

give me all point measurements &lt; valueOrParameter=A &gt; valueOrParameter=B)?

I guess, depending on volume, netCDF files could be opened from within

PostgreSQL without it being too heavy an operation? <br>

<br>

Your problem is one of sheer data volume and calls for some integration

mechanism, but one doesn't have to have a data volume problem to

realize that data integration is, in my opinion, a much more general problem

for all of us. <br>

<br>

Let us know what solution you choose.  I, too, am very much interested. <br>

<br>

Yves Moisan <br>

<br>

Gerry Creager N5JXS wrote: <br>

<br>

</span></font></p>

<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;

font-family:Arial'>Hmmm... Can we start thinking in terms of a NetCDF data

structure? </span></font></p>

</div>

</div>

</div>

</div>

</body>

</html>