[postgis-users] data base structure for holding climate data

Brent Wood pcreso at pcreso.com
Mon Oct 2 12:22:37 PDT 2006



--- "Kirk R. Wythers" <kwythers at umn.edu> wrote:

> I am curious as to the consensus approach to handling dates in  
> postgis? I have gathered up a fairly large collection of climate  
> data. The date fields however, came parsed in the form  
> (year,month,day). I want to set up a postgres database to hold this  
> data with a location structure along the lines of:


Hi Kirk, 

Nice to see you're really moving with this stuff :-)

We use Oracle, but comparable structures using the native time datatype. We are
also using Informix, which is uniquely suited to such data as it not olnly has
a spatial datablade (similar to PostGIS) but has a timeseries datablade which
does for timeseries data pretty much what PostGIS does for spatial data.

However, for PostGIS, I suggest you look at a timestamp for identifying just
when a record was captured. For queries & aggregating (eg: to extract avg
daily/weekly/monthly temps) it will be useful to de-normalise & also store
separately indexed componenets of the timestamp, such as date, week of year,
month of year etc...

One other aspect will depend on your data. We have a PostGIS table with some
200,000,000 instrument readings, using a clustered index on timestamp, and
comprising yearly partitions. The raw data stores the actual time of data
capture, but the various instruments are not syncronised, so it is not possible
to extract a snapshot of them all at a common time.

We post process the data to generate a table (the 200,000,000 record one) which

has a record for every 1 min interval, with each reading where the reading is
the latest one within that 1 minute interval.

Some what impressively, a self-relation across this table with 25 readings,
retrieving 3 months data at 10 min intervals only takes 25 seconds on a desktop
PC.

To facilitate extarction at different intervals, each record also has an
attribute of 1, 2, 5, 10, 30, 60, 120, 1440 (based on minutes) so we can get
two-hourly values by using "where time_flag=120")


Hope this helps....

   Brent

> mn_climate=# \d sites
>                  Table "public.sites"
>       Column     |         Type          | Modifiers
> ----------------+-----------------------+-----------
> site_id        | integer               |
> site_shortname | character varying(8)  |
> site_name      | character varying(50) |
> lat            | numeric               |
> lon            | numeric               |
> utm_e          | numeric               |
> utm_n          | numeric               |
> Indexes:
>      "sites_site_id_key" UNIQUE, btree (site_id)
> 
> 
> And tie this "site" table to a second "clim" table that holds the  
> actual meteorological data. However, I am inclined to concatenate the  
> year, month, day data into a single "date" data type of the form year- 
> month-day-hour-minute-second (yyyy-mm-dd-hh-mm-ss). Does this make  
> sense to you all? If anyone can re-direct me before I do something  
> dumb in way I set this up, I would appreciate it.
> 
> Thanks,
> 
> Kirk
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
> 




More information about the postgis-users mailing list