[gdal-dev] Time in GDAL

Edzer Pebesma edzer.pebesma at uni-muenster.de
Sun Oct 15 09:13:33 PDT 2017


Hi,

last week the openEO project [1] started, in which we aim to develop an
interface (API) to cloud-based processing of Earth observation (EO)
imagery. For this project, GDAL has been the blueprint in terms of
successfully integrating the diverse landscape of file formats [2]. The
project allocated a budget for subcontracting some GDAL development.

During discussions, it became clear that for practically everyone
involved, GDAL plays an important role, be it for ingesting images, or
for processing them on the fly. Also, a key requirement in all cases
seems to be to do something useful with time series of EO images. Here,
the current inability of GDAL to report the time associated with
datasets, subdatasets and/or bands was noted as an area with high
potential for extending GDAL. Currently, parsing this from the metadata
strings is possible but messy, ad hoc and error prone, and it has to be
done by each client for every driver.

My proposal is to augment the GDAL interface to raster data with two
methods, GetStartTime() and GetEndTime(), which operate (at least) on
bands and report the start and end times of the data acquisition using a
simple interface, e.g. similar to what poFeature->GetFieldAsDateTime()
[3] does (returning IIRC either GMT or local time zone). Drivers should
fill these fields (which might be equal), or else a flag should indicate
that the time is missing.
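
To make the idea a bit more concrete, here is a minimal sketch in Python
of what a client might see. None of this exists in GDAL: BandTime,
get_start_time() and get_end_time() are hypothetical names made up for
illustration, and a return value of None stands in for the "time is
missing" flag.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class BandTime:
    """Hypothetical per-band time record that a driver would fill in."""
    start: Optional[datetime] = None  # None: driver could not determine it
    end: Optional[datetime] = None


def get_start_time(band_time: BandTime) -> Optional[datetime]:
    """Hypothetical counterpart of the proposed GetStartTime()."""
    return band_time.start


def get_end_time(band_time: BandTime) -> Optional[datetime]:
    """Hypothetical counterpart of the proposed GetEndTime()."""
    return band_time.end


# A snapshot product: start and end time are equal.
acquired = datetime(2017, 3, 14, 10, 40, 11, 26000, tzinfo=timezone.utc)
snapshot = BandTime(start=acquired, end=acquired)

# A band for which the driver found no usable time information.
unknown = BandTime()
```

A client would then only have to check for None instead of parsing
driver-specific metadata strings.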

## Start and end time

Although most snapshot-based products may have identical start and end
times, many others don't; a lot of derived (e.g. climate) products give
daily or monthly averages, where start and end time differ.

## Multiple time stamps

Datasets might have many time stamps, e.g. some related to the
processing steps that a dataset has undergone. Other datasets, e.g.
forecast data, may have two relevant times (two-dimensional time): the
time at which the forecast was made (t0), and the forecast times the
band refers to (e.g. t0+6h, t0+12h, t0+18h, t0+24h etc.). For both cases,
I believe there is a "default time": the time of observation or
prediction a band refers to. Access to other time aspects could be
obtained by tags as in GetStartTime(..., reference =
"TIME_OF_REPROCESSING"), which might be driver dependent.
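
As a rough sketch of the forecast case (Python, not GDAL code): each
band carries a default time plus optional named times. The date used for
t0 and the tags "DEFAULT" and "TIME_OF_FORECAST" are assumptions made up
for this example, and get_start_time() is a hypothetical stand-in for
the proposed method.

```python
from datetime import datetime, timedelta, timezone

# Made-up forecast issued at t0, with one band per 6-hour lead time.
t0 = datetime(2017, 10, 15, 0, 0, tzinfo=timezone.utc)
lead_hours = [6, 12, 18, 24]

# One dict of named times per band. "DEFAULT" is the time the band
# refers to; "TIME_OF_FORECAST" is a hypothetical, driver-dependent tag.
band_times = [
    {"DEFAULT": t0 + timedelta(hours=h), "TIME_OF_FORECAST": t0}
    for h in lead_hours
]


def get_start_time(band_index, reference="DEFAULT"):
    """Return the requested time of a band, or None if the tag is unknown."""
    return band_times[band_index].get(reference)
```

Here get_start_time(0) gives t0+6h, get_start_time(0, reference =
"TIME_OF_FORECAST") gives t0, and a tag this data does not provide, such
as "TIME_OF_REPROCESSING", gives None.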

## NetCDF/udunits time

Most file formats will have time strings like 2017-03-14T10:40:11.026Z
that should be pretty straightforward to handle. NetCDF however uses
time encoded in a form understood by udunits2 [4]; for example band
metadata may have

 time#units=days since 1978-01-01 00:00:00
 NETCDF_DIM_time=1339

which refers to 1339 days after 1978-01-01 00:00:00. Since the units can
be set very flexibly, for such data I think one should either:
(i) have GDAL link (optionally) to udunits2 and, if it is present, use
the library to convert to OFTDateTime; or
(ii) not attempt this, but return the time units as a string and the
time as a double, and leave the conversion to the client.
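
For option (ii), the client-side conversion for the example above could
look roughly like the following Python sketch. decode_time() is a
hypothetical helper, not a GDAL or udunits2 function. It assumes the
standard calendar, ignores time zones, and only understands a few
fixed-length units, which is exactly the limitation that linking to
udunits2 would avoid.

```python
import re
from datetime import datetime, timedelta

# Only a handful of fixed-length units; udunits2 accepts far more.
_SECONDS_PER_UNIT = {"seconds": 1, "minutes": 60, "hours": 3600, "days": 86400}

_UNITS_RE = re.compile(
    r"\s*(\w+)\s+since\s+(\d{4}-\d{2}-\d{2})(?:[ T](\d{2}:\d{2}:\d{2}))?\s*"
)


def decode_time(units: str, value: float) -> datetime:
    """Convert '<unit> since <origin>' plus an offset to a naive datetime."""
    m = _UNITS_RE.fullmatch(units)
    if m is None or m.group(1) not in _SECONDS_PER_UNIT:
        raise ValueError(f"unsupported time units: {units!r}")
    # A missing time of day in the origin is taken to be midnight.
    origin = datetime.strptime(
        f"{m.group(2)} {m.group(3) or '00:00:00'}", "%Y-%m-%d %H:%M:%S"
    )
    return origin + timedelta(seconds=value * _SECONDS_PER_UNIT[m.group(1)])


# The band metadata shown above: time#units and NETCDF_DIM_time.
print(decode_time("days since 1978-01-01 00:00:00", 1339))
# 1981-09-01 00:00:00
```

Units without a fixed length, such as "months since ...", are rejected
with a ValueError here rather than guessed at.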


I would like to know:
- whether there is support for this idea in general (time in GDAL),
- whether the approach sketched above makes sense,
- what I've overlooked, and what the potential road blocks are.


[1] http://openeo.org/
[2] http://r-spatial.org/2016/11/29/openeo.html
[3]
http://www.gdal.org/classOGRFeature.html#a6c5d2444407b07e07b79863c42ee7a49
[4] https://www.unidata.ucar.edu/software/udunits/
-- 
Edzer Pebesma
Institute for Geoinformatics  (ifgi),  University of Münster
Heisenbergstraße 2, 48149 Münster, Germany; +49 251 83 33081
Journal of Statistical Software:   http://www.jstatsoft.org/
Computers & Geosciences:   http://elsevier.com/locate/cageo/
