[GRASS-dev] what if: Anything series?

Markus Neteler neteler at osgeo.org
Sat Jan 12 15:07:53 EST 2008


Hi Ivan,

thanks for your impressive suggestions! I have just added a tiny
contribution from Soeren Gebbert, with a few bugfixes from me, to
GRASS Addons-SVN:

r.rast4d + tg.* - raster time series SQL support

This is a scripted approach to register time series of raster maps
into a SQLite database. Comes in handy when dealing with thousands
of MODIS maps for example.
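
To give a rough idea of what "registering" a time series means here, a
minimal sketch in Python follows. The table name `raster_series`, its
columns, and the map names are made up for illustration; the actual
tg.* scripts may well organise the database differently.

```python
import sqlite3


def register_maps(conn, maps):
    """Store (map name, start time) pairs in a hypothetical table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raster_series ("
        " name TEXT PRIMARY KEY, start_time TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO raster_series (name, start_time)"
        " VALUES (?, ?)",
        maps,
    )
    conn.commit()


def maps_between(conn, start, end):
    """Return map names with start <= start_time < end, in time order."""
    rows = conn.execute(
        "SELECT name FROM raster_series"
        " WHERE start_time >= ? AND start_time < ?"
        " ORDER BY start_time",
        (start, end),
    )
    return [row[0] for row in rows]


# ISO 8601 strings sort chronologically, so plain text comparison works.
conn = sqlite3.connect(":memory:")
register_maps(conn, [
    ("modis_lst_2007_001", "2007-01-01T00:00:00"),
    ("modis_lst_2007_009", "2007-01-09T00:00:00"),
    ("modis_lst_2007_017", "2007-01-17T00:00:00"),
])
selected = maps_between(conn, "2007-01-05T00:00:00", "2007-01-20T00:00:00")
```

With thousands of maps, the point is that the selection is done by the
database rather than by pattern matching on map names.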

Markus

On Jan 11, 2008 9:02 PM, Ivan Shmakov <ivan at theory.asu.ru> wrote:
>                         It seems impossible to doubt that everything in
>                         the universe can be represented by numbers [...]
>                                 -- N. I. Lobachevsky
>
>         Reading the ``Time series in GRASS'' page [1], as well as [2, 3],
>         made me wonder: is time the only parameter along which one may
>         need to lay out the data sets?  Arguably, it's not.
>
>         Consider, for example, one having to compare the behaviour of
>         MM5 modelling results with different models or parameters used.
>         There, related rasters are laid along the model index or model
>         parameter's value.
>
>         Another example is the rasters comprising the values of a
>         meteorological variable for certain (often non-uniformly spaced)
>         values of pressure.  These sets of raster data sets shouldn't be
>         turned into 3D rasters, since the pressure to height
>         correspondence varies over space and time.
>
>         The above makes me believe that the generic facility to keep the
>         relations between the rasters is necessary.  Besides,
>         implementing this facility allows for several other problems to
>         be addressed within its framework, as I'd try to show below.
>
> * Several related rasters: a rasterset?
>
>         Both of the examples above suggested using numeric values to
>         represent the relationship between the rasters.  These values
>         can include:
>
>         * timestamp (in seconds since epoch), allowing for time series
>           [1];
>
>         * layer's pressure;
>
>         * model index or model parameter;
>
>         * were quality flags applied to the raster (1) or not (0)?
>
>         Let me define rasterset as a named collection of related
>         rasters, each unambiguously identified by an arbitrary number of
>         the arbitrary numeric values.
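
The definition above could be sketched as a small data structure. This
is only an illustration under my own assumptions: the class name
`RasterSet`, its methods, and the string placeholders standing in for
real rasters are hypothetical, not an existing GRASS API.

```python
from datetime import datetime, timezone


class RasterSet:
    """Named collection of rasters, each identified by numeric parameters."""

    def __init__(self, name, parameter_names):
        self.name = name
        self.parameter_names = tuple(parameter_names)
        self._rasters = {}

    def _key(self, params):
        # Every raster must give a value for every identifying parameter.
        if set(params) != set(self.parameter_names):
            raise ValueError(f"expected parameters {self.parameter_names}")
        return tuple(float(params[n]) for n in self.parameter_names)

    def add(self, raster, **params):
        key = self._key(params)
        if key in self._rasters:
            raise KeyError(f"parameters {params} already identify a raster")
        self._rasters[key] = raster

    def get(self, **params):
        return self._rasters[self._key(params)]


# Timestamp in seconds since the epoch, as in the list above.
ts = datetime(2007, 5, 31, 21, 35, 24, tzinfo=timezone.utc).timestamp()

ozone = RasterSet("airs-total-ozone", ["timestamp", "qaflags_p"])
ozone.add("raster-A", timestamp=ts, qaflags_p=0)  # no quality flags applied
ozone.add("raster-B", timestamp=ts, qaflags_p=1)  # quality flags applied
```

Rejecting a second raster with the same parameter values is what keeps
the identification unambiguous.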
>
>         Below, I assume using 2D rasters at the lower level of the
>         rasterset implementation, since 3D rasters could easily be
>         simulated by a rasterset with a `z' as the parameter.
>
> * Tiled raster storage
>
>         The simplest case of using the rasterset facility is to
>         implement tiled raster storage [3].
>
>         Indeed, a tiled raster could be implemented with each tile
>         becoming a raster within a single rasterset, and then being
>         assigned a pair of numeric parameters -- the indices of the
>         tile.
>
>         Since the spatial resolution of the tile may differ (the rasters
>         comprising the dataset are almost as independent as the
>         individual rasters in GRASS currently), this allows for both the
>         whole-NULL tiles (no raster for these tile indices), and for the
>         same-value tiles (1x1 raster covering the whole region.)
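
A cell lookup over such tiles might look like the sketch below. The
function name, the dictionary layout, and the fixed square tile size are
assumptions made for the example only.

```python
def cell_value(tiles, tile_size, row, col):
    """Look up one cell in a tiled raster.

    `tiles` maps (tile_row, tile_col) to a 2D list of values.  A missing
    entry is a whole-NULL tile; a 1x1 entry is a same-value tile.
    """
    tile = tiles.get((row // tile_size, col // tile_size))
    if tile is None:
        return None  # whole-NULL tile: nothing stored for these indices
    if len(tile) == 1 and len(tile[0]) == 1:
        return tile[0][0]  # same-value tile: one cell covers the tile
    return tile[row % tile_size][col % tile_size]


tiles = {
    (0, 0): [[1, 2], [3, 4]],  # fully stored 2x2 tile
    (0, 1): [[7]],             # same-value tile
    # (1, 0) and (1, 1) are absent, i.e. whole-NULL
}

stored = cell_value(tiles, 2, 1, 0)
constant = cell_value(tiles, 2, 0, 3)
missing = cell_value(tiles, 2, 2, 2)
```

Neither special case needs a flag of its own: the absence of a tile and
the resolution of a tile carry the information.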
>
>         Since this feature is expected to be used quite commonly, I
>         believe it needs to be implemented at the ``core'' of the
>         rasterset implementation, with the appropriate optimizations
>         applied for some common cases.
>
> * Metadata
>
>         Since the rasters comprising the rasterset are allowed to carry
>         an arbitrary number of additional numeric parameters, this
>         facility could assume handling of certain (though not arbitrary)
>         metadata, even in cases where these additional parameters aren't
>         strictly necessary for the identification purposes.
>
>         However, with each raster being assigned a category, it's
>         possible to associate arbitrary information with it using a
>         database connected to the rasterset.
>
> * Color maps
>
>         Color maps are currently tied rather closely to the rasters they
>         are used for, making it hardly practical to use different color
>         maps for the same rasters.  This feature could be used, for
>         example, to apply different color maps when displaying the data
>         and producing the printed output.
>
>         Were the color maps detached from the rasters, it may become
>         feasible to allow for a color map to be shared among several
>         rasters.
>
>         I've already mentioned a raster's parameter as a possible
>         substitute for `z' (both for simulating `z' for ordinary 3D
>         rasters, and for storing layers of data for which layer index to
>         `z' mapping varies over space and time.)  Moreover, for digital
>         elevation models `z' value is actually the value stored in
>         raster.  It may be worth investigating whether this relation
>         could be turned inside out, to allow for arbitrary value to
>         arbitrary value mappings to be stored as 2D (or 1D) rasters
>         within a rasterset.
>
>         There may be demand for storing quite arbitrary arrays in the
>         future as well.
>
> * Scanning radiometers & Time
>
>         Due to the curvature of the Earth surface, a satellite scanning
>         radiometer such as MODIS sees certain places on Earth multiple
>         times in a short period of time (about 1.48 s for MODIS.)
>
>         These places appear on consecutive scans in L2 data.  The most
>         common practices to deal with this effect are either to average
>         the values obtained for the same place, or to take the one value
>         that is, by some criterion, superior to the other.
>
>         However, allowing for the scans to be stored independently along
>         with a ``time'' value associated with each would allow one to
>         analyze these very short-term changes (if any.)
>
> * RDBMS as the backend
>
>         Probably the most appealing feature of the rasterset model is
>         its supposed flexibility.  As mentioned above, the color maps
>         could be represented as the rasters in their very own coordinate
>         space, and so could be the ground control points (*).
>
>         With the number of separate data structures to form a raster
>         reduced, it could become feasible to put these structures into a
>         general purpose RDBMS system, thus partially addressing both the
>         disk space and the large number of files in a directory concerns
>         [4].
>
>         (*) It's very common for the satellite Level 2 data to specify
>         the latitudes and longitudes for the centres of the pixels as
>         the separate rasters.  These could be mapped directly to the
>         specific rasters within the rasterset.  See [5] for a related
>         feature in GDAL.
>
> * Views
>
>         The names aren't convenient for rasters.  For example, I have a
>         location full of rasters with the names like:
>
> 2007-05-31-grans-std-qual-o3
> 2007-05-31-grans-std-toto3std
> 2007-05-31-grans-std-toto3std.qa
> 2007-05-31-grans-std-toto3stderr
> ...
>
>         The total number of the 2D data sets for each day is over 70,
>         most of which come in both the ``no quality flags applied'' form
>         (without the `.qa' suffix) and the ``standard quality flags
>         applied'' one (with one.)  And the source data do include even
>         more data sets.
>
>         In order to handle this amount of data efficiently the system
>         should allow one to limit the namespace to the data sets
>         matching arbitrary criteria.  I don't consider the GUI
>         specifically, since it may become rather tedious to filter the
>         g.mlist(1) output with grep(1) in scripts as well.
>
>         The rasterset model seems to be a more appropriate solution.
>         And, as suggested in the next section, there could be a way to
>         name a specific raster within the rasterset.  With this
>         functionality available from scripts, one could easily apply
>         arbitrary schemes for naming the data sets.
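
To illustrate the difference between filtering names and filtering
parameters, here is a sketch that recovers parameters from names like
the ones listed above. The pattern is only my guess at the naming scheme
from the four examples shown; the field names `date`, `variable` and
`qa` are invented for the example.

```python
import re

# Assumed layout: <date>-grans-std-<variable>[.qa]
NAME_RE = re.compile(r"^(\d{4}-\d{2}-\d{2})-grans-std-(.+?)(\.qa)?$")


def parse_name(name):
    """Split a raster name into the parameters it encodes."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognised raster name: {name}")
    date, variable, qa_suffix = match.groups()
    return {"date": date, "variable": variable, "qa": qa_suffix is not None}


names = [
    "2007-05-31-grans-std-qual-o3",
    "2007-05-31-grans-std-toto3std",
    "2007-05-31-grans-std-toto3std.qa",
    "2007-05-31-grans-std-toto3stderr",
]

# Select by parameter rather than by a grep pattern on the name.
with_qa = [n for n in names if parse_name(n)["qa"]]
total_ozone = [n for n in names if parse_name(n)["variable"] == "toto3std"]
```

With a rasterset the parameters would be stored directly, so no such
parsing would be needed in the first place.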
>
> * User interface
>
>         With the rastersets being implemented, the GRASS database starts
>         to look much more like a relational one.  Since the individual
>         rasters are no longer named individually (rather, they share the
>         common rasterset name and are identified by the associated
>         values of the arbitrary parameters), to access a specific raster
>         one would need to issue a query.  (Much like accessing a table's
>         row with SQL queries.)
>
>         Certainly, to expose the very exciting new features the
>         rasterset model could offer, the UI (both the command line and
>         the graphical parts) would require a major overhaul.  However,
>         for compatibility's sake, it's reasonable to implement the
>         current raster accessing interface on top of the rasterset
>         facility, thus allowing for the existing code (and therefore
>         interface) to be retained.
>
>         Then, there would have to be a mapping of the compatibility
>         raster names to the (rastername, parameters) pairs, and the
>         corresponding utilities to manage it, both in the library API
>         and the UI, like:
>
> GRASS> r.bind \
>            raster=compat-airs-2007-05-31-total-ozone.qa \
>            rasterset=airs-total-ozone \
>            parameter="timestamp=2007-05-31 21:35:24 +0000" \
>            parameter="qaflags_p=true"
>
>         Parameters not specified are to be allowed to match all, but not
>         any, of the rasters.  Thus, it won't be needed to specify the
>         individual tile indices for the tiled rasters to mean the whole
>         spatial extent of the rasterset.  If several rasters match the
>         specification, but do not complement each other spatially, an
>         error is signalled, like:
>
> GRASS> r.bind \
>            raster=dummy \
>            rasterset=airs-total-ozone \
>            parameter="qaflags_p=true"
> r.bind: several rasters match the specification
> GRASS>
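
The matching rule described above could be sketched as follows. Note
that r.bind is a proposed command, not an existing one; the `resolve`
function, the dictionary layout of a raster, and the use of bounding
boxes to decide whether rasters "complement each other spatially" are
all assumptions made for this example.

```python
from itertools import combinations


def overlaps(a, b):
    """True if two (west, south, east, north) extents share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def resolve(rasters, **spec):
    """Return the rasters matching spec; unspecified parameters match all."""
    found = [r for r in rasters
             if all(r["params"].get(k) == v for k, v in spec.items())]
    if not found:
        raise LookupError("no raster matches the specification")
    # Several matches are fine only if they tile the space without overlap.
    for a, b in combinations(found, 2):
        if overlaps(a["extent"], b["extent"]):
            raise LookupError("several rasters match the specification")
    return found


rasters = [
    {"name": "t1-west", "extent": (0, 0, 10, 10),
     "params": {"timestamp": 1, "qaflags_p": 1, "tile_x": 0}},
    {"name": "t1-east", "extent": (10, 0, 20, 10),
     "params": {"timestamp": 1, "qaflags_p": 1, "tile_x": 1}},
    {"name": "t2-west", "extent": (0, 0, 10, 10),
     "params": {"timestamp": 2, "qaflags_p": 1, "tile_x": 0}},
]

# Tile index left unspecified: both tiles of timestamp 1 are returned.
whole = [r["name"] for r in resolve(rasters, timestamp=1, qaflags_p=1)]
```

Leaving out the timestamp as well makes "t1-west" and "t2-west" both
match while covering the same area, which is the error case shown in
the second r.bind call.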
>
>         Importing utilities (r.in.gdal, or r.import) would need to be
>         changed early to allow for both the rasterset name and the
>         identifying parameters to be specified.  The other modules could
>         be changed as the time permits.
>
>         The rasters imported may be automatically named according to an
>         arbitrary user-specified scheme with the ``hooks'' facility
>         being implemented in GRASS.  (I hope to present my ideas
>         regarding such a facility in a separate posting.)
>
> * Notes for the implementor
>
>         The model described above could be based on the current 2D
>         rasters implementation after cleaning it of the extra features
>         to be provided by the rasterset model itself.
>
>         Within the model, the 2D rasters facility is the lower level,
>         and its interface would need to be changed.  For
>         compatibility's sake, the former interface would need to be
>         provided by the code layered on top of the rasterset
>         implementation.
>
>         The rasterset model is to be implemented mostly from scratch.
>
> [1] http://grass.gdf-hannover.de/wiki/Time_series_in_GRASS
> [2] http://grass.gdf-hannover.de/wiki/GRASS_7_ideas_collection
> [3] http://grass.gdf-hannover.de/wiki/Replacement_raster_format
> [4] http://freegis.org/cgi-bin/viewcvs.cgi/grass/gips/gip-0002.txt?rev=HEAD&content-type=text/vnd.viewcvs-markup
> [5] http://trac.osgeo.org/gdal/wiki/rfc4_geolocate



-- 
Open Source Geospatial Foundation
http://www.osgeo.org/
http://www.grassbook.org/

