[GRASS-user] Organizing spatial (time series) data for mixed GIS environments

Blumentrath, Stefan Stefan.Blumentrath at nina.no
Tue Dec 3 11:59:02 PST 2013


Dear all,

On our Ubuntu server we are about to reorganize our GIS data in order to develop a more efficient and consistent solution for data storage in a mixed GIS environment.
By "mixed GIS environment" I mean that we have people working with GRASS, QGIS and PostGIS, but also many people using R, and perhaps the largest fraction using ESRI products. Furthermore, we have people using ENVI, ERDAS and some other packages. Only a few people (like me) actually work directly on the server...
Until now I have stored "my" data mainly in GRASS (6/7) native format, which I was very happy with. But I guess our ESRI and PostGIS people would not accept that as a standard...

However, especially for time series data we cannot have several copies in different formats (tailor-made for each and every software).

So I started thinking: what would be the most efficient and convenient solution for storing a large amount of data (e.g. high-resolution raster and vector data with national extent plus time series data) in a way that it is accessible to all (or at least most) remote users (with different GIS software)? As I am very fond of the temporal framework in GRASS 7, it would be a precondition that I can use these tools on the data without unreasonable performance loss. Another precondition would be that users at remote computers in our (MS Windows) network can access the data.

In general, four options come to mind:

a) Stick to GRASS native format and keep one copy in another format

b) Use the native formats the data come in (e.g. temperature and precipitation come in zipped ASCII grid format)

c) Use PostGIS as a backend for data storage (raster / vector), linked via r.external.* / v.external.*

d) Use another GDAL/OGR format for data storage (raster / vector), linked via r.external.* / v.external.*
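
To make options c) and d) a bit more concrete, here is a minimal sketch of how links to externally stored rasters might be prepared with r.external. It only builds the argument lists and does not run GRASS. The file paths and the helper names (r_external_args, link_commands) are hypothetical, and the map-name clean-up is just a precaution, since dots and dashes in map names can cause trouble elsewhere (e.g. in r.mapcalc expressions).

```python
from pathlib import PurePosixPath


def r_external_args(raster_path):
    """Build the argument list for one r.external call (nothing is executed)."""
    path = PurePosixPath(raster_path)
    # Precaution: replace dots and dashes in the file stem to get a safe map name.
    map_name = path.stem.replace(".", "_").replace("-", "_")
    return ["r.external", f"input={path}", f"output={map_name}"]


def link_commands(raster_paths):
    """Return one r.external argument list per raster file, in sorted order."""
    return [r_external_args(p) for p in sorted(raster_paths)]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    example_files = [
        "/data/climate/temp_2013-02.tif",
        "/data/climate/temp_2013-01.tif",
    ]
    for args in link_commands(example_files):
        print(" ".join(args))
```

Inside a running GRASS session, each list could presumably be handed to subprocess.run(), or the same input/output parameters could be passed to grass.script.run_command(). Whether the linked maps perform acceptably is exactly the open question here.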

My questions are:
What solutions could you recommend or what solution did you choose?
Who has experience with this kind of data management challenge?
How do externally linked data series perform compared to GRASS native?

I searched the mailing list a bit and found this thread (http://osgeo-org.1560.x6.nabble.com/GRASS7-temporal-GIS-database-questions-td5054920.html), where Sören recommended "postgresql as temporal database backend". However, I am not sure whether that was meant only for the temporal metadata and not the rasters themselves...
Furthermore, in the idea collection for the temporal framework (http://grasswiki.osgeo.org/wiki/Time_series_development, "Open issues" section), limitations were mentioned regarding the number of files in a folder, which could be a problem for file-based storage. The ext2 file system had a "'soft' upper limit of about 10-15k files in a single directory", but theoretically many more were possible. Other file systems may allow for more, I guess... Will using such big directories (> 10,000 files) lead to performance problems?
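
One way to get a first impression of the directory-size question on a given file system is simply to measure it. The following is a rough sketch, not a proper benchmark: it creates empty placeholder files in a temporary directory and times a single listing. The function name and the file-name pattern are made up for illustration. It says nothing about access over the Windows network or about how GRASS itself behaves.

```python
import os
import tempfile
import time


def time_directory_listing(n_files):
    """Create n_files empty files in a temporary directory and time one listing."""
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(n_files):
            # Empty placeholder files; real rasters would be far larger.
            open(os.path.join(tmp, f"map_{i:06d}.tif"), "w").close()
        start = time.perf_counter()
        with os.scandir(tmp) as entries:
            count = sum(1 for _ in entries)
        elapsed = time.perf_counter() - start
    return count, elapsed


if __name__ == "__main__":
    for n in (1_000, 10_000, 20_000):
        count, elapsed = time_directory_listing(n)
        print(f"{count} files listed in {elapsed:.4f} s")
```

The temporary directory lives wherever the system puts it, so to test a particular disk one would have to pass a dir= argument to tempfile.TemporaryDirectory() pointing at that volume.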

The "Working with external data in GRASS 7" wiki entry (http://grasswiki.osgeo.org/wiki/Working_with_external_data_in_GRASS_7) covers the technical part (and to some degree performance issues) very well. Would it be worth adding a section on the strategic considerations / pros and cons of using external data? Or is that too user- and format-dependent?

Thanks for any feedback or thoughts on this topic...

Cheers
Stefan


