[gdal-dev] How to represent multi-dimensional array

Jason Roberts jason.roberts at duke.edu
Tue Jun 22 10:51:48 EDT 2010


> Jason: are there plans for full multi-dimensional / curvilinear /
> multiband grid support in MGET?  What do you think of the rasdaman
> project - any ties with MGET?

The work we are doing in MGET relates to the problem that there are many interesting gridded datasets with dimensions xyz or xyzt, stored in a variety of formats, but there is no client-side API that can wrap the majority of them with a convenient abstraction. We want an API that can, for example, return 3D and 4D numpy arrays and associated metadata (projection, geolocation, etc.) regardless of whether the underlying data is stored in a 3D/4D OPeNDAP grid, in a 3D/4D netCDF grid, in a directory tree of 2D raster files, or in an FTP archive of compressed 2D HDFs. We want to build our analytic tools atop this (e.g. find sea surface temperature fronts) rather than building a variant of each analytic tool that is tailored to the underlying API used to access the specific format, as we do now.
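
To make the idea concrete, here is a minimal sketch of that kind of abstraction. It is not MGET's actual API: the names Grid, InMemoryGrid, SliceStackGrid, and mean_of_slice are hypothetical, nested lists stand in for numpy arrays, and dimensions are written in (t, z, y, x) order. The point is only that an analytic tool is written once against the interface, and each storage format gets its own small backend.

```python
from abc import ABC, abstractmethod


class Grid(ABC):
    """Hypothetical format-independent view of a gridded dataset."""

    dimensions = "yx"  # one of "yx", "zyx", "tyx", "tzyx"

    @property
    @abstractmethod
    def shape(self):
        """Tuple of lengths, one per character in `dimensions`."""

    @abstractmethod
    def read_slice(self, *outer):
        """Return one 2D slice (a list of rows) for the given outer indices."""


class InMemoryGrid(Grid):
    """Backend holding a nested list, standing in for a 3D/4D variable."""

    def __init__(self, dimensions, data):
        self.dimensions = dimensions
        self._data = data

    @property
    def shape(self):
        shape, level = [], self._data
        for _ in self.dimensions:
            shape.append(len(level))
            level = level[0]
        return tuple(shape)

    def read_slice(self, *outer):
        level = self._data
        for index in outer:
            level = level[index]
        return level


class SliceStackGrid(Grid):
    """Backend for a time series of 2D rasters (e.g. one file per day)."""

    def __init__(self, slices):
        self.dimensions = "tyx"
        self._slices = slices  # a real backend would open files lazily

    @property
    def shape(self):
        first = self._slices[0]
        return (len(self._slices), len(first), len(first[0]))

    def read_slice(self, t):
        return self._slices[t]


def mean_of_slice(grid, *outer):
    """Example analytic tool written once against the Grid interface."""
    rows = grid.read_slice(*outer)
    values = [value for row in rows for value in row]
    return sum(values) / len(values)
```

The same mean_of_slice call then works whether the data arrived as one 3D variable or as a stack of separate 2D rasters.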

GDAL almost provides this, but it is mainly suited to 2D data. There is some support for 3D and 4D data--slicing netCDF or OPeNDAP into 2D band stacks--but we found it did not provide everything we needed (e.g. no awareness of time). 
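
As an illustration of what working with a 2D band stack involves, the sketch below maps a 1-based band number to (t, z) indices and back. The helper names are hypothetical, and the band ordering (z varying fastest within each time step) is an assumption, not something guaranteed here; check it against the named metadata on the bands before relying on it.

```python
def band_to_tz(band, nz):
    """Map a 1-based band number to zero-based (t, z) indices.

    Assumes bands are ordered with z varying fastest within each time step.
    """
    if band < 1 or nz < 1:
        raise ValueError("band and nz must be positive")
    return divmod(band - 1, nz)


def tz_to_band(t, z, nz):
    """Inverse of band_to_tz: zero-based (t, z) to a 1-based band number."""
    if not 0 <= z < nz or t < 0:
        raise ValueError("indices out of range")
    return t * nz + z + 1
```

With nz depth levels, band 1 is the first depth of the first time step and band nz + 1 is the first depth of the second time step. The caller still has to find the actual time and depth values elsewhere, which is the missing awareness mentioned above.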

The common approaches to this problem are to require the user to convert the data into a single format (e.g. netCDF) that the analytic tools can then read (e.g., the CDO tools), or store the data in a server that can present it using a single API (e.g. ERDDAP). We do not like the former approach because these datasets are huge and we'd like to save the user the time and disk space of a huge conversion. We do not like the latter approach because setting up a server is a major task for our typical user, and there is still the disk space problem (and sometimes conversion time).

I'm not familiar with rasdaman. Thanks for mentioning it. After looking briefly, it appears that rasdaman adopts that second approach--store all raster data in a server that offers a single API. Rather than building tools atop rasdaman's API, we would represent data from rasdaman behind our own API, just like we do for OPeNDAP, GDAL, netCDF, etc. I will keep an eye on rasdaman and see if there are popular datasets of interest to marine ecologists that are served by it.

Now to answer your original question. We are not planning to build a general-purpose multidimensional API. We're trying to be general enough to handle the majority of datasets in the marine community but specific enough that tools are easy to write. MGET will support datasets with dimensions xy, xyz, xyt, and xyzt. Virtually all gridded datasets we've encountered have those dimensions.

It will support multiband datasets. It will support regular, rectilinear, or even irregular grids, so long as the cells have four straight edges. At this time, we are not planning anything for curvilinear specifically (e.g. grok the parameters that define the grid and then efficiently perform operations using those, or understand that a side is curved). It will support datasets such as ROMS in which the z coordinate varies with x, y, and t. It will support datasets in which t is regular (e.g. daily images) or irregular (e.g. occasional snapshots, such as NOAA CoastWatch AVHRR) or climatological (e.g. mean MODIS chlorophyll concentration for each of the 12 months).
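
To illustrate the distinction between regular and irregular t, here is a minimal sketch (a hypothetical helper, not MGET code) that classifies a time axis from its timestamps. A climatological axis cannot be detected this way; it would have to be declared in the dataset's metadata.

```python
from datetime import datetime, timedelta


def classify_time_axis(times):
    """Return "regular" if consecutive timestamps are evenly spaced, else "irregular"."""
    if len(times) < 2:
        raise ValueError("need at least two timestamps")
    steps = {later - earlier for earlier, later in zip(times, times[1:])}
    if any(step <= timedelta(0) for step in steps):
        raise ValueError("timestamps must be strictly increasing")
    return "regular" if len(steps) == 1 else "irregular"


# Daily images: every step is exactly one day.
daily = [datetime(2010, 6, 1) + timedelta(days=i) for i in range(5)]

# Occasional snapshots: steps of different lengths.
snapshots = [
    datetime(2010, 6, 1, 3, 15),
    datetime(2010, 6, 1, 14, 40),
    datetime(2010, 6, 3, 2, 5),
]
```

Here classify_time_axis(daily) gives "regular" and classify_time_axis(snapshots) gives "irregular".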

If you have any more questions, feel free to contact me directly.

Best,

Jason

-----Original Message-----
From: gdal-dev-bounces at lists.osgeo.org [mailto:gdal-dev-bounces at lists.osgeo.org] On Behalf Of Michael Sumner
Sent: Monday, June 21, 2010 7:05 PM
To: gdal-dev at lists.osgeo.org
Subject: Re: [gdal-dev] How to represent multi-dimensional array

I find this categorization helpful:

http://en.wikipedia.org/wiki/Regular_grid

Any GIS will do regular and cartesian grids (but usually bound to 2D),
and with programming constructs you can handle rectilinear or
curvilinear grids of any dimension, but it's not well supported in
high-level software AFAIK. Unfortunately GIS is traditionally bound to
the regular case for 2D only - probably because globe-surface
projections were an over-arching concern for the early layers.

AFAIK, modern ArcGIS can use GDAL to read rasters, so it would be
profound if it were limited to square cells - the "ESRI ascii grid"
format is limited to square cells, though some software allows
separate XDIM/YDIM values.
http://www.gdal.org/frmt_various.html#AAIGrid

Jason: are there plans for full multi-dimensional / curvilinear /
multiband grid support in MGET?  What do you think of the rasdaman
project - any ties with MGET?

Eonfusion can represent and visualize these, but there is only a
narrow path for generating them via GDAL (auxiliary "time" files or
NetCDF arrays), making full multi-band support a bit awkward - and
then formats like GRIB add another level of complication, interleaving
bands and dimensions etc.

Cheers, Mike.

On Tue, Jun 22, 2010 at 1:17 AM, Jason Roberts <jason.roberts at duke.edu> wrote:
>> > The trick is your netCDF has to meet a bunch
>> > of constraints for ArcGIS to recognize it. It has to have square cells.
>>
>> bingo -- that was one of our key problems -- wait -- they have to be
>> "square" -- rectangular won't do? arrgg!
>
> Oops, I'm pretty sure you're right, they can be rectangular so long as the increment is constant. I recall working with some climate model output that was rectangular with a constant increment (2 deg latitude, 3 deg longitude). I checked the documentation and it confirms this.
>
> The limitation is that it cannot work with rectangular cells that have an irregular increment (what MATLAB calls "plaid", I think), or work with non-rectangular cells.
>
> Jason
>
>
> -----Original Message-----
> From: gdal-dev-bounces at lists.osgeo.org [mailto:gdal-dev-bounces at lists.osgeo.org] On Behalf Of Christopher Barker
> Sent: Saturday, June 19, 2010 5:45 PM
> To: gdal-dev at lists.osgeo.org
> Subject: Re: [gdal-dev] How to represent multi-dimensional array
>
> Thanks to Michael, Joaquim, Ivan, and Jason.
>
> I'll explore some of the tools and suggestions you made.
>
> Michael Sumner wrote:
>> NetCDF will tend to store dimensions in
>> reverse order to the natural one, and I think GDAL reverses that - but
>> you can tell by the dimension and number of your bands, and the named
>> metadata on the GDAL bands.
>
> yup -- I figured that out - it got better when we re-ordered to (t, z,
> y, x).
>
>> NetCDF cannot store multi-attribute arrays (it will store several
>> same-size, same-metadata arrays for that purpose),
>
> Actually, I think it can -- though maybe that's only for netcdf4 (based
> on hdf5), but the conventions don't suggest you do that -- particularly
> if different attributes are different data types.
>
>> Manifold reads in multiple rasters
>
>> Eonfusion will do its best to read the array in its natural state
>
>> R can read NetCDF natively or with GDAL (RNetCDF, ncdf, rgdal
>
> We're not really looking for other tools at this point -- we are doing
> visualization with IDV, which handles 4-d data in netcdf just fine. This
> part is all about getting this data into Arc.
>
> I may deal with it by figuring out what we really need to do in Arc, and
> just exporting that part of the data -- we're writing these files with
> Python anyway, so doing some pre-processing there won't be a big deal.
>
> Joaquim Luis wrote:
>> Don't know if this is what you are looking for but if those netCDF files
>> are of a similar type that one can get from the poet site
>> (http://poet.jpl.nasa.gov/), Mirone has a tool called "Aquamoto" (a tool
>> originally developed to show time stamps of tsunami propagation models)
>> that loads those files and shows their content interactively with the
>> help of a slider.
>
> Not much help for this, but it's a cool tool, thanks for the link.
>
> Jason Roberts wrote:
>> I have some experience trying to get ArcGIS to work well with time series
>> satellite imagery and 4D ocean models (e.g. HYCOM, ROMS).
>
> Exactly our situation, here.
>
>> I am part of a working group initiated
>> by ESRI and led by an ESRI program manager (Nawajish Noman) that is trying,
>> in essence, to get the community of users who use both ArcGIS and netCDF to
>> develop some Python geoprocessing tools for ArcGIS that provide more
>> functionality than out-of-box tools already in ArcGIS.
>
> cool -- do you have contact information for that project? Is there code
> anywhere we can get at it?
>
>> 1. The Make NetCDF Raster Layer tool can represent 3D netCDF variables as
>> multiband raster layers.
>
> We'll give this a try. It may be what our GIS person is using already,
> but I got a bit confused by the "Dimension Values parameter". We'll poke
> at it some more.
>
>> The trick is your netCDF has to meet a bunch
>> of constraints for ArcGIS to recognize it. It has to have square cells.
>
> bingo -- that was one of our key problems -- wait -- they have to be
> "square" -- rectangular won't do? arrgg!
>
> Anyway, the way we have it now the cells are rectangular in meters, but
> we un-project them to lat-long, so they are no longer simple
> rectangles -- I think I may change this and output in meters, with the
> projection info. But if it has to be square, we're kind of dead in the
> water...
>
>> It
>> has to adhere to the CF or COARDS conventions (I forget which versions)
>
> that we do have.
>
>> 2. Under contract to NOAA, Applied Science Associates built a couple of
>> tools that might be useful: the Environmental Data Connector (EDC) and
>> TimeSlider Extension. Download from
>> http://www.asascience.com/software/downloads/index.shtml, see other parts of
>> the website for more info. EDC was built to download multidimensional
>> OPeNDAP datasets into multiband rasters.
>
> Hmm -- did know about those tools, but didn't realize that EDC was
> OPeNDAP based -- nice to know.
>
>> TimeSlider is a UI extension to
>> help with playback of time-series data.
>
> That too -- by the way, I'm pretty sure the Coast Guard funded a bunch
> of that, for their Search and Rescue tools.
>
>> I think they can both work with
>> netCDFs directly, not just OPeNDAP.
>
> That I didn't know -- we'll try that out.
>
>> 3. If you don't want to use netCDFs, you can fake multidimensionality for
>> some scenarios by building a raster catalog with columns for the time and
>> depth.
>
> I have no idea how to do that -- can GDAL build a raster catalog?
>
>> 4. My group, the Duke University Marine Geospatial Ecology Lab, is currently
>> building 3D and 4D awareness into a collection of tools we publish, Marine
>> Geospatial Ecology Tools (MGET, see
>> http://code.nicholas.duke.edu/projects/mget), built in Python on GDAL and
>> other FOSS packages.
>
> Very cool. I'll keep an eye on that.
>
> Thanks for everyone's help,
>
> -Chris
>
>
>
> --
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
>
> Chris.Barker at noaa.gov
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/gdal-dev
>


