[gdal-dev] help using gdal_grid ...

Mon Jun 2 07:29:13 EDT 2008

Paul,

On Sun, Jun 01, 2008 at 09:16:08PM -0400, Paul Spencer wrote:
> I have a set of x,y,z data in text files.  There are about 1800  
> individual files.  Each file has several thousand points.  The sum  
> total is about 14.5 million entries.

For every pixel of the output raster the whole set of input points
is scanned, so the best way to go is to split your area into smaller
tiles. If you have one XYZ file per tile, you can combine the 9 XYZ
files (the tile plus its 8 neighbours) for each output raster tile and
run gdal_grid on that smaller set, then gdal_merge the resulting raster
tiles. This should be easy to script; the only tricky part is the
region borders.
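
A minimal Python sketch of that tiling step, assuming the tiles form a
regular n_cols x n_rows layout and that the files follow a hypothetical
tile_<col>_<row>.xyz naming scheme (neither is stated above, so adjust
both to your data). It only collects and concatenates the input files;
running gdal_grid and gdal_merge on the results is left out.

```python
from pathlib import Path


def neighbour_tiles(col, row, n_cols, n_rows):
    """Return the (col, row) indices of the 3x3 block centred on a tile.

    Indices falling outside the region are dropped, so border tiles
    get 6 neighbours (edges) or 4 (corners) instead of 9.
    """
    return [
        (c, r)
        for r in range(row - 1, row + 2)
        for c in range(col - 1, col + 2)
        if 0 <= c < n_cols and 0 <= r < n_rows
    ]


def combine_xyz(col, row, n_cols, n_rows, src_dir, dst_path):
    """Concatenate the XYZ files of a tile and its neighbours into one file."""
    with open(dst_path, "w") as dst:
        for c, r in neighbour_tiles(col, row, n_cols, n_rows):
            # Hypothetical naming scheme: tile_<col>_<row>.xyz
            src = Path(src_dir) / f"tile_{c}_{r}.xyz"
            if not src.exists():
                continue  # tolerate missing tiles
            text = src.read_text()
            if text and not text.endswith("\n"):
                text += "\n"  # keep records from running together
            dst.write(text)
```

Each combined file would then be gridded over the extent of its central
tile only, so that neighbouring output rasters do not overlap.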

> * what is a reasonable -outsize value?  Originally I thought 5900 x
> 3000 based on the 70 m per measurement thing, but perhaps that is way  
> too big?

That depends entirely on your final purpose and the resolution you
expect. If you want the best possible resolution, then choose a
step that is close to the average distance between your input points
(someday the computation of this metric will be added to the list of
gdal_grid features :-).
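
Until then, a rough estimate is easy to compute by hand. The sketch
below assumes the points are spread more or less evenly over their
bounding box, so the mean spacing is about sqrt(area / number of
points); this is a simplification, not what gdal_grid does internally.
The function names are made up for this example.

```python
import math


def mean_spacing(xmin, xmax, ymin, ymax, n_points):
    """Estimate the average point spacing from the bounding box and point count.

    Assumes a roughly uniform distribution of points over the box.
    """
    area = (xmax - xmin) * (ymax - ymin)
    return math.sqrt(area / n_points)


def outsize_for(xmin, xmax, ymin, ymax, step):
    """Return (xsize, ysize) in pixels for a given pixel step."""
    xsize = math.ceil((xmax - xmin) / step)
    ysize = math.ceil((ymax - ymin) / step)
    return xsize, ysize
```

For example, if the extent really is about 5900 x 70 m by 3000 x 70 m
(an assumption derived from the numbers you quoted), 14.5 million
points give a spacing of roughly 77 m, which is in the same range as
your 70 m figure.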

> * invdist seems to be the slowest algorithm based on some quick tests  
> on individual files.  Is there much difference between average and  
> nearest?  What values of radius1 and radius2 will work the fastest  
> while still producing reasonable results of the -outsize above?

I have created a preliminary version of the GDAL Grid tutorial. It
does not contain many examples yet, but the basic information is already
there:

 http://www.gdal.org/grid_tutorial.html

Nearest Neighbour is quite different from Moving Average. Actually,
the best use of NN is to convert an XYZ array that was created from a
regular grid back into that grid. If you know that your points are
located near the grid nodes, then NN will do the job for you.
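
To illustrate the idea, here is a brute-force nearest-neighbour sketch
in plain Python. It is not the GDAL implementation, only a small model
of it: every node looks at every input point, which is also why the
cost grows so quickly with the number of points. Row 0 is placed at y0
here, which is an arbitrary choice for the example.

```python
def nearest_neighbour_grid(points, x0, y0, step, n_cols, n_rows):
    """Assign to each grid node the z value of its nearest input point.

    points is a list of (x, y, z) tuples; node (c, r) is located at
    (x0 + c * step, y0 + r * step).
    """
    grid = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            nx, ny = x0 + c * step, y0 + r * step
            # Scan the whole point set for the smallest squared distance.
            _, z = min(((x - nx) ** 2 + (y - ny) ** 2, z) for x, y, z in points)
            row.append(z)
        grid.append(row)
    return grid


# Four points scattered slightly around the nodes of a 2x2 grid.
samples = [(0.1, 0.0, 1.0), (9.9, 0.2, 2.0), (0.0, 10.1, 3.0), (10.2, 9.8, 4.0)]
restored = nearest_neighbour_grid(samples, 0.0, 0.0, 10.0, 2, 2)
# restored == [[1.0, 2.0], [3.0, 4.0]]
```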

> * would it be better to convert from CSV to something else (shp?)
> first?

No way. But I think that importing the data into a database, combined
with the tiling approach suggested above, may help. Spatial filtering
can be done efficiently inside the DB.
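
As a sketch of what I mean, here is the filtering step done with
SQLite through Python's standard sqlite3 module. The choice of SQLite,
the table name and the column names are only assumptions for the
example; any database with an index on the coordinates would serve.

```python
import sqlite3


def load_points(conn, points):
    """Store (x, y, z) tuples in an indexed table (hypothetical schema)."""
    conn.execute("CREATE TABLE IF NOT EXISTS points (x REAL, y REAL, z REAL)")
    conn.execute("CREATE INDEX IF NOT EXISTS points_xy ON points (x, y)")
    conn.executemany("INSERT INTO points VALUES (?, ?, ?)", points)
    conn.commit()


def points_in_bbox(conn, xmin, xmax, ymin, ymax):
    """Return the points inside a bounding box, e.g. a tile plus a margin."""
    cur = conn.execute(
        "SELECT x, y, z FROM points "
        "WHERE x BETWEEN ? AND ? AND y BETWEEN ? AND ? "
        "ORDER BY x, y",
        (xmin, xmax, ymin, ymax),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
load_points(conn, [(0, 0, 1), (5, 5, 2), (20, 20, 3)])
subset = points_in_bbox(conn, -1, 10, -1, 10)  # the two points inside the box
```

The subset for each tile can then be written out as a small XYZ file
and gridded on its own.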

> * would it be better to process individual input files then run  
> gdal_merge.py on the result?

Yes, I think so.

Best regards,
Andrey

-- 
Andrey V. Kiselev
ICQ# 26871517