[gdal-dev] help using gdal_grid ...

P Kishor punk.kish at gmail.com
Mon Jun 2 07:22:17 EDT 2008


On 6/2/08, Paul Spencer <pagameba at gmail.com> wrote:
> Thanks for the feedback, Tamas.
>
>  I took the drastic step of going and looking at the code last night :)  I'm
> not a GDAL developer, so this was a pretty scary step for me.  What I
> discovered is that (in pseudocode) the logic of gdal_grid is:
>
>  gdal_grid
>   for each column in outsize
>     for each row in outsize
>       cell = grid_function()
>
>  grid_function
>   for each point in the input
>     calculate cell value.
>
>  so in my case, even the least expensive operation is looping about
> 133,168,000,000,000 times (roughly 14.5 million points for every cell of a
> 3200 x 2900 grid), plus doing whatever calculation is required.  This
> doesn't seem efficient to me - I can only conceive of about 50 points (in
> my data set) being significant in calculating any one cell value.  Even
> several hundred points being considered would be a vast improvement.
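>
>  That is, roughly (an illustrative Python sketch of the loop structure
>  above, not the actual GDAL code; accumulate() here is imaginary):
>
>      for row in range(2900):
>          for col in range(3200):
>              value = 0.0
>              for x, y, z in points:   # all ~14.5 million points, every cell
>                  value = accumulate(value, x, y, z)
>              cells[row][col] = value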
>
>  Right now, I am considering a couple of options, since it seems unlikely I
> have the skill level to change gdal_grid:
>
>  1. all my data is split over about 1800 files, each covering a much
> smaller geographic area than I am computing for the final result.  I think
> a much more efficient option will be to process them individually and
> combine the resulting DEMs.  However, I expect this to produce noticeable
> edge effects, so I am thinking of combining each file with the eight files
> surrounding it, processing those points for a very small raster (say
> 32x32), and moving that window across the data to produce about 1800 tif
> files that can then be merged into a single DEM.
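>
>  In outline (an illustrative sketch only -- the helper functions here are
>  imaginary, just to show the shape of the plan):
>
>      for tile in tiles:                            # ~1800 input files
>          pts = load_points([tile] + neighbours(tile))  # tile + 8 around it
>          dem = grid_points(pts, bounds(tile), outsize=(32, 32))
>          write_tif(dem, tile.name + '.tif')
>      # then merge the ~1800 tifs (e.g. with gdal_merge.py) into one DEM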
>
>  2. putting all the points into a PostGIS table, selecting out windows of
> points to calculate cell values, and writing some sort of (*gasp*) Python
> script to write the individual cell values.
>
>  I just did some rough calculations on the last option: assuming 100 ms
> per query and calculation, it should complete in about 25 hours - waaay
> better!

You are absolutely on the right track. I would put the entire dataset
into SQLite, convert the grid_function() into a SQLite function, and
then update the table with cell values.

With proper INDEXes on the coordinates table, this should be a lot faster.
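
Something along these lines (a rough sketch only -- the table and column
names are made up, the 100.0 is just an example search radius, and a
second connection is used so the function can query while the UPDATE
runs):

    import sqlite3

    # Assumed (made-up) schema:
    #   points(x, y, z)   -- the ~14.5M input samples
    #   cells(cx, cy, z)  -- one row per output grid cell
    db  = sqlite3.connect('dem.db')
    qry = sqlite3.connect('dem.db')   # separate handle for reads in the UDF

    def invdist2(cx, cy, radius):
        # Inverse-distance weighting (power 2) over a small search window.
        num = den = 0.0
        rows = qry.execute(
            'SELECT x, y, z FROM points'
            ' WHERE x BETWEEN ? AND ? AND y BETWEEN ? AND ?',
            (cx - radius, cx + radius, cy - radius, cy + radius))
        for x, y, z in rows:
            d2 = (x - cx) ** 2 + (y - cy) ** 2
            if d2 == 0.0:
                return z          # a sample sits exactly on the cell
            num += z / d2         # power = 2.0, so no pow() call needed
            den += 1.0 / d2
        return num / den if den else None

    db.create_function('invdist2', 3, invdist2)
    db.execute('CREATE INDEX IF NOT EXISTS points_xy ON points(x, y)')
    db.execute('UPDATE cells SET z = invdist2(cx, cy, 100.0)')
    db.commit()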

Could you point me to where the grid_function() is located (the source
code file name and line numbers)? I am curious about this.

>
>  I think a vast improvement could be made to gdal_grid by first processing
> all the input points to index them spatially, then passing only a subset
> of the points to the grid function for actual processing.  I have no clue
> where to start with this in gdal/ogr, although I suspect that much of the
> functionality is already there in the code.
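>
>  For instance, binning the points into a coarse grid up front would let
>  the grid function see only nearby candidates (an illustrative sketch,
>  not existing GDAL/OGR code; assumes points is an iterable of (x, y, z)
>  tuples and that 500.0 is a sensible bin size):
>
>      from collections import defaultdict
>
>      BIN = 500.0                       # bin size in map units (a guess)
>      bins = defaultdict(list)
>      for x, y, z in points:            # one pass over all ~14.5M points
>          bins[(int(x // BIN), int(y // BIN))].append((x, y, z))
>
>      def candidates(cx, cy):
>          # points from the 3x3 block of bins around the cell centre
>          i, j = int(cx // BIN), int(cy // BIN)
>          for di in (-1, 0, 1):
>              for dj in (-1, 0, 1):
>                  yield from bins.get((i + di, j + dj), ())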
>
>  Fortunately, I should see Frank today and will bend his ear about this :)
>
>  Cheers
>
>  Paul
>
>
>  On 2-Jun-08, at 5:17 AM, Tamas Szekeres wrote:
>
>
> > Paul,
> >
> > In recent days I've run into the same performance problem due to the
> > large number of scattered points.  I was able to get a significant
> > performance improvement by reducing the number of floating point
> > operations in the inner loop of the invdist implementation.  I've also
> > created a custom invdist for the case where the power is 2.0, so that
> > the pow() function can be eliminated entirely in that special case.
> > I've been thinking of contributing these changes back, but I'm hesitant
> > to mess things up with funky optimizations inside the code.  Maybe a
> > bug report along with a patch would be more reasonable.
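> >
> > (Illustratively -- this is not the actual GDAL source, just the shape
> > of the saving:)
> >
> >     import math
> >
> >     def weight_generic(dx, dy, power):
> >         # distance is sqrt(dx^2 + dy^2), then raised to 'power'
> >         return 1.0 / math.pow(math.hypot(dx, dy), power)
> >
> >     def weight_power2(dx, dy):
> >         # for power == 2.0, pow(sqrt(d2), 2.0) == d2, so both the
> >         # sqrt() and the pow() call drop out of the inner loop
> >         return 1.0 / (dx * dx + dy * dy)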
> >
> > Best regards,
> >
> > Tamas
> >
> >
> >
> > 2008/6/2 Paul Spencer <pagameba at gmail.com>:
> >
> > > Hi,
> > >
> > > I have a set of x,y,z data in text files.  There are about 1800
> > > individual files.  Each file has several thousand points.  The sum
> > > total is about 14.5 million entries.
> > >
> > > I would like to convert this into a DEM so I can make contours and
> > > hillshade.
> > >
> > > My first attempt was to concatenate all the files into a single file,
> > > create a VRT file, and run gdal_grid on the resulting file.  It took
> > > about 4 hours for gdal_grid (running at 99.5% of one core on my
> > > MacBook Pro) to output the first '0' in the progress monitor, so I
> > > abandoned this process :)
> > >
> > > So I would like some advice on how to tune the parameters to
> > > gdal_grid to get reasonable results in a more suitable amount of time.
> > >
> > > The data has been collected at approximately 70 meter intervals or less
> > > depending on the terrain.
> > >
> > > The area of coverage is in EPSG:2036 and is
> > >
> > > 2302989.998137,7597784.001173 - 2716502.001863,7388826.998827
> > >
> > > which is about 413000 m x 209000 m
> > >
> > > Questions:
> > >
> > > * what is a reasonable -outsize value?  Originally I thought 5900 x
> > > 3000 based on the 70 m measurement spacing, but perhaps that is way
> > > too big?
> > >
> > > * invdist seems to be the slowest algorithm based on some quick tests
> > > on individual files.  Is there much difference between average and
> > > nearest?  What values of radius1 and radius2 will work the fastest
> > > while still producing reasonable results at the -outsize above?
> > >
> > > * would it be better to convert from CSV to something else (shp?) first?
> > >
> > > * would it be better to process individual input files then run
> > > gdal_merge.py on the result?
> > >
> > > Cheers
> > >
> > > Paul
> > >
> > > __________________________________________
> > >
> > > Paul Spencer
> > > Chief Technology Officer
> > > DM Solutions Group Inc
> > > http://www.dmsolutions.ca/


-- 
Puneet Kishor http://punkish.eidesis.org/
Nelson Institute for Environmental Studies http://www.nelson.wisc.edu/
Open Source Geospatial Foundation (OSGeo) http://www.osgeo.org/

