[GRASS-dev] Re: [GRASS-user] problems using r.proj with large data set

Glynn Clements glynn at gclements.plus.com
Wed Dec 13 14:27:17 EST 2006


Morten Hulden wrote:

> >>> Is this a problem with large files that I will just have to work around or
> >>> is it something to do with my setup?
> >> Probably the same, very old issue:
> >> http://intevation.de/rt/webrt?serial_num=241
> > 
> > I looked into this a while ago. Unfortunately, you can't use rowio (or
> > a home-grown equivalent), as libgis doesn't allow the projection to be
> > changed while maps are open. So, you have to read the entire input
> > map, close it, change the projection, then write the output map.
> > 
> > To get around the memory issues, you would first need to copy the
> > relevant portion of the input map to a temporary file, then use a
> > cache backed by that file.
> > 
> > The segment library would do the job, although it could add a
> > significant performance overhead.
> 
> Are you sure it would help for all possible projections? The way r.proj 
> works -- reverse-projecting cell-by-cell into the input map and looking 
> up the value of the nearest neighbor cell (default method) -- it seems 
> difficult to always have the right "relevant part" of the input map in 
> memory. Between two cylindrical projections perhaps, but for other types 
> wouldn't there be a lot of swapping of the "relevant parts" in and out 
> of memory?

A tile-based cache will work for all practical cases; you just need to
ensure that the cache is large enough. In practice, this means that
you need enough space for all of the tiles corresponding to a single
output row.

Even in the worst (practical) cases, you will only need memory
proportional to the size (width or height) of the source map, rather
than its area.
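
To put rough numbers on that, here is a minimal sketch in C. It assumes
square tiles, and it assumes that the reverse-projected path of one output
row moves monotonically across the source map, so that it crosses at most
(tiles across + tiles down - 1) tiles. That bound and all of the function
names are illustrative assumptions, not anything taken from r.proj or
libgis.

```c
#include <stddef.h>

/* Number of tiles needed to cover n cells with tiles of side tile_size. */
static size_t tiles_across(size_t n, size_t tile_size)
{
    return (n + tile_size - 1) / tile_size;
}

/*
 * Upper bound on the tiles touched by one output row.
 * ASSUMPTION: the row's path through the source map is monotone in both
 * x and y, so it can cross at most (columns + rows - 1) tiles.
 */
static size_t worst_case_tiles(size_t width, size_t height, size_t tile_size)
{
    return tiles_across(width, tile_size)
         + tiles_across(height, tile_size) - 1;
}

/* Cells the cache must hold to keep all of those tiles in memory. */
static size_t cache_cells(size_t width, size_t height, size_t tile_size)
{
    return worst_case_tiles(width, height, tile_size) * tile_size * tile_size;
}
```

For a 40000 x 40000 source map with 64 x 64 tiles, this gives 1249 tiles,
or about 5.1 million cells, against 1.6 billion cells for the whole map.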

> Hmm... I guess this is what you mean by performance overhead.

No. I was referring specifically to the design of the segment library,
in particular:

1. Tiles can be any size. The size isn't fixed at compile time, and
isn't constrained to powers of two, so you have to perform two
division/remainder operations to convert an (x,y) pair to a segment
number and offset. If you use powers of two, you only need to use
shift/mask operations, which are faster; using a fixed power of two
would be faster still.

2. Each cell lookup requires a function call. It may be more efficient
to implement the "fast path" (where the requested cell is already in
memory) as a macro.
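
Both points can be sketched in C roughly as follows. This is not the
segment library's code or API: every name here is hypothetical, the tile
size of 64 x 64 is an arbitrary choice, coordinates are assumed to be
non-negative, the cache holds a single tile to keep the example short, and
source_value() is a stand-in for reading the temporary file.

```c
#include <stddef.h>

/* 1a. Arbitrary tile size: one division and one remainder per axis. */
static void locate_divmod(int x, int y, int tile_w, int tile_h,
                          int tiles_per_row, int *tile, int *offset)
{
    *tile = (y / tile_h) * tiles_per_row + (x / tile_w);
    *offset = (y % tile_h) * tile_w + (x % tile_w);
}

/* 1b. Fixed power-of-two tile size: same mapping with shifts and masks. */
#define TILE_SHIFT 6                    /* 64 x 64 cells (assumed) */
#define TILE_SIZE  (1 << TILE_SHIFT)
#define TILE_MASK  (TILE_SIZE - 1)

#define TILE_OF(c, x, y) \
    ((((y) >> TILE_SHIFT) * (c)->tiles_per_row) + ((x) >> TILE_SHIFT))
#define OFFSET_OF(x, y) \
    ((((y) & TILE_MASK) << TILE_SHIFT) | ((x) & TILE_MASK))

/* Stand-in for the backing store (hypothetical). */
static int source_value(int x, int y)
{
    return y * 100000 + x;
}

/* A one-tile cache, just enough to show the fast/slow split. */
typedef struct {
    int tiles_per_row;
    int cur_tile;                       /* tile held in buf, or -1 if none */
    int loads;                          /* number of slow-path tile loads */
    int buf[TILE_SIZE * TILE_SIZE];
} tile_cache;

static void cache_init(tile_cache *c, int tiles_per_row)
{
    c->tiles_per_row = tiles_per_row;
    c->cur_tile = -1;
    c->loads = 0;
}

/* Slow path: fill buf with the requested tile, then return the cell. */
static int cache_load(tile_cache *c, int tile, int offset)
{
    int x0 = (tile % c->tiles_per_row) << TILE_SHIFT;
    int y0 = (tile / c->tiles_per_row) << TILE_SHIFT;
    int lx, ly;

    for (ly = 0; ly < TILE_SIZE; ly++)
        for (lx = 0; lx < TILE_SIZE; lx++)
            c->buf[(ly << TILE_SHIFT) | lx] = source_value(x0 + lx, y0 + ly);
    c->cur_tile = tile;
    c->loads++;
    return c->buf[offset];
}

/*
 * 2. Fast path as a macro: if the tile is already in memory, the lookup
 * is an inline array access; only a miss costs a function call.
 * Note that x and y are evaluated more than once.
 */
#define CACHE_GET(c, x, y) \
    (TILE_OF(c, x, y) == (c)->cur_tile \
        ? (c)->buf[OFFSET_OF(x, y)] \
        : cache_load((c), TILE_OF(c, x, y), OFFSET_OF(x, y)))
```

With 64 x 64 tiles and 10 tiles per row, cell (130, 70) maps to tile 12,
offset 386 under either addressing scheme; repeated lookups within that
tile never reach cache_load().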

-- 
Glynn Clements <glynn at gclements.plus.com>



