[GRASS-dev] i.segment: possible to cache results of open_files() for several runs of i.segment ?

Moritz Lennert mlennert at club.worldonline.be
Wed Aug 2 22:02:23 PDT 2017


On 02/08/17 21:43, Markus Metz wrote:
> Hi Moritz,
> 
> On Wed, Aug 2, 2017 at 2:52 PM, Moritz Lennert 
> <mlennert at club.worldonline.be> wrote:
>  >
>  > Hi MarkusM,
>  >
>  > Working on segmentation parameter optimization with fairly large 
> images we have stumbled upon some questions (ISTR that we've discussed 
> this before, but I cannot find traces of that discussion). As a 
> reminder, i.segment.uspo works by looping through a series of threshold 
> parameter values, segmenting a series of test regions at each parameter 
> value and then comparing the results in order to identify the "optimal" 
> threshold.
>  >
>  > Two issues have popped up:
>  >
>  > - One approach we tried was to optimize thresholds separately for 
> different types of morphological zones. For each type we have several 
> polygons distributed across the image. These polygons are used as input 
> for a mask. However, it does seem that even if most of the image is 
> masked, open_files() takes a long time, as if it does read the entire 
> image. Is this expected / normal ? Would it be possible to reduce the 
> read time when most of the area is masked ?
> 
> You could reduce the read time by zooming to the current mask with 
> g.region zoom=current_mask

Yes, but this doesn't help for situations where the mask areas are 
distributed across the entire image, so that the region will be almost 
as large as the original image.
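
To make this concrete, here is a minimal sketch with made-up numbers. The
image size and the three mask rectangles are purely hypothetical, and the
rectangles stand in for the real mask polygons. It compares the bounding box
that zooming to the mask would yield with the number of cells that are
actually unmasked.

```python
# Hypothetical example: an image of 10000 x 10000 cells and three small
# mask rectangles given as (row_min, col_min, row_max, col_max), max exclusive.
IMAGE_ROWS, IMAGE_COLS = 10000, 10000
mask_boxes = [
    (100, 200, 600, 700),      # near the top-left corner
    (4800, 4900, 5300, 5400),  # near the centre
    (9300, 9200, 9800, 9700),  # near the bottom-right corner
]


def bounding_box(boxes):
    """Smallest rectangle containing all boxes (the extent after zooming)."""
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


def area(box):
    """Number of cells in a (row_min, col_min, row_max, col_max) box."""
    return (box[2] - box[0]) * (box[3] - box[1])


zoomed = bounding_box(mask_boxes)
unmasked_cells = sum(area(b) for b in mask_boxes)  # boxes do not overlap here
image_cells = IMAGE_ROWS * IMAGE_COLS

zoomed_fraction = area(zoomed) / image_cells
unmasked_fraction = unmasked_cells / image_cells
print(f"zoomed region covers {zoomed_fraction:.1%} of the image")
print(f"unmasked cells cover {unmasked_fraction:.2%} of the image")
```

With these assumed values the zoomed region still spans roughly 92% of the
image, while fewer than 1% of the cells are unmasked.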

> 
>  >
>  > - More generally: for every i.segment call, open_files() goes through 
> the reading of the input files and, AFAIU, checks for min/max values and 
> creates the seglib temp files (+ possibly other operations). When 
> segmenting the same image several times just using different thresholds, 
> it would seem that most of what open_files() does is repeated in exactly 
> the same manner at each call. Would it be possible to cache that 
> information somehow and to instruct i.segment to reuse that info each 
> time it is called on the same image and region ?
> 
> The most time (and disk space) consuming part of open_files() is 
> creating temporary files for the input files and the current region and 
> the current mask. These temporary files are temporary because too many 
> things can change between two consecutive runs of any module using the 
> segment library. First of all, the input files could change (same name, 
> but different cell values), then region and mask settings could change.

Agreed, but here I'm talking about the situation where I run i.segment 
multiple times in a loop with exactly the same input, and only threshold 
value (and possibly minsize) changing. So we hoped that it would be 
possible to reuse the segment library files.
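
The loop described here can be sketched roughly as follows. The group name,
output prefix and threshold values are hypothetical, and the helper only
builds the i.segment argument lists instead of running them (inside a GRASS
session each list could be passed to subprocess.run()). It is not how
i.segment.uspo is actually implemented, just an illustration that every call
gets the same input and differs only in threshold and output name.

```python
def build_segment_commands(group, thresholds, minsize=1, prefix="seg"):
    """Return one i.segment argument list per threshold value.

    group, prefix and the threshold values are hypothetical placeholders;
    group=, output=, threshold= and minsize= are i.segment options.
    """
    commands = []
    for i, threshold in enumerate(thresholds):
        commands.append([
            "i.segment",
            f"group={group}",          # same input group in every run
            f"output={prefix}_{i}",    # one output raster per threshold
            f"threshold={threshold}",  # the only parameter that varies
            f"minsize={minsize}",
        ])
    return commands


cmds = build_segment_commands("my_group", [0.01, 0.02, 0.03], minsize=10)
for cmd in cmds:
    print(" ".join(cmd))
```

Each of these calls would go through open_files() again for identical input,
region and mask, which is where the repeated read time comes from.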

> 
>  >
>  > Just trying to crunch larger and larger images... :-)
> 
> As in, it's working but a bit slow?

i.segment is definitely not slow compared to other similar software, but 
in this specific case of looping, the accumulated time spent reading the 
input files becomes significant.

> I have one or two ideas about how to speed up i.segment a little bit, 
> but unfortunately no time to implement them right now :-(

Don't worry. I'll just keep sending itches, until you can't bear not 
scratching them ;-)

Moritz

