[GRASS-dev] i.segment: possible to cache results of open_files() for several runs of i.segment ?
Moritz Lennert
mlennert at club.worldonline.be
Wed Aug 2 22:02:23 PDT 2017
On 02/08/17 21:43, Markus Metz wrote:
> Hi Moritz,
>
> On Wed, Aug 2, 2017 at 2:52 PM, Moritz Lennert
> <mlennert at club.worldonline.be <mailto:mlennert at club.worldonline.be>> wrote:
> >
> > Hi MarkusM,
> >
> > Working on segmentation parameter optimization with fairly large
> images we have stumbled upon some questions (ISTR that we've discussed
> this before, but I cannot find traces of that discussion). As a
> reminder, i.segment.uspo works by looping through a series of threshold
> parameter values, segmenting a series of test regions at each parameter
> value and then comparing the results in order to identify the "optimal"
> threshold.
> >
> > Two issues have popped up:
> >
> > - One approach we tried was to optimize thresholds separately for
> different types of morphological zones. For each type we have several
> polygons distributed across the image. These polygons are used as input
> for a mask. However, it seems that even if most of the image is
> masked, open_files() takes a long time, as if it reads the entire
> image. Is this expected/normal? Would it be possible to reduce the
> read time when most of the area is masked?
>
> You could reduce the read time by zooming to the current mask with
> g.region zoom=current_mask
Yes, but this doesn't help for situations where the mask areas are
distributed across the entire image, so that the region will be almost
as large as the original image.
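For cases where the masked polygons *are* spatially localized, the suggested workaround could look like this (group, output, region names and the threshold value are placeholders; MASK is the raster created by r.mask):

```shell
# Save the current region so it can be restored afterwards
g.region save=full_region

# Shrink the computational region to the non-NULL cells of the mask
g.region zoom=MASK

# Run the segmentation on the reduced region only
i.segment group=mygroup output=seg_masked threshold=0.05

# Restore the original region
g.region region=full_region
```

This only pays off when the mask covers a compact part of the image; as noted above, scattered mask polygons leave the zoomed region nearly as large as the original.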
>
> >
> > - More generally: for every i.segment call, open_files() goes through
> the reading of the input files and, AFAIU, checks for min/max values and
> creates the seglib temp files (+ possibly other operations). When
> segmenting the same image several times just using different thresholds,
> it would seem that most of what open_files() does is repeated in exactly
> the same manner at each call. Would it be possible to cache that
> information somehow and to instruct i.segment to reuse that info each
> time it is called on the same image and region ?
>
> The most time (and disk space) consuming part of open_files() is
> creating temporary files for the input files and the current region and
> the current mask. These temporary files are temporary because too many
> things can change between two consecutive runs of any module using the
> segment library. First of all, the input files could change (same name,
> but different cell values), then region and mask settings could change.
Agreed, but here I'm talking about the situation where I run i.segment
multiple times in a loop with exactly the same input, with only the
threshold value (and possibly minsize) changing. So we hoped that it
would be possible to reuse the segment library files.
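For context, the loop in question is essentially of this form (group/output names, the threshold list and minsize are placeholders), so everything open_files() does before the actual segmentation is repeated identically on every iteration:

```shell
# Segment the same imagery group repeatedly, varying only the threshold;
# input rasters, region and mask are identical for every run
for thr in 0.01 0.02 0.05 0.10; do
    i.segment group=mygroup output=seg_${thr} threshold=${thr} minsize=5
done
```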
>
> >
> > Just trying to crunch larger and larger images... :-)
>
> As in, it's working but a bit slow?
i.segment is definitely not slow compared to other similar software, but
in this specific looping scenario the accumulated time spent reading the
input files grows to a significant duration.
> I have one or two ideas about how to speed up i.segment a little bit,
> but unfortunately no time to implement them right now :-(
Don't worry. I'll just keep sending itches, until you can't bear not
scratching them ;-)
Moritz