[GRASS-dev] i.segment: possible to cache results of open_files() for several runs of i.segment ?

Markus Metz markus.metz.giswork at gmail.com
Thu Aug 3 01:11:43 PDT 2017


On Thu, Aug 3, 2017 at 7:02 AM, Moritz Lennert <mlennert at club.worldonline.be>
wrote:
>
> On 02/08/17 21:43, Markus Metz wrote:
>>
>> Hi Moritz,
>>
>> On Wed, Aug 2, 2017 at 2:52 PM, Moritz Lennert <mlennert at club.worldonline.be> wrote:
>>  >
>>  > Hi MarkusM,
>>  >
>>  > Working on segmentation parameter optimization with fairly large images we have stumbled upon some questions (ISTR that we've discussed this before, but I cannot find traces of that discussion). As a reminder, i.segment.uspo works by looping through a series of threshold parameter values, segmenting a series of test regions at each parameter value and then comparing the results in order to identify the "optimal" threshold.
>>  >
>>  > Two issues have popped up:
>>  >
>>  > - One approach we tried was to optimize thresholds separately for different types of morphological zones. For each type we have several polygons distributed across the image. These polygons are used as input for a mask. However, it does seem that even if most of the image is masked, open_files() takes a long time, as if it does read the entire image. Is this expected / normal ? Would it be possible to reduce the read time when most of the area is masked ?
>>
>> You could reduce the read time by zooming to the current mask with g.region zoom=current_mask
>
>
> Yes, but this doesn't help for situations where the mask areas are distributed across the entire image, so that the region will be almost as large as the original image.
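
To illustrate why zooming the region to the mask helps for a compact mask but not for a scattered one, here is a toy sketch in plain Python. It is not GRASS code: the nested-list mask and the helper names `mask_extent_fraction` and `make_mask` are hypothetical, and the only point is that the zoomed region is the bounding box of the unmasked cells.

```python
def mask_extent_fraction(mask):
    """Fraction of the full grid covered by the bounding box of unmasked cells.

    `mask` is a list of rows of booleans; True means the cell is inside the mask.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return 0.0  # nothing unmasked, so nothing would need to be read
    ncols = len(mask[0])
    cols = [c for c in range(ncols) if any(row[c] for row in mask)]
    height = rows[-1] - rows[0] + 1
    width = cols[-1] - cols[0] + 1
    return (height * width) / (len(mask) * ncols)


def make_mask(size, cells):
    """Build a size x size mask with the given (row, col) cells set to True."""
    mask = [[False] * size for _ in range(size)]
    for r, c in cells:
        mask[r][c] = True
    return mask


# Same number of unmasked cells, very different bounding boxes.
compact = make_mask(10, [(4, 4), (4, 5), (5, 4), (5, 5)])
scattered = make_mask(10, [(0, 0), (0, 9), (9, 0), (9, 9)])

compact_fraction = mask_extent_fraction(compact)
scattered_fraction = mask_extent_fraction(scattered)
print(compact_fraction, scattered_fraction)
```

With four unmasked cells in both cases, the compact mask shrinks the region to a small part of the grid, while the scattered mask leaves the bounding box equal to the whole grid.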
>
>>
>>  >
>>  > - More generally: for every i.segment call, open_files() goes through the reading of the input files and, AFAIU, checks for min/max values and creates the seglib temp files (+ possibly other operations). When segmenting the same image several times just using different thresholds, it would seem that most of what open_files() does is repeated in exactly the same manner at each call. Would it be possible to cache that information somehow and to instruct i.segment to reuse that info each time it is called on the same image and region ?
>>
>> The most time (and disk space) consuming part of open_files() is creating temporary files for the input files and the current region and the current mask. These temporary files are temporary because too many things can change between two consecutive runs of any module using the segment library. First of all, the input files could change (same name, but different cell values), then region and mask settings could change.
>
>
> Agreed, but here I'm talking about the situation where I run i.segment multiple times in a loop with exactly the same input, and only threshold value (and possibly minsize) changing. So we hoped that it would be possible to reuse the segment library files.

One problem is that the contents of the temporary files are modified at
runtime and thus cannot be reused for a new run. This saves disk space and
memory: resource requirements would double if input and output were kept
separate.
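
A toy sketch of the problem, in plain Python. This is not the i.segment algorithm or the segment library API: `merge_in_place` is a hypothetical 1-D stand-in that groups neighbouring cells and overwrites them with the group mean, just to show that a buffer modified by one run is no longer valid input for the next run.

```python
def merge_in_place(values, threshold):
    """Toy 1-D merge that overwrites its input.

    Neighbouring cells whose values differ by at most `threshold` form a
    group, and every cell of a group is replaced by the group mean.
    """
    start = 0
    for i in range(1, len(values) + 1):
        # Close the current group at the end of the data or at a large jump.
        if i == len(values) or abs(values[i] - values[i - 1]) > threshold:
            mean = sum(values[start:i]) / (i - start)
            values[start:i] = [mean] * (i - start)  # original values are lost
            start = i
    return values


pristine = [1, 2, 3, 10, 11, 20]

# First run with a small threshold modifies the buffer.
buffer = list(pristine)
merge_in_place(buffer, 1)

# Second run with a larger threshold: reusing the modified buffer gives a
# different answer than starting again from the pristine input.
reused = merge_in_place(buffer, 8)
fresh = merge_in_place(list(pristine), 8)
print(reused)
print(fresh)
```

Getting the `fresh` result requires keeping an untouched copy of the input next to the working buffer, which is the doubling of resource requirements mentioned above.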
>
>>
>>  >
>>  > Just trying to crunch larger and larger images... :-)
>>
>> As in, it's working but a bit slow?
>
>
> i.segment is definitely not slow compared to other similar software, but in this specific case of looping the accumulated time used in the phase of reading the input files grows to a significant duration.

Reading input can take some time, but I thought that most of the time is
spent on the actual segmentation, which takes substantially longer than
reading the input. I can't see a reasonable solution for creating a
permanent cache of the input data without lots of sanity checks and
increased resource requirements. I see more potential in the actual
segmentation part; maybe this could be further optimized.
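
To give an idea of the sanity checks a permanent cache would need, here is a minimal sketch in plain Python. Everything in it is hypothetical (the `cache_key` helper, the dict used for region settings, the byte strings standing in for cell data); nothing like this exists in i.segment. The key has to cover the cell values of every input, the region and the mask, but not the threshold.

```python
import hashlib
import json


def cache_key(inputs, region, mask):
    """Fingerprint everything a cached temp file would depend on.

    inputs: dict of map name -> bytes of cell data (stand-in for raster maps)
    region: dict of region settings
    mask:   bytes of mask cells, or None if no mask is set
    """
    h = hashlib.sha256()
    for name in sorted(inputs):
        h.update(name.encode())
        # Hash the cell values, not just the name: a map can keep its name
        # and still have different contents.
        h.update(hashlib.sha256(inputs[name]).digest())
    h.update(json.dumps(region, sort_keys=True).encode())
    h.update(b"mask:")
    h.update(hashlib.sha256(mask).digest() if mask is not None else b"none")
    return h.hexdigest()


region = {"n": 100, "s": 0, "e": 100, "w": 0, "res": 1}
inputs = {"band1": b"\x01\x02\x03"}

key_a = cache_key(inputs, region, None)
key_same = cache_key(dict(inputs), dict(region), None)
key_new_cells = cache_key({"band1": b"\x01\x02\x04"}, region, None)
key_new_region = cache_key(inputs, {**region, "res": 2}, None)
key_masked = cache_key(inputs, region, b"\x00\x01\x00")
print(key_a == key_same, key_a == key_new_cells)
```

Note that computing such a fingerprint means reading every input cell again, so the check itself costs a good part of what the cache was supposed to save.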

Markus M

>
>> I have one or two ideas about how to speed up i.segment a little bit, but unfortunately no time to implement them right now :-(
>
>
> Don't worry. I'll just keep sending itches, until you can't bear not scratching them ;-)
>
> Moritz