[GRASS-dev] some detail questions on i.segment

Fri Jun 3 01:05:52 PDT 2016

On Wed, Jun 1, 2016 at 2:24 PM, Moritz Lennert
<mlennert at club.worldonline.be> wrote:
> Hi,
>
> Another question concerning i.segment's details:
>
> IIUC, the threshold used in region-growing is normalized using a common
> denominator defined by:
>
> divisor = globals->nrows + globals->ncols;
> (BTW, why '+',  not '*' ?)
With '*', the divisor would become too large and the adjustment would
have no effect.
>
> Row and column numbers come from
>
> globals->nrows = Rast_window_rows();
> globals->ncols = Rast_window_cols();
>
> The threshold is then adjusted to take into account object size, i.e. to
> favor merging of smaller regions compared to merging of larger regions:
>
> adjthresh = pow(alpha2, 1. + (double) smaller / divisor);
>
> It is this adjusted threshold that is used to decide whether to merge
> regions or not, depending on whether their similarity is smaller than this
> adjusted threshold or not:
>
> if (compare_double(Ri_similarity, adjthresh) == -1)
>
> Ri_similarity is normalized by
>
> val /= globals->max_diff;
>
> where globals->max_diff is defined as the difference between max and min
> values in the input file, as obtained by
>
> Rast_get_fp_range_min_max(&(fp_range[n]), &min[n], &max[n]);
>
> I hope that I've understood all of this correctly.
>
> Now my question:
>
> Are nrows and ncols region-dependent, i.e. will the divisor in the
> calculation of the adjusted threshold vary depending on the region I
> defined ?

Yes, nrows and ncols come from the current region.
>
> And for max_diff: do I understand correctly that
> Rast_get_fp_range_min_max() is region-independent, i.e. that if I take
> different regions of the same image, I will always get the same max_diff ?

Yes, this way you can test settings on a small region before applying
them to a larger region.

>
> If this is correct, does this mean the region size might determine whether
> some objects (or pixels) are merged or not ?

Yes. In effect, the computational region size determines whether an
object is large or small.
>
> This would put into question the determination of a good threshold by
> testing on small regions as the same threshold might not have the same
> effect in larger regions, or ?

Indeed. There are no comments in the code (apart from "TODO: better")
explaining the reason for this adjustment. In theory it makes sense to
me to favour merging of smaller regions, or more precisely, to avoid
merging of larger regions. "Small" and "large" depend on the
computational region. When testing for the previous GSoC project on
i.segment, I did not notice drastic differences when changing the
computational region.

Markus

>
> On 20/05/16 18:40, Markus Metz wrote:
>>
>> Hi Moritz,
>>
>> On Wed, May 18, 2016 at 6:36 PM, Moritz Lennert
>> <mlennert at club.worldonline.be> wrote:
>>>
>>> Hi Markus,
>>>
>>> I'm working on potentially improving the i.segment.uspo addon and am
>>> looking
>>> at the possibility of including the goodness of fit output map somehow in
>>> the evaluation of the quality of the segmentation.
>>>
>>> For that, I need to exactly understand the goodness of fit measure.
>>>
>>> As a starter: why is the threshold parameter (globals->alpha) squared
>>> before
>>> being used in create_isegs.c (and in the calculation of the goodness of
>>> fit)
>>> ? Is it because i.segment works with the squared distance and not the
>>> actual
>>> distance ?
>>
>>
>> Yes, i.segment works with the squared distance to avoid sqrt() which
>> is slow. All that matters is if the distance is larger or smaller than
>> threshold, and this relation is the same with squared values.
>>
>>>
>>> IIUC, the worst goodness of fit measure (i.e. 1 - difference) is equal to
>>> the 1 - threshold parameter value. This thus means that if one would want
>>> to
>>> compare segmentations done with different threshold values by comparing
>>> mean
>>> goodness of fit, for example, this would have to be scaled taking into
>>> account the respective parameter value. Would something like
>>>
>>> ( goodness of fit - (1 - threshold parameter value) )  / threshold
>>> parameter
>>> value
>>>
>>> make sense ?
>>
>>
>> The goodness of fit is currently 1 - similarity, where similarity is
>> obtained by comparing the current cell values to the object's mean
>> values. Similarity is in the range [0, 1]: 0 means identical, 1 means
>> maximum possible difference.
>> With the region growing algorithm, that difference can actually be
>> larger than the given threshold if a cell is included in an object and
>> subsequent growing of the object shifts the mean away.
>>
>>>
>>> BTW, in write_output.c, in the comments starting at line 82, there is
>>> mention of a globals->threshold, but there is no threshold in the
>>> globals
>>> structure... I guess this should read globals->alpha or
>>> threshold->answer,
>>> or ?
>>
>>
>> The comments starting at line 82 in write_output.c are an idea for
>> goodness of fit, the actual goodness of fit is calculated in lines 168
>> and 182.
>>
>> HTH,
>>
>> Markus
>>
>
>
> --
> Département Géosciences, Environnement et Société
> Université Libre de Bruxelles
> Bureau: S.DB.6.138
> CP 130/03
> Av. F.D. Roosevelt 50
> 1050 Bruxelles
> Belgique
>
> tél. + 32 2 650.68.12 / 68.11 (secr.)
> fax  + 32 2 650.68.30