[GRASS-dev] Object-based image classification in GRASS
Moritz Lennert
mlennert at club.worldonline.be
Thu Oct 31 02:09:20 PDT 2013
Hi Pietro,
On 31/10/13 00:34, Pietro Zambelli wrote:
> Hi Moritz,
>
> I'm writing some modules (in python) to basically do the same thing.
Great! Then I won't continue on that and will rather wait for your stuff.
Do you have code yet (except for i.segment.hierarchical)? Don't hesitate
to publish early.
I think once the individual elements are there, it should be quite easy
to cook up a little binding module which would allow choosing the
segmentation parameters, the variables to use for polygon
characterization, the classification algorithm, etc., and then launch the
whole process.
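Just to make that concrete, here is a rough, untested sketch of what such
a binding module could look like with grass.script; the exact parameters,
and especially the v.class.mlpy options, are assumptions from memory:

    # sketch of a wrapper chaining segmentation, characterization and
    # classification; all names and values here are illustrative only
    import grass.script as grass

    def classify_objects(group, threshold, training, output):
        # 1. segment the image group
        grass.run_command('i.segment', group=group, output='segments',
                          threshold=threshold)
        # 2. vectorize the segments to get one polygon per object
        grass.run_command('r.to.vect', input='segments', output=output,
                          type='area')
        # 3. characterize the polygons (e.g. area and compactness)
        grass.run_command('v.db.addcolumn', map=output,
                          columns='area double,compact double')
        grass.run_command('v.to.db', map=output, option='area',
                          columns='area')
        grass.run_command('v.to.db', map=output, option='compact',
                          columns='compact')
        # 4. classify, e.g. with the v.class.mlpy addon
        grass.run_command('v.class.mlpy', input=output, training=training,
                          class_column='class')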
>
> I'm trying to apply an object-based classification to a quite big area
> (the region is more than 14 billion cells).
>
> At the moment I'm working with a smaller area of "only" ~1 billion
> cells, but it is still quite challenging.
14 billion _is_ quite ambitious ;-)
I guess we should focus on getting the functionality first, and then
think about optimisation for size...
>
> To speed up the segmentation process I wrote the i.segment.hierarchical
> module [0], which splits the region into several tiles, computes the
> segments for each tile, patches all the tiles together and runs
> i.segment one last time using the patched map as seeds.
Any reason other than preference for git over svn for not putting your
module into grass-addons?
> for a region of 24k rows by 48k cols it required less than two hours to
> run and patch all the tiles, and more than 5 hours to run the "final"
> i.segment over the patched map (using only 3 iterations!).
That's still only 7 hours for the segmentation of a billion-cell image.
Not bad compared to other solutions out there...
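Just thinking out loud, for others following the thread, the tile / patch
/ reseed idea could be sketched roughly like this, with a fixed 2x2 tiling
(i.segment.hierarchical does this properly and in parallel, and also has
to take care of segment ids colliding between tiles, which is glossed over
here):

    # toy sketch of the tile / patch / reseed workflow; assumes a saved
    # region called 'full_region' and an imagery group 'mygroup'
    import grass.script as grass

    reg = grass.region()
    half_ns = reg['rows'] // 2 * reg['nsres']
    half_ew = reg['cols'] // 2 * reg['ewres']
    tiles = []
    for i in range(2):
        for j in range(2):
            tile = 'seg_tile_%d_%d' % (i, j)
            # restrict computation to one tile and segment it
            grass.run_command('g.region',
                              n=reg['n'] - i * half_ns,
                              s=reg['n'] - (i + 1) * half_ns,
                              w=reg['w'] + j * half_ew,
                              e=reg['w'] + (j + 1) * half_ew)
            grass.run_command('i.segment', group='mygroup', output=tile,
                              threshold=0.05)
            tiles.append(tile)
    # back to the full region: patch the tiles and use them as seeds
    grass.run_command('g.region', region='full_region')
    grass.run_command('r.patch', input=','.join(tiles), output='seg_patched')
    grass.run_command('i.segment', group='mygroup', output='segments_final',
                      threshold=0.05, seeds='seg_patched', iterations=3)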
> From my experience I can say that using "v.to.db" is terribly slow if
> you apply it to a vector map with more than 2.7 million areas.
> So I've developed a Python function that computes the same values, but it
> is much faster than the v.to.db module, and it should be possible to
> split the operation into several processes for a further speed-up...
> (It is still under testing.)
Does your Python module load the values into an attribute table? I
would guess that that's the slow part in v.to.db. Generally, I think
that's another field where optimization would be great (if possible):
database interaction, notably writing to tables. IIUC, in v.to.db there
is a separate update operation for each feature. I imagine that there
must be a faster way to do this...
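One obvious thing to test would be doing all updates in a single
transaction instead of committing per feature. With the default SQLite
backend, something along these lines (table, column and path names are
made up) should already be much faster:

    # sketch: batch all attribute updates into one transaction instead
    # of one update operation per feature
    import sqlite3

    # values computed beforehand, e.g. a list of (value, cat) tuples
    rows = [(1523.4, 1), (87.2, 2), (440.1, 3)]

    conn = sqlite3.connect('/path/to/mapset/sqlite/sqlite.db')
    cur = conn.cursor()
    cur.executemany("UPDATE segments SET area = ? WHERE cat = ?", rows)
    conn.commit()  # one single commit for all features
    conn.close()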
>
> On Wednesday 30 Oct 2013 21:04:22 Moritz Lennert wrote:
> > - It uses the v.class.mlpy addon module for classification, so that
> > needs to be installed. Kudos to Vaclav for that module! It currently
> > only uses the DLDA classifier. The mlpy library offers many more, and I
> > think it should be quite easy to add them. Obviously, one could also
> > simply export the attribute table of the segments and of the training
> > areas to csv files and use R to do the classification.
>
> I've extended it to use tree/k-NN/SVM machine learning from MLPY [1]
> (I also used Parzen, but the results were not good enough) and to work
> with the scikit [2] classifiers as well.
You extended v.class.mlpy? Is that code available somewhere?
>
> Scikit seems to have a larger community and should be easier to
> install than MLPY, and last but not least it seems generally faster [3].
I don't have any preferences on that. Colleagues here use R machine
learning tools.
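For anyone wanting to try the csv export route mentioned above (with
scikit instead of R), the classifier side really is only a few lines; the
column layout of the csv files below is of course hypothetical:

    # sketch: classify segments from exported csv files with scikit-learn
    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    # training.csv: one row per training segment, last column = class label
    train = np.loadtxt('training.csv', delimiter=',')
    X_train, y_train = train[:, :-1], train[:, -1]

    # segments.csv: one row per segment, same feature columns, no label
    X_all = np.loadtxt('segments.csv', delimiter=',')

    clf = KNeighborsClassifier(n_neighbors=5)
    clf.fit(X_train, y_train)
    np.savetxt('classes.csv', clf.predict(X_all).astype(int), fmt='%d')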
>
> > - Many other variables could be calculated for the segments: other
> > texture variables (possibly variables by segment, not as average of
> > pixel-based variables, cf [1]), other shape variables (cf the new work
> > of MarkusM on center lines and skeletons of polygons in v.voronoi), band
> > indices, etc. It would be interesting to hear what most people find
> > useful.
>
> I'm also working on adding a C function to the GRASS library to compute
> the barycentre and the polar second moment of area (or moment of
> inertia), which returns a number that is independent of orientation and
> dimension.
Great! I guess the more the merrier ;-)
See also [1]. Maybe it's just a small additional step to add that at the
same time?
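For reference, my understanding of that measure (to be checked against
your implementation): the polar second moment of area about the
barycentre, normalised by the squared area so that it becomes
dimensionless. For a raster segment it can be approximated cell by cell:

    # sketch of the measure as I understand it, not Pietro's C code:
    # polar second moment of a raster segment about its barycentre,
    # normalised by area**2 to make it scale-invariant
    import numpy as np

    def polar_moment(xs, ys, cell_area):
        """xs, ys: arrays of coordinates of the segment's cell centres."""
        cx, cy = xs.mean(), ys.mean()  # barycentre
        j = np.sum((xs - cx) ** 2 + (ys - cy) ** 2) * cell_area
        area = len(xs) * cell_area
        # 2*pi*J / A**2 equals 1 for a perfect disc, > 1 for other shapes
        return 2 * np.pi * j / area ** 2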
>
> > - I do the step of digitizing training areas in the wxGUI digitizer
> > using the attribute editing tool and filling in the 'class' attribute
> > for those polygons I find representative. As already mentioned in
> > previous discussions [2], I do think that it would be nice if we could
> > have an attribute editing form that is independent of the vector
> > digitizer.
>
> I use i.gui.class to generate the training vector map, then use
> this map to select the training areas, and export the final results to
> a file (at the moment only csv and npy formats are supported).
How do you do that? Do you generate training points (or small areas)
and then select the areas these points fall into?
I thought it best to select training areas among the actual polygons
coming out of i.segment.
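If you go the point route, selecting the i.segment polygons that contain
the training points should be close to a one-liner with v.select (map
names are illustrative):

    # sketch: pick, among the segmentation polygons, those that contain
    # the digitized training points
    import grass.script as grass

    grass.run_command('v.select', ainput='segments',
                      binput='training_points',
                      output='training_areas', operator='overlap')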
> Some days ago I discussed with MarkusM the idea that maybe I could do a
> GSoC next year to modify the i.segment module to automatically split the
> domain into tiles, run as a multiprocess, and then "patch" only the
> segments that are on the border of the tiles; this solution should be
> much faster than my current solution [0].
Great idea!
> Moreover, we should consider skipping the transformation of the
> segments into vector to extract the shape parameters, and instead
> extracting the shape and other parameters (mean, median, skewness, std,
> etc.) directly as a last step before freeing the memory from the
> segment structures, writing a csv/npy file.
I guess it is not absolutely necessary to go via vector. You could
always leave the option to vectorize the segments, import the parameter
file into a table and then link that table to the vector.
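Linking such a parameter file back to the vectorized segments could then
look something like this (untested; file, table and key names assumed):

    # sketch: import the csv parameter file as a table and attach it to
    # the vectorized segments
    import grass.script as grass

    grass.run_command('db.in.ogr', input='segment_params.csv',
                      output='segment_params')
    grass.run_command('v.db.connect', map='segments',
                      table='segment_params', key='cat')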
Moritz
[1] https://trac.osgeo.org/grass/ticket/2122