[GRASS-dev] Object-based image classification in GRASS
Moritz Lennert
mlennert at club.worldonline.be
Thu Oct 31 02:09:20 PDT 2013
Hi Pietro,
On 31/10/13 00:34, Pietro Zambelli wrote:
> Hi Moritz,
>
> I'm writing some modules (in python) to basically do the same thing.
Great! Then I won't continue on that and will rather wait for your stuff.
Do you have code yet (except for i.segment.hierarchical)? Don't hesitate
to publish early.
I think once the individual elements are there, it should be quite easy
to cook up a little binding module which would allow choosing the
segmentation parameters, the variables to use for polygon
characterization, the classification algorithm, etc., and then launch the
whole process.
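Just to make that concrete, here is a rough, untested sketch of what such
a binding module could look like with grass.script; the exact parameters,
and especially the v.class.mlpy options, are assumptions from memory:

    # sketch of a wrapper chaining segmentation, characterization and
    # classification; all names and values here are illustrative only
    import grass.script as grass

    def classify_objects(group, threshold, training, output):
        # 1. segment the image group
        grass.run_command('i.segment', group=group, output='segments',
                          threshold=threshold)
        # 2. vectorize the segments to get one polygon per object
        grass.run_command('r.to.vect', input='segments', output=output,
                          type='area')
        # 3. characterize the polygons (e.g. area and compactness)
        grass.run_command('v.db.addcolumn', map=output,
                          columns='area double,compact double')
        grass.run_command('v.to.db', map=output, option='area',
                          columns='area')
        grass.run_command('v.to.db', map=output, option='compact',
                          columns='compact')
        # 4. classify, e.g. with the v.class.mlpy addon
        grass.run_command('v.class.mlpy', input=output, training=training,
                          class_column='class')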
>
> I'm trying to apply an object-based classification to a quite big area
> (the region is more than 14 billion cells).
>
> At the moment I'm working with a smaller area of "only" ~1 billion
> cells, but it is still quite challenging.
14 billion _is_ quite ambitious ;-)
I guess we should focus on getting the functionality first, and then
think about optimisation for size...
>
> To speed up the segmentation process I wrote the i.segment.hierarchical
> module [0], which splits the region into several tiles, computes the
> segments for each tile, patches all the tiles together and runs
> i.segment one last time using the patched map as seeds.
Any reason other than preference for git over svn for not putting your
module into grass-addons?
> for a region of 24k rows by 48k cols it required less than two hours to
> run and patch all the tiles, and more than 5 hours to run the "final"
> i.segment over the patched map (using only 3 iterations!).
That's still only 7 hours for the segmentation of a billion-cell image.
Not bad compared to other solutions out there...
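Just thinking out loud, for others following the thread, the tile / patch
/ reseed idea could be sketched roughly like this, with a fixed 2x2 tiling
(i.segment.hierarchical does this properly and in parallel, and also has
to take care of segment ids colliding between tiles, which is glossed over
here):

    # toy sketch of the tile / patch / reseed workflow; assumes a saved
    # region called 'full_region' and an imagery group 'mygroup'
    import grass.script as grass

    reg = grass.region()
    half_ns = reg['rows'] // 2 * reg['nsres']
    half_ew = reg['cols'] // 2 * reg['ewres']
    tiles = []
    for i in range(2):
        for j in range(2):
            tile = 'seg_tile_%d_%d' % (i, j)
            # restrict computation to one tile and segment it
            grass.run_command('g.region',
                              n=reg['n'] - i * half_ns,
                              s=reg['n'] - (i + 1) * half_ns,
                              w=reg['w'] + j * half_ew,
                              e=reg['w'] + (j + 1) * half_ew)
            grass.run_command('i.segment', group='mygroup', output=tile,
                              threshold=0.05)
            tiles.append(tile)
    # back to the full region: patch the tiles and use them as seeds
    grass.run_command('g.region', region='full_region')
    grass.run_command('r.patch', input=','.join(tiles), output='seg_patched')
    grass.run_command('i.segment', group='mygroup', output='segments_final',
                      threshold=0.05, seeds='seg_patched', iterations=3)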
> From my experience I can say that using "v.to.db" is terribly slow if
> you apply it to a vector map with more than 2.7 million areas.
> So I've developed a Python function that computes the same values, but it
> is much faster than the v.to.db module, and it should be possible to
> split the operation into several processes for a further speed-up...
> (It is still under testing.)
Does your Python module load the values into an attribute table? I
would guess that that's the slow part in v.to.db. Generally, I think
that's another field where optimization would be great (if possible):
database interaction, notably writing to tables. IIUC, in v.to.db there
is a separate update operation for each feature. I imagine that there
must be a faster way to do this...
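One obvious thing to test would be doing all updates in a single
transaction instead of committing per feature. With the default SQLite
backend, something along these lines (table, column and path names are
made up) should already be much faster:

    # sketch: batch all attribute updates into one transaction instead
    # of one update operation per feature
    import sqlite3

    # values computed beforehand, e.g. a list of (value, cat) tuples
    rows = [(1523.4, 1), (87.2, 2), (440.1, 3)]

    conn = sqlite3.connect('/path/to/mapset/sqlite/sqlite.db')
    cur = conn.cursor()
    cur.executemany("UPDATE segments SET area = ? WHERE cat = ?", rows)
    conn.commit()  # one single commit for all features
    conn.close()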
>
> On Wednesday 30 Oct 2013 21:04:22 Moritz Lennert wrote:
> > - It uses the v.class.mlpy addon module for classification, so that
> > needs to be installed. Kudos to Vaclav for that module! It currently
> > only uses the DLDA classifier. The mlpy library offers many more, and I
> > think it should be quite easy to add them. Obviously, one could also
> > simply export the attribute table of the segments and of the training
> > areas to csv files and use R to do the classification.
>
> I've extended it to use tree/k-NN/SVM machine learning from MLPY [1]
> (I also used Parzen, but the results were not good enough) and to work
> with the scikit [2] classifiers as well.
You extended v.class.mlpy? Is that code available somewhere?
>
> Scikit seems to have a larger community and should be easier to
> install than MLPY, and last but not least it seems generally faster [3].
I don't have any preferences on that. Colleagues here use R machine
learning tools.
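For anyone wanting to try the csv export route mentioned above (with
scikit instead of R), the classifier side really is only a few lines; the
column layout of the csv files below is of course hypothetical:

    # sketch: classify segments from exported csv files with scikit-learn
    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    # training.csv: one row per training segment, last column = class label
    train = np.loadtxt('training.csv', delimiter=',')
    X_train, y_train = train[:, :-1], train[:, -1]

    # segments.csv: one row per segment, same feature columns, no label
    X_all = np.loadtxt('segments.csv', delimiter=',')

    clf = KNeighborsClassifier(n_neighbors=5)
    clf.fit(X_train, y_train)
    np.savetxt('classes.csv', clf.predict(X_all).astype(int), fmt='%d')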
>
> > - Many other variables could be calculated for the segments: other
> > texture variables (possibly variables by segment, not as average of
> > pixel-based variables, cf [1]), other shape variables (cf the new work
> > of MarkusM on center lines and skeletons of polygons in v.voronoi), band
> > indices, etc. It would be interesting to hear what most people find
> > useful.
>
> I'm also working on adding a C function to the GRASS library to compute
> the barycentre and the polar second moment of area (or moment of
> inertia), which returns a number that is independent of orientation and
> dimension.
Great! I guess the more the merrier ;-)
See also [1]. Maybe it's just a small additional step to add that at the
same time?
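For reference, my understanding of that measure (to be checked against
your implementation): the polar second moment of area about the
barycentre, normalised by the squared area so that it becomes
dimensionless. For a raster segment it can be approximated cell by cell:

    # sketch of the measure as I understand it, not Pietro's C code:
    # polar second moment of a raster segment about its barycentre,
    # normalised by area**2 to make it scale-invariant
    import numpy as np

    def polar_moment(xs, ys, cell_area):
        """xs, ys: arrays of coordinates of the segment's cell centres."""
        cx, cy = xs.mean(), ys.mean()  # barycentre
        j = np.sum((xs - cx) ** 2 + (ys - cy) ** 2) * cell_area
        area = len(xs) * cell_area
        # 2*pi*J / A**2 equals 1 for a perfect disc, > 1 for other shapes
        return 2 * np.pi * j / area ** 2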
>
> > - I do the step of digitizing training areas in the wxGUI digitizer
> > using the attribute editing tool and filling in the 'class' attribute
> > for those polygons I find representative. As already mentioned in
> > previous discussions [2], I do think that it would be nice if we could
> > have an attribute editing form that is independent of the vector
> > digitizer.
>
> I use i.gui.class to generate the training vector map, then use
> this map to select the training areas, and export the final results to
> a file (at the moment only csv and npy formats are supported).
How do you do that? Do you generate training points (or small areas)
and then select the areas these points fall into?
I thought it best to select training areas among the actual polygons
coming out of i.segment.
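If you go the point route, selecting the i.segment polygons that contain
the training points should be close to a one-liner with v.select (map
names are illustrative):

    # sketch: pick, among the segmentation polygons, those that contain
    # the digitized training points
    import grass.script as grass

    grass.run_command('v.select', ainput='segments',
                      binput='training_points',
                      output='training_areas', operator='overlap')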
> Some days ago I discussed with MarkusM the idea that maybe I could do a
> GSoC next year to modify the i.segment module to automatically split the
> domain into tiles, run as a multiprocess, and then "patch" only the
> segments that are on the border of the tiles; this solution should be
> much faster than my current solution [0].
Great idea!
> Moreover, we should consider skipping the transformation of the
> segments into vector to extract the shape parameters, and instead
> extracting the shape and other parameters (mean, median, skewness, std,
> etc.) directly as a last step before freeing the memory from the
> segment structures, writing a csv/npy file.
I guess it is not absolutely necessary to go via vector. You could
always leave the option to vectorize the segments, import the parameter
file into a table and then link that table to the vector.
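Linking such a parameter file back to the vectorized segments could then
look something like this (untested; file, table and key names assumed):

    # sketch: import the csv parameter file as a table and attach it to
    # the vectorized segments
    import grass.script as grass

    grass.run_command('db.in.ogr', input='segment_params.csv',
                      output='segment_params')
    grass.run_command('v.db.connect', map='segments',
                      table='segment_params', key='cat')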
Moritz
[1] https://trac.osgeo.org/grass/ticket/2122