[gdal-dev] Heuristics to classify raster data ?

Dmitriy Baryshnikov bishop.dev at gmail.com
Thu Mar 6 12:16:24 PST 2014


Hi Even,

a lot depends on what kind of imagery and maps you wish to classify.
If the maps are classical scanned paper maps and you want a fast
algorithm, the crosses of the meter or degree grid can be a good pattern
to look for. This will not work for aerial images, though, as such images
have crosses too (satellite images do not). Maybe the map frame can be a
good pattern instead.

If you have fragments of maps and images, I think some content
analysis is needed:
- clustering, e.g. http://en.wikipedia.org/wiki/K-means_clustering
- a neural network with learning
- a support vector machine, e.g. http://svmlight.joachims.org/ and
  http://en.wikipedia.org/wiki/Support_vector_machine
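
As a rough illustration of the clustering option, here is a minimal sketch
of plain k-means (Lloyd's algorithm) in Python. It assumes each raster has
already been reduced to a small feature tuple; the two features mentioned
in the comments (fraction of distinct colors, mean gradient) and the
evenly spaced initialisation are my own assumptions for the example, not
part of any library.

```python
import math


def kmeans(points, k=2, iterations=20):
    """Cluster equal-length feature tuples with plain Lloyd's k-means.

    Each point is assumed to describe one raster, e.g.
    (fraction of distinct colors, mean gradient). Hypothetical features.
    """
    # Deterministic start: pick k points spread evenly through the input.
    # (An assumption for reproducibility; k-means++ is the usual choice.)
    step = max(1, len(points) // k)
    centroids = [points[min(i * step, len(points) - 1)] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(v) / len(members) for v in zip(*members))
                )
            else:
                # Keep an empty cluster where it was.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids
```

With k=2 the two clusters still have to be named "map" and "imagery" by
looking at a few known samples; k-means itself only groups similar rasters.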

Hash comparison can also be used (it is rather fast):
- perceptual hash comparison, e.g. http://www.phash.org/
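
To show the idea of hash comparison, here is a minimal sketch of an
"average hash", which is a simpler relative of the DCT-based hash that
pHash uses and not the pHash algorithm itself. It assumes the image has
already been shrunk to a tiny grayscale grid (8x8 is common); the function
names are made up for the example.

```python
def average_hash(gray):
    """Return an integer hash of a small grayscale grid.

    gray is a 2D list of 0-255 values, assumed to be already reduced
    to a tiny size such as 8x8. Each pixel contributes one bit: 1 if it
    is brighter than the mean of the grid, 0 otherwise.
    """
    pixels = [v for row in gray for v in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for v in pixels:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Count the bits that differ between two hashes."""
    return bin(hash_a ^ hash_b).count("1")
```

A small Hamming distance to the hashes of known maps (or known imagery)
would then suggest the same class; the threshold has to be found by
experiment.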

In all cases the input images should be resized to some small size and
maybe converted to grayscale or binarized before analysis.
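
The preprocessing could look roughly like the sketch below. It works on
plain Python lists so that it is self-contained; in practice an imaging
library would do the resampling. The BT.601 luma weights, the box-average
downsampling and the threshold of 128 are assumptions chosen for the
example.

```python
def to_gray(rgb):
    """Convert a 2D list of (r, g, b) tuples to 0-255 gray values."""
    # ITU-R BT.601 luma weights (one common choice among several).
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb
    ]


def shrink(gray, out_w, out_h):
    """Box-average downsample; assumes the input is at least out_w x out_h."""
    in_h, in_w = len(gray), len(gray[0])
    out = []
    for oy in range(out_h):
        y0, y1 = oy * in_h // out_h, (oy + 1) * in_h // out_h
        row = []
        for ox in range(out_w):
            x0, x1 = ox * in_w // out_w, (ox + 1) * in_w // out_w
            block = [gray[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


def binarize(gray, threshold=128):
    """Map each gray value to 1 (at or above threshold) or 0 (below)."""
    return [[1 if v >= threshold else 0 for v in row] for row in gray]
```

For example, binarize(shrink(to_gray(pixels), 8, 8)) gives an 8x8 grid of
0/1 values that can be fed to a hash or a classifier.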

Best regards,
     Dmitry

06.03.2014 23:19, Even Rouault wrote:
> Hi,
>
> I'd be interested in an algorithm to automate the classification of raster data
> between maps (let's say rendering of OpenStreetMap data, or other digital
> maps) on one side and aerial/satellite imagery on the other side, without
> looking at metadata (bare geotiff typically). This is to help in automating
> bulk import of data from a medium and establishing a first level of
> classification.
>
> Has anyone already done that and has code and/or advice to share, or knows of a
> software project that would do that?
>
> Some ideas that came to my mind :
> - maps typically have a much smaller number of colors than imagery, but
> you may have imagery that has already been transformed to 256 colors to reduce
> storage space.
> - maps have generally a majority color (e.g. white, green), but not in all
> zones (urban zones will have more features)
> - maps have higher spatial frequency (lines, text) whereas imagery will be
> more continuous : use of gradient, and compute statistics on it ?
>
> Even
>
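
The three ideas in the quoted message (number of colors, majority color,
spatial frequency measured through gradients) can each be turned into a
number. Below is a minimal sketch under my own assumptions: pixels are
plain Python lists, and the "gradient" is simply the mean absolute
difference between neighbouring pixels. The function names are made up for
the example, and any decision thresholds would have to be tuned on real
data.

```python
from collections import Counter


def color_stats(pixels):
    """Return (distinct color count, share of the most frequent color).

    pixels is a 2D list of hashable pixel values, e.g. (r, g, b) tuples
    or palette indices.
    """
    counts = Counter(v for row in pixels for v in row)
    total = sum(counts.values())
    majority_fraction = counts.most_common(1)[0][1] / total
    return len(counts), majority_fraction


def mean_gradient(gray):
    """Mean absolute difference between adjacent gray values.

    Looks at each pixel's right and lower neighbour; sharp lines and
    text give large differences, smooth imagery gives small ones.
    """
    h, w = len(gray), len(gray[0])
    total, n = 0, 0
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                total += abs(gray[y][x + 1] - gray[y][x])
                n += 1
            if y + 1 < h:
                total += abs(gray[y + 1][x] - gray[y][x])
                n += 1
    return total / n if n else 0.0
```

These numbers could also serve as the feature values for clustering or an
SVM.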
