[GRASS-dev] [GRASS GIS] #3198: r.stats.quantile: hardcoded max number of categories in base map

GRASS GIS trac at osgeo.org
Mon Nov 7 05:23:12 PST 2016


#3198: r.stats.quantile: hardcoded max number of categories in base map
--------------------------+---------------------------------------
  Reporter:  mlennert     |      Owner:  grass-dev@…
      Type:  defect       |     Status:  new
  Priority:  normal       |  Milestone:  7.2.1
 Component:  Raster       |    Version:  unspecified
Resolution:               |   Keywords:  r.stats.quantile MAX_CATS
       CPU:  Unspecified  |   Platform:  Unspecified
--------------------------+---------------------------------------

Comment (by mlennert):

 Replying to [comment:1 glynn]:
 > Replying to [ticket:3198 mlennert]:
 >
 > > Is there any specific reason for this ? I would like to use
 r.stats.quantile in i.segment.stats to calculate percentiles per segment,
 but number of segments can be much higher than 1000.
 >
 > The limit was added so that if someone tries to use a base map with a
 million categories, it just fails quickly, rather than attempting
 something which will either exhaust memory or take days to run.
 >
 > For each category in the base map, it allocates a basecat structure,
 each of which references several dynamically-allocated arrays. The .slots
 and .slot_bins arrays are sized based upon the bins= option, the .values
 array is sized to hold all of the values falling into any bin containing
 a quantile, and the .quants and .bins arrays are sized according to the
 number of quantiles.
 >
 > As well as the memory consumption, almost all processing is per-
 category.
 >
 > Having said that, more categories will tend to result in less data per
 category. However, there are some non-trivial per-category overheads. On
 the other hand, sorting the bins containing quantiles should be faster
 overall with more bins but proportionally less data in each bin.
 >
 > There's no fundamental reason why the limit can't be raised; or even
 abolished, if you don't mind an unsuitable choice of base map resulting in
 "unable to allocate" errors, or just taking forever.

 A warning has been kept, so at least the user is made aware and can stop
 the module.
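
 To get a feel for the per-category cost described in the quote above, here
 is a rough sketch in C. It is not the module's actual code: the field names
 follow the description, but the element types, the bin_sketch layout and the
 function estimate_fixed_bytes() are assumptions made for illustration only.

```c
#include <stddef.h>

/* Hypothetical stand-in for one bin that contains a quantile. */
struct bin_sketch {
    unsigned long origin;   /* offset of this bin's data in the values array */
    double min, size;       /* lower bound and width of the bin */
    int base, count;        /* first index and number of stored values */
};

/* Hypothetical per-category record. Field names follow the description
 * above; the element types are assumptions. */
struct basecat_sketch {
    size_t *slots;              /* one counter per bin (bins= option) */
    unsigned short *slot_bins;  /* one bin index per bin (bins= option) */
    double *values;             /* values from bins containing a quantile */
    double *quants;             /* one entry per requested quantile */
    struct bin_sketch *bins;    /* one entry per requested quantile */
};

/* Lower bound, in bytes, on the memory needed for num_cats categories.
 * The .values array is left out because its size depends on the data.
 * No overflow checking is done; this is only meant for rough estimates. */
size_t estimate_fixed_bytes(size_t num_cats, size_t num_slots,
                            size_t num_quants)
{
    size_t per_cat = sizeof(struct basecat_sketch)
        + num_slots * (sizeof(size_t) + sizeof(unsigned short))
        + num_quants * (sizeof(double) + sizeof(struct bin_sketch));

    return num_cats * per_cat;
}
```

 Under these assumptions, on a platform where size_t is 8 bytes and unsigned
 short is 2 bytes, bins=1000 costs about 10 kB per category for the two
 bins-sized arrays alone, so roughly 10 GB for a base map with a million
 categories.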

 > Consider putting a limit on num_cats*num_slots; a map with many
 categories should presumably require fewer bins (assuming that the data
 isn't concentrated into a handful of categories).

 In r69776 MarkusM introduced dynamic bins, although I don't really
 understand what this means ;-).
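
 For reference, the num_cats*num_slots limit suggested in the quote above
 could look something like the following sketch. This is not what r69776
 does (I have not checked that); the function names and the idea of a single
 max_product budget are hypothetical.

```c
#include <stddef.h>

/* Return 1 if num_cats * num_slots <= max_product, else 0.
 * Division is used so that the product itself can never overflow. */
int within_slot_budget(size_t num_cats, size_t num_slots, size_t max_product)
{
    if (num_cats == 0 || num_slots == 0)
        return 1;
    return num_slots <= max_product / num_cats;
}

/* Reduce the requested number of bins so that num_cats * bins stays
 * within max_product. The cap itself is never lower than 1. */
size_t capped_slots(size_t num_cats, size_t requested, size_t max_product)
{
    size_t limit;

    if (num_cats == 0)
        return requested;

    limit = max_product / num_cats;
    if (limit < 1)
        limit = 1;

    return requested < limit ? requested : limit;
}
```

 With a budget of one million slots, for example, 100000 categories would be
 capped at 10 bins each, while 10 categories could keep the full 1000 bins
 requested.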

 More generally: the man page of r.stats.quantile lacks some information
 about its parameters, notably the 'bins' parameter. A short paragraph
 explaining how the module works would be useful.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3198#comment:2>
GRASS GIS <https://grass.osgeo.org>