[GRASS-dev] [GRASS GIS] #3198: r.stats.quantile: hardcoded max number of categories in base map
GRASS GIS
trac at osgeo.org
Mon Nov 7 02:49:03 PST 2016
#3198: r.stats.quantile: hardcoded max number of categories in base map
--------------------------+---------------------------------------
Reporter: mlennert | Owner: grass-dev@…
Type: defect | Status: new
Priority: normal | Milestone: 7.2.1
Component: Raster | Version: unspecified
Resolution: | Keywords: r.stats.quantile MAX_CATS
CPU: Unspecified | Platform: Unspecified
--------------------------+---------------------------------------
Comment (by glynn):
Replying to [ticket:3198 mlennert]:
> Is there any specific reason for this ? I would like to use
> r.stats.quantile in i.segment.stats to calculate percentiles per segment,
> but number of segments can be much higher than 1000.
The limit was added so that if someone tries to use a base map with a
million categories, it just fails quickly, rather than attempting
something which will either exhaust memory or take days to run.
For each category in the base map, it allocates a basecat structure, each
of which references several dynamically-allocated arrays. The .slots and
.slot_bins arrays are sized based upon the bins= option; the .values array
is sized to hold all of the values falling into any bin containing a
quantile; and the .quants and .bins arrays are sized according to the
number of quantiles.
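
To make the per-category cost concrete, here is a rough sketch of how the
fixed part of that allocation could be estimated. The field names follow the
description above, but the element types, the struct names and the helper
basecat_fixed_bytes() are assumptions made for illustration, not the module's
actual definitions. The .values array is left out because its size depends on
the data.

```c
#include <stddef.h>

/* Hypothetical stand-in for one bin record; the layout is an assumption. */
struct bin_sketch {
    double min, max;    /* value range covered by the bin */
    size_t base, count; /* offset into values[] and number of values held */
};

/* Hypothetical mirror of the per-category state described in the text. */
struct basecat_sketch {
    unsigned int *slots;       /* num_slots counters, sized by bins= */
    unsigned short *slot_bins; /* num_slots entries, sized by bins= */
    double *values;            /* data-dependent, not counted below */
    struct bin_sketch *bins;   /* sized by the number of quantiles */
    double *quants;            /* sized by the number of quantiles */
};

/* Bytes allocated for one category, excluding the data-dependent values[]. */
size_t basecat_fixed_bytes(size_t num_slots, size_t num_quants)
{
    return sizeof(struct basecat_sketch)
        + num_slots * (sizeof(unsigned int) + sizeof(unsigned short))
        + num_quants * (sizeof(struct bin_sketch) + sizeof(double));
}
```

Multiplying the result by the number of categories gives a lower bound on
memory use under these assumptions, which shows why the slot count dominates
once the category count gets large.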
As well as the memory consumption, almost all processing is per-category.
More categories will tend to result in less data per category, but there
are some non-trivial per-category overheads. On the other hand, sorting
the bins containing quantiles should be faster overall with more bins but
proportionally less data in each bin.
There's no fundamental reason why the limit can't be raised, or even
abolished, if you don't mind an unsuitable choice of base map resulting in
"unable to allocate" errors, or just taking forever. Consider putting a
limit on num_cats*num_slots; a map with many categories should presumably
require fewer bins (assuming that the data isn't concentrated into a
handful of categories).
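
A limit on num_cats*num_slots might look something like the sketch below. The
function names and the idea of passing the budget as a parameter are
hypothetical; nothing here is taken from the module, and a sensible budget
value would have to be chosen by whoever implements it.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if num_cats * num_slots <= max_product, else 0.
 * The comparison uses division so the product cannot overflow. */
int within_slot_budget(size_t num_cats, size_t num_slots, size_t max_product)
{
    if (num_cats == 0 || num_slots == 0)
        return 1; /* an empty product always fits */
    return num_cats <= max_product / num_slots;
}

/* Largest slot count that keeps num_cats * num_slots within max_product.
 * With no categories there is nothing to constrain, so return SIZE_MAX. */
size_t max_slots_for(size_t num_cats, size_t max_product)
{
    if (num_cats == 0)
        return SIZE_MAX;
    return max_product / num_cats;
}
```

With a budget of 100000000, for example, a base map with a million categories
would be allowed up to 100 slots per category, so a large map either fails
quickly or is told how far bins= needs to come down.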
--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3198#comment:1>
GRASS GIS <https://grass.osgeo.org>