[postgis-devel] raster: ST_Histogram - Not sure this is a bug or I just don't understand the numbers

Bborie Park dustymugs at gmail.com
Sun Dec 4 18:20:27 PST 2011


On Sun, Dec 4, 2011 at 6:17 PM, Bborie Park <dustymugs at gmail.com> wrote:
> On Sun, Dec 4, 2011 at 5:29 PM, Paragon Corporation <lr at pcorp.us> wrote:
>> I was testing out ST_Histogram and I was expecting my percentages to add
>> up to 1 in both cases.
>>
>> for example:
>> This query:
>> SELECT (st_histogram(rast,2,6, ARRAY[5,10,20,50,100,2000])).*
>> FROM  postgis_analysis_20
>> WHERE descrip = 'dbox3 2011-11-25';
>>
>> -- gives me this, which doesn't seem to jibe with the counts (for example, I
>> would expect the 15-25 bucket to be 0.95):
>>
>>  min | max  | count |        percent
>> -----+------+-------+-----------------------
>>   10 |   15 |    31 |  0.000563175583613407
>>   15 |   25 | 10557 |    0.0958942683259152
>>   25 |   45 |   220 |  0.000999182487056045
>>   45 |   95 |   135 |  0.000245253883186484
>>   95 |  195 |    48 | 4.36006903442638e-005
>>  195 | 2195 |    18 | 8.17512943954946e-007
>>
>
> I think you're seeing the effect of specifying the bin widths.  When
> the widths are specified, the percentage is not count / total but
> rather count / total / width, so each value is a proportion per unit
> of bin width.  In the link below, look at Data by Proportion.
>
> http://en.wikipedia.org/wiki/Histogram#Examples
>
> I added that ability for when bin-widths are specified, but don't know
> if it should remain due to the confusion...
>

I'm leaning towards removing the division by width...
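
For reference, here is a minimal sketch of the arithmetic (plain Python, not
the PostGIS implementation). It assumes the percent column is computed as
count / total / width, as described above, and takes the counts and widths
straight from the quoted output. The variable names are made up for
illustration.

```python
# Counts and bin widths copied from the quoted query and its output.
counts = [31, 10557, 220, 135, 48, 18]
widths = [5, 10, 20, 50, 100, 2000]

total = sum(counts)

# Plain fraction of pixels per bin: count / total.
# These add up to 1 (up to floating-point rounding).
fractions = [c / total for c in counts]

# Width-normalized value, which is what the quoted output appears to
# report: count / total / width.
densities = [c / total / w for c, w in zip(counts, widths)]

for c, w, f, d in zip(counts, widths, fractions, densities):
    print(f"width={w:5d} count={c:6d} fraction={f:.6f} per-width={d:.6e}")

print("sum of fractions:", sum(fractions))
```

Multiplying each reported percent by its bin width gives back the plain
fraction, so the 15-25 bucket works out to roughly 0.96 of the pixels,
which is close to what Regina expected.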


