[gdal-dev] Lengthy GDALRasterAttributeTable output

Mike Toews mwtoews at gmail.com
Wed Jun 1 18:54:48 EDT 2011


I'm struggling to understand why both the aux.xml file and the output
from gdalinfo are so large for my GeoTIFFs. First, running gdalinfo on
a 375 KB GeoTIFF file prints 196664 lines:

Driver: GTiff/GeoTIFF
Files: HorizB.tif
       HorizB.aux
Size is 317, 301
Coordinate System is:
...skip lines, not relevant...
Band 1 Block=317x6 Type=Float32, ColorInterp=Gray
  Min=195.724 Max=842.538
  Minimum=195.724, Maximum=842.538, Mean=440.591, StdDev=127.420
  Metadata:
    STATISTICS_MINIMUM=195.72373962402
    STATISTICS_MAXIMUM=842.53814697266
    STATISTICS_MEAN=440.59141722129
    STATISTICS_MEDIAN=434.39051816566
    STATISTICS_MODE=559.99110636022
    STATISTICS_STDDEV=127.419575971
    STATISTICS_HISTONUMBINS=65536
    STATISTICS_HISTOMIN=195.72373962402
    STATISTICS_HISTOMAX=842.53814697266
    LAYER_TYPE=athematic
STATISTICS_HISTOBINVALUES=1|0|0|0|... there are 136875 characters on
line 48 ...0|0|1|
<GDALRasterAttributeTable Row0Min="195.7237396240234"
BinSize="0.009869755204831507">
  <FieldDefn index="0">
    <Name>Histogram</Name>
    <Type>1</Type>
    <Usage>0</Usage>
  </FieldDefn>
  <Row index="0">
    <F>1</F>
  </Row>
  <Row index="1">
    <F>0</F>
  </Row>
... skip about 196600 lines ...
 <Row index="65535">
    <F>1</F>
  </Row>
</GDALRasterAttributeTable>

Note: this file was made with rgdal, and it is just a stand-alone TIFF
file with no aux.xml (I might have deleted it a while ago).
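
As a sanity check on where the bulk comes from, here is a rough sketch of the arithmetic in Python. It assumes each Row element takes three lines, as in the excerpt above, and that BinSize is the histogram range divided by the number of bins minus one. Both are guesses from the output, not something confirmed against the GDAL source.

```python
# Figures copied from the gdalinfo output above.
hist_min = 195.72373962402
hist_max = 842.53814697266
num_bins = 65536
gdalinfo_lines = 196664

# Assumption: the bin width is the histogram range divided by (bins - 1).
bin_size = (hist_max - hist_min) / (num_bins - 1)

# Assumption: each <Row> is printed on three lines (<Row>, <F>, </Row>).
lines_per_row = 3
row_lines = num_bins * lines_per_row

# Whatever is left over is the header, metadata and table definition.
other_lines = gdalinfo_lines - row_lines

print(f"bin size    ~ {bin_size:.12f}")
print(f"row lines   = {row_lines}")
print(f"other lines = {other_lines}")
```

Under those assumptions the 65536 histogram rows account for nearly all of the lines, and the computed bin width agrees with the reported BinSize to many decimal places.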

For the second part, I'm using CreateCopy in Python to produce a
similar dataset (same dimensions, also 1 band). The new GeoTIFF file
is 381 KB, but the aux.xml file is nearly 10x larger at 3704 KB
(196632 lines). The driver also prints these warnings to my console
(yes, all three lines):
Warning 1: Lost metadata writing to GeoTIFF ... too large to fit in tag.
Warning 1: Lost metadata writing to GeoTIFF ... too large to fit in tag.
Warning 1: Lost metadata writing to GeoTIFF ... too large to fit in tag.

Excerpts of the aux.xml file look like this:
<PAMDataset>
  <PAMRasterBand band="1">
    <GDALRasterAttributeTable Row0Min="195.7237396240234"
BinSize="0.009869755204831507">
      <FieldDefn index="0">
        <Name>Histogram</Name>
        <Type>1</Type>
        <Usage>0</Usage>
      </FieldDefn>
      <Row index="0">
        <F>1</F>
      </Row>
... skip about 196600 lines ...
      <Row index="65535">
        <F>1</F>
      </Row>
    </GDALRasterAttributeTable>
    <Metadata>
      <MDI key="STATISTICS_MINIMUM">195.72373962402</MDI>
      <MDI key="STATISTICS_MAXIMUM">842.53814697266</MDI>
      <MDI key="STATISTICS_MEAN">440.59141722129</MDI>
      <MDI key="STATISTICS_MEDIAN">434.39051816566</MDI>
      <MDI key="STATISTICS_MODE">559.99110636022</MDI>
      <MDI key="STATISTICS_STDDEV">127.419575971</MDI>
      <MDI key="STATISTICS_HISTONUMBINS">65536</MDI>
      <MDI key="STATISTICS_HISTOMIN">195.72373962402</MDI>
      <MDI key="STATISTICS_HISTOMAX">842.53814697266</MDI>
      <MDI key="LAYER_TYPE">athematic</MDI>
      <MDI key="STATI|0|1|... there are 136890 characters on line
196629...0|0|1|</MDI>
    </Metadata>
  </PAMRasterBand>
</PAMDataset>
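
To inspect a file like this without scrolling through it, a small sketch using only the Python standard library is shown here. The `SAMPLE_PAM` string is a hypothetical, heavily shortened stand-in for the excerpt above (three rows instead of 65536), and `summarize_rat` is an illustrative helper name, not part of GDAL.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened stand-in for the aux.xml excerpt above;
# the real file has 65536 <Row> elements.
SAMPLE_PAM = """<PAMDataset>
  <PAMRasterBand band="1">
    <GDALRasterAttributeTable Row0Min="195.7237396240234" BinSize="0.009869755204831507">
      <FieldDefn index="0">
        <Name>Histogram</Name>
        <Type>1</Type>
        <Usage>0</Usage>
      </FieldDefn>
      <Row index="0"><F>1</F></Row>
      <Row index="1"><F>0</F></Row>
      <Row index="2"><F>1</F></Row>
    </GDALRasterAttributeTable>
  </PAMRasterBand>
</PAMDataset>"""


def summarize_rat(pam_xml):
    """Map each band number to (row_count, non_empty_bin_count)."""
    root = ET.fromstring(pam_xml)
    summary = {}
    for band in root.iter("PAMRasterBand"):
        rat = band.find("GDALRasterAttributeTable")
        if rat is None:
            continue  # this band carries no attribute table
        # Assumes a single integer field per row, as in the excerpt.
        counts = [int(row.findtext("F")) for row in rat.findall("Row")]
        non_empty = sum(1 for c in counts if c > 0)
        summary[band.get("band")] = (len(counts), non_empty)
    return summary


print(summarize_rat(SAMPLE_PAM))
```

For a real file, the string could be replaced by the contents of the aux.xml read from disk. The row count shows how much of the file is histogram table, and the non-empty count shows how sparse that table is.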

I can reproduce the gdalinfo output with GDAL 1.6.0 and 1.8.0, and I'm
using GDAL 1.6.0 through its Python bindings, both on MS Windows. Is
there a way to suppress the creation of the aux.xml file? Is there any
way to "fix" the GeoTIFF file so that it stops reporting meaningless
histograms?

-Mike

