Re: [gdal-dev] Bigtiff question

Lucena, Ivan ivan.lucena at pmldnet.com
Fri Mar 6 08:38:40 EST 2009


Frank,

I understand that the geotiff driver is not going to do 'seek & write' for every single pixel, but since I am not giving it
a chance to manage the blocks intelligently (as you said) it is probably doing even worse than that. Like Even said, it
is probably getting to a point where it needs not only to write but also to read from the geotiff file in order to update
the tiles/strips. My solution to the problem will be to change the way I loop through the single-banded input files
so that all bands of a given block are written to the geotiff at about the same time. I don't think we need to pursue
that potential bug, but Even has my Python script anyway.
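
A minimal sketch of that reordering, assuming every input band has the same size and data type as the output. The function name copy_bands_by_scanline and its arguments are hypothetical; only GetRasterBand, ReadRaster, WriteRaster and FlushCache come from the GDAL Python bindings. The row loop is outermost, so all bands of a scanline are written back to back before moving to the next one:

```python
def copy_bands_by_scanline(src_bands, dst_ds, x_size, y_size):
    """Copy single-band inputs into a multi-band output, one scanline at a time.

    src_bands: list of opened GDAL bands, one per input file (assumed to share
               the output's size and data type).
    dst_ds:    the multi-band output dataset, e.g. created with INTERLEAVE=PIXEL.
    """
    # Look up the output bands once; GDAL band numbers start at 1.
    dst_bands = [dst_ds.GetRasterBand(i + 1) for i in range(len(src_bands))]

    for row in range(y_size):
        # Write every band for this scanline before touching the next row,
        # so the cached data for a block is complete when it gets flushed.
        for src_band, dst_band in zip(src_bands, dst_bands):
            line = src_band.ReadRaster(0, row, x_size, 1)
            dst_band.WriteRaster(0, row, x_size, 1, line)

    # Push whatever is still cached out to disk.
    dst_ds.FlushCache()
```

With 320 bands this issues many small calls, so it is only meant to show the loop order. Raising the block cache (for example with gdal.SetCacheMax) is the other factor Frank mentions.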

Thanks a lot.

My best regards,

Ivan

>  -------Original Message-------
>  From: Frank Warmerdam <warmerdam at pobox.com>
>  Subject: Re: [gdal-dev] Bigtiff question
>  Sent: Mar 05 '09 17:07
>  
>  Lucena, Ivan wrote:
>  > Yes, that runs a lot of seeks to write just a few bytes here and there.
>  
>  Ivan,
>  
>  I would note that for pixel interleaved data, access is still a whole
>  strip/tile at a time which in your case likely means a whole scanline.
>  In no case does GDAL's GTiff driver seek along to update individual
>  bytes in a pixel interleaved scanline or tile.
>  
>  > I am wondering what the geotiff driver could do to improve that; keeping tiles in memory until they are filled
>  > up for writing at once for example (?)
>  
>  GDAL will cache the blocks on a band-by-band basis (at a level
>  where it doesn't realize the underlying datastore is pixel
>  interleaved).  The actual block flushing code in the GTIFF driver
>  does ensure that all the cache data for all bands is assembled
>  and written at once if available.  So if you had a big enough
>  block cache - or if you wrote all bands for a given scanline at
>  approximately the same time - then only one write to disk would
>  take place for each block.
>  
>  But because you write "all of the first band", then all of the second
>  band and so on, you are basically triggering cache writes often and
>  preventing GDAL from doing things intelligently.
>  
>  > BTW, would it make any difference if I tile the geotiff? In that case what would be the recommended blockxsize,
>  > blockysize for 320 bands interleaved by PIXEL?
>  
>  I do not anticipate this would make much difference.  As noted,
>  the key factors affecting performance are block cache size, and
>  the order you write data.
>  
>  >>  OK, it sounds like the pixels all being zero is a bug, and
>  >>  it would be good to file a ticket demonstrating this problem.
>  >>  Hopefully a somewhat minimalist example of the problem.
>  >
>  > I think it would be very hard to send data samples, so I would suggest running a script that creates fake raster
>  > bands with all pixels as 1 on band 1, 2 on band 2, etc. Something like that perhaps:
>  >
>  > --
>  >     from osgeo import gdal
>  >
>  >     driver_tif = gdal.GetDriverByName("GTIFF")
>  >     output_dst = driver_tif.Create( output_tif, x_size, y_size, serie_count, data_type,
>  >         [ 'TILED=NO', 'INTERLEAVE=PIXEL' ])
>  >     # Fill band 1 with 1, band 2 with 2, etc. (GDAL band numbers start at 1)
>  >     for i in range(320):
>  >         output_band = output_dst.GetRasterBand( i + 1 )
>  >         output_band.Fill(i + 1)
>  >         output_band.FlushCache()
>  > --
>  
>  Well, please develop such a script, confirm it reproduces the
>  bug (ideally with a well known binary version of GDAL like the
>  OSGeo4W package) and then file the bug accordingly.  You might want
>  to check if it really needs a lot of bands to trigger the issue.
>  
>  The position I *hate* to be in is doing a lot of guessing trying
>  to reproduce a bug.
>  
>  Best regards,
>  --
>  ---------------------------------------+--------------------------------------
>  I set the clouds in motion - turn up   | Frank Warmerdam, warmerdam at pobox.com
>  light and sound - activate the windows | http://pobox.com/~warmerdam
>  and watch the world go round - Rush    | Geospatial Programmer for Rent
>  
>  


