[gdal-dev] How to decrease file size of colored raster bands?

afernandez afernandez at odyhpc.com
Thu Jun 1 10:34:33 PDT 2023


Hello Laurentiu,
The color palette is working fine. The issues seem to have been caused by using UInt16: I changed to GDT_Byte per your suggestion and everything is much smoother (this was an oversight on my part, as I recycled a snippet from somewhere and should have paid closer attention to this detail).
As for compression and overviews (for future use), I'm having some difficulty implementing them, but that is less urgent and I will keep working on it.
Thanks,
Arturo
Laurențiu Nicola via gdal-dev wrote:
Hi,
There are a couple of things you can try:
- you seem to be scaling the data to 8-bit, but saving it as UInt16; GDT_Byte should work, yielding a 50% reduction. I didn't quite understand the comment about the color, does the palette not work with 8-bit images?
- you can enable and tune compression: check out COMPRESS, PREDICTOR and the level options in https://gdal.org/drivers/raster/gtiff.html . ZSTD and DEFLATE are pretty good.
- you can enable the tiled mode (TILED on the same page as above), which sometimes yields smaller files
- if you find large files difficult to use, you can enable overview creation or use the COG driver (it's a subset of GTiff)
- finally, you can use a lossy compression format like JPEG or LERC, but it's probably not what you want
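
A minimal sketch of how the first four suggestions might fit together in Python, assuming the input is a 2-D NumPy array. The helper names `scale_to_byte` and `write_paletted_gtiff`, the blue-to-red color ramp, and the overview levels are illustrative assumptions and are not taken from the original script; `COMPRESS`, `PREDICTOR` and `TILED` are the GTiff creation options mentioned above.

```python
import numpy as np

# Creation options from the GTiff driver page; DEFLATE is lossless.
GTIFF_OPTIONS = ["COMPRESS=DEFLATE", "PREDICTOR=2", "TILED=YES"]


def scale_to_byte(data):
    """Linearly rescale an array to the range 0..255 and return it as uint8."""
    data = np.asarray(data, dtype=np.float64)
    lo, hi = data.min(), data.max()
    if hi == lo:
        # Constant input: avoid dividing by zero.
        return np.zeros(data.shape, dtype=np.uint8)
    return np.rint((data - lo) / (hi - lo) * 255.0).astype(np.uint8)


def write_paletted_gtiff(path, data, projection=None, geotransform=None):
    """Write `data` as a compressed, tiled, paletted 8-bit GeoTIFF."""
    # Imported here so that scale_to_byte can be used without GDAL installed.
    from osgeo import gdal

    gdal.UseExceptions()
    indexed = scale_to_byte(data)
    rows, cols = indexed.shape

    driver = gdal.GetDriverByName("GTiff")
    ds = driver.Create(path, cols, rows, 1, gdal.GDT_Byte,
                       options=GTIFF_OPTIONS)
    if projection is not None:
        ds.SetProjection(projection)
    if geotransform is not None:
        ds.SetGeoTransform(geotransform)

    band = ds.GetRasterBand(1)
    # Hypothetical palette: a ramp from blue (index 0) to red (index 255).
    colors = gdal.ColorTable()
    colors.CreateColorRamp(0, (0, 0, 255, 255), 255, (255, 0, 0, 255))
    band.SetRasterColorTable(colors)

    band.WriteArray(indexed)
    ds.FlushCache()

    # NEAREST keeps overview pixels as valid palette indices.
    ds.BuildOverviews("NEAREST", [2, 4])
    ds = None  # close the dataset so everything is written to disk
```

Something like `write_paletted_gtiff(_my_path_, data, _my_projection_, _my_transformation_)` would then replace the Create/WriteArray steps. Note that this sketch rounds to the nearest integer, whereas `astype(int)` truncates, so individual palette indices can differ by one.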
Regards,
Laurentiu
On Thu, Jun 1, 2023, at 17:35, afernandez wrote:
Hello,
I'm generating a raster file with GDAL. The pseudo-code (where the raster is loaded as 'var') for the colored version reads:
# Initial manipulations
dims = var.dimensions
shape = var.shape
driver_name = 'GTIFF'
driver = gdal.GetDriverByName(driver_name) 
np_dtype = var.dtype
type_code = gdal_array.NumericTypeCodeToGDALTypeCode(np_dtype)
gdal_ds = driver.Create(_my_path_, cols, rows, 1, gdal.GDT_UInt16) 
gdal_ds.SetProjection(_my_projection_)
gdal_ds.SetGeoTransform(_my_transformation_)
# Creation of the bands and scaled matrix 
band = gdal_ds.GetRasterBand(1) 
data = var[_chosen_index_]
data = ma.getdata(data)
data_scaled = np.interp(data, (data.min(), data.max()), (0, 255))
data_scaled2 = data_scaled.astype(int) # This is to rescale into integers so that it can color the layer
# *** Lines to set up the color palette ***
# Write the array to band once everything has been rescaled
band.WriteArray(data_scaled2)
gdal_ds.FlushCache() 
This works well, but the generated file is too large and difficult to work with. If I switch to a black and white representation by simply changing the 7th line to:
gdal_ds = driver.Create(_my_path_, cols, rows, 1, type_code)
The new file is less than 1% of the size of the colored one. I was wondering if there is anything intermediate (with colors but generating a smaller file) or if I would need a more radical approach, such as using a driver other than GTIFF.
Thanks.
_______________________________________________
gdal-dev mailing list
gdal-dev at lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev