[gdal-dev] Convert and make ecw transparent - Part 2

Tue Apr 16 14:09:51 PDT 2013

Hi All,

I'm still working on my process to make my ECW files ready for GeoServer
with transparency enabled.
I've written about it before and got some valuable feedback that allowed me
to convert my first ECW.
I now have a second ECW that doesn't work properly with the same steps: for
this file the resulting artifacts are too big.

I have an aerial photo in ECW format which is composed of several smaller
photos (a mosaic), leaving large white areas in the corners. These white
areas need to be a single color so GeoServer can make them transparent and I
can put a second layer beneath it. I've tried this with my first ECW file
and it is working great in GeoServer.

These are the steps that I take to end up with a tiff file with correct
nodata values and compression:
*Step 1*: Convert the ecw to tiff and make all white pixels true white:
nearblack -of GTiff -white -o white.tif my.ecw
*Step 2*: Mark white pixels as transparent
gdalwarp --config GDAL_CACHEMAX 500 -wm 500 -multi -of GTiff -srcnodata
"255 255 255" -dstnodata "255 255 255" -co tiled=yes -co "BLOCKXSIZE=512"
-co "BLOCKYSIZE=512" -s_srs EPSG:28992 -overwrite white.tif nodata.tif
*Step 3*: Compress
gdal_translate -of GTiff -a_srs EPSG:28992 -co "TILED=YES" -co
"BIGTIFF=YES" -co "PROFILE=GEOTIFF" -co "BLOCKXSIZE=512" -co
"BLOCKYSIZE=512" -co "COMPRESS=JPEG" -co "JPEG_QUALITY=80" -co
"PHOTOMETRIC=YCBCR" --config GDAL_TIFF_INTERNAL_MASK YES nodata.tif
compressed.tif
*Step 4*: Add overviews
gdaladdo -r average --config COMPRESS_OVERVIEW JPEG --config
PHOTOMETRIC_OVERVIEW YCBCR --config INTERLEAVE_OVERVIEW BAND compressed.tif
2 4 8 16 32 64 128 256
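
Because the commands above are wrapped by the mail client, here is a minimal Python sketch that spells out steps 1 to 4 as argument lists. The flags and file names are copied from the steps above; the helper names `build_pipeline` and `run_pipeline` are hypothetical, and by default nothing is executed, the commands are only printed.

```python
import shlex
import subprocess

NODATA = "255 255 255"
SRS = "EPSG:28992"
LEVELS = ["2", "4", "8", "16", "32", "64", "128", "256"]


def build_pipeline(src="my.ecw"):
    """Return steps 1-4 from the post as a list of argument lists."""
    step1 = ["nearblack", "-of", "GTiff", "-white", "-o", "white.tif", src]
    step2 = [
        "gdalwarp", "--config", "GDAL_CACHEMAX", "500", "-wm", "500",
        "-multi", "-of", "GTiff",
        "-srcnodata", NODATA, "-dstnodata", NODATA,
        "-co", "tiled=yes", "-co", "BLOCKXSIZE=512", "-co", "BLOCKYSIZE=512",
        "-s_srs", SRS, "-overwrite", "white.tif", "nodata.tif",
    ]
    step3 = [
        "gdal_translate", "-of", "GTiff", "-a_srs", SRS,
        "-co", "TILED=YES", "-co", "BIGTIFF=YES", "-co", "PROFILE=GEOTIFF",
        "-co", "BLOCKXSIZE=512", "-co", "BLOCKYSIZE=512",
        "-co", "COMPRESS=JPEG", "-co", "JPEG_QUALITY=80",
        "-co", "PHOTOMETRIC=YCBCR",
        "--config", "GDAL_TIFF_INTERNAL_MASK", "YES",
        "nodata.tif", "compressed.tif",
    ]
    step4 = [
        "gdaladdo", "-r", "average",
        "--config", "COMPRESS_OVERVIEW", "JPEG",
        "--config", "PHOTOMETRIC_OVERVIEW", "YCBCR",
        "--config", "INTERLEAVE_OVERVIEW", "BAND",
        "compressed.tif", *LEVELS,
    ]
    return [step1, step2, step3, step4]


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(shlex.quote(arg) for arg in cmd))
        if not dry_run:
            # Assumes the GDAL command-line tools are on PATH.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_pipeline(build_pipeline())
```

Passing the nodata triple as a single list element avoids the shell quoting problems that the wrapped one-line commands are prone to.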

I only do step 2 because nearblack doesn't set the nodata value. Why is
that? Does it not make sense to do so, or is it technically not possible?

My input ECW file is 3.5GB; white.tif (created by nearblack) is 75GB. After
compression and adding overviews I have a 3.8GB tiff file. This tiff file
has white artifacts along the boundary between the photo data and the nodata
area.
I don't understand why. My input file for the compression (nodata.tif) has
proper nodata values set, so why doesn't the compression step retain them? I
can imagine that the compression algorithm changes some nodata values, but
isn't it possible to reset them before saving?
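
To illustrate the suspicion above, here is a small sketch in plain Python (no GDAL involved) of why an exact nodata match is fragile once pixel values have been altered slightly. The sample pixel values are invented for illustration and are not taken from the actual file. The tolerance test assumes a per-band distance check with a default of 15, which is my reading of nearblack's `-near` option; treat it as an approximation rather than nearblack's actual code.

```python
def is_exact_nodata(pixel, nodata=(255, 255, 255)):
    """True only when every band equals the nodata value exactly."""
    return tuple(pixel) == tuple(nodata)


def is_near_white(pixel, near=15):
    """True when every band is within `near` of 255 (assumed tolerance test)."""
    return all(255 - value <= near for value in pixel)


# Hypothetical pixels: pure white, white nudged by lossy compression,
# and ordinary photo content.
samples = [(255, 255, 255), (254, 255, 253), (200, 210, 190)]

for pixel in samples:
    print(pixel, is_exact_nodata(pixel), is_near_white(pixel))
```

A pixel such as (254, 255, 253) no longer matches "255 255 255" exactly, so it would be drawn as an almost-white artifact, while a tolerance test still treats it as white.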

Because of the artifacts I created a masked tiff from nodata.tif after
step 2. In this masked tiff all data that is not nodata is set to black.
Next I ran gdal_polygonize.py on this masked data and, after 7 days(!), I
ended up with a 110MB shapefile. The shapefile is this large because step 2
also set some white pixels inside the photo area to nodata, and those pixels
are converted to shapes as well.
After some vector processing I finally have the outline of my photo area as
a shapefile.
Now I clip the result of step 4 with this shapefile:
*Step 5*: Clip raster with shapefile:
gdalwarp -cutline clip.shp -cl clip -crop_to_cutline --config GDAL_CACHEMAX
500 -wm 500 -multi -of GTiff -srcnodata "255 255 255" -dstnodata "255 255
255" -co tiled=yes -co "BLOCKXSIZE=512" -co "BLOCKYSIZE=512" -s_srs
EPSG:28992 compressed.tif crop.tif

This cropped file looks great: the artifacts are gone, but the overviews
are lost.
So I call gdaladdo again:
gdaladdo -r average -ro --config COMPRESS_OVERVIEW JPEG --config
PHOTOMETRIC_OVERVIEW YCBCR --config INTERLEAVE_OVERVIEW PIXEL crop.tif 2 4
8 16 32 64 128 256
Now the artifacts are back again ;(
So I tried again without compression:
gdaladdo -r average -ro crop.tif 2 4 8 16 32 64 128 256
This results in a huge .ovr file. It is still processing, but at 25% the
size is already 7.2GB.
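
As a rough sanity check on the .ovr size, the sketch below estimates how large an uncompressed overview pyramid is relative to its base image. Each level with factor f holds about 1/f^2 of the base pixels, so levels 2 to 256 add up to roughly one third of the base. The 75GB figure is the size of white.tif quoted above and is used only as an upper bound; the uncompressed size of crop.tif is an assumption, since the cropped file should be somewhat smaller.

```python
def overview_fraction(levels):
    """Total overview pixels as a fraction of the base image pixels."""
    return sum(1.0 / (level * level) for level in levels)


levels = [2, 4, 8, 16, 32, 64, 128, 256]
base_gb = 75.0  # size of white.tif from the post, used as an upper bound

fraction = overview_fraction(levels)
estimate_gb = base_gb * fraction

print("overview fraction: %.4f" % fraction)
print("estimated uncompressed overviews: %.1f GB" % estimate_gb)
```

An uncompressed .ovr in the tens of gigabytes is therefore what the arithmetic predicts. The progress percentage does not have to scale linearly with file size, so 7.2GB at 25% is not a reliable basis for extrapolating the final size.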

Meanwhile I tried something else:
Because I already have my cutline, I tried combining all steps into one
command, using the output of step 2 as my input:
gdalwarp -cutline clip.shp -cl clip -crop_to_cutline --config GDAL_CACHEMAX
1500 -wm 500 -multi -wo "USE_OPENCL=TRUE" -of GTiff -srcnodata "255 255
255" -dstnodata "255 255 255" -co tiled=yes -co "BLOCKXSIZE=512" -co
"BLOCKYSIZE=512" -co "BIGTIFF=YES" -co "PROFILE=GEOTIFF" -co
"COMPRESS=JPEG" -co "JPEG_QUALITY=80" -co "PHOTOMETRIC=YCBCR" --config
GDAL_TIFF_INTERNAL_MASK YES -s_srs EPSG:28992 nodata.tif cropped.tif
This time I used USE_OPENCL=TRUE but it doesn't seem to work. Most
likely my build of GDAL doesn't have OpenCL enabled.
I also increased GDAL_CACHEMAX. I'm not sure what a good value would
be. I'm running GDAL v1.10 64-bit on Windows Vista 64-bit with 8GB RAM.
While running the above command I still have 3GB to spare.
The above step also results in a good tiff file of just 3.1GB, but without
overviews, so it loads very slowly.

I see a few possible improvements. The first is to use faster hardware:
I've ordered a fast SSD, which will arrive in a few days.
Hopefully this will speed up my steps.
But the best optimization would be if steps 1 and 2 could be combined and if
I didn't need to create a masked file to get a cutline.
The combined step only takes 6 hours, so that is acceptable, but it all goes
wrong when I add overviews, which introduces the artifacts again.
All the steps mentioned here took almost 4 weeks of computer processing
time, and the result is still not what I need ;(

So if anybody can help me with adding overviews to my cropped.tif without
losing compression but also without introducing artifacts, I would be very
happy.

Sorry for the verbose mail. I really would like to read your thoughts about
this.

Thanks,

Paul