[gdal-dev] Re: CUDA PyCUDA and GDAL
Frank Warmerdam
warmerdam at pobox.com
Mon Dec 7 13:24:57 EST 2009
Doug_Newcomb at fws.gov wrote:
>
> Jukka,
> I remember seeing someone mention on the mailing list (I can't recall
> who at the moment) that setting GDAL_CACHEMAX close to the maximum size
> of the input files gave the best performance. I did try bumping
> GDAL_CACHEMAX up to 2000 to see what would happen (while dropping -wm
> down to 3000, since there is only 6GB of RAM on that computer), but
> none of the input files I was processing were larger than 500MB and I
> saw no increase in performance. For the -wm parameter, I just gave it
> the rest of the RAM available on the computer, and I did not benchmark
> while varying that number.
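
(A quick way to apply that rule of thumb, assuming GNU coreutils and
the same */*.tif layout as in the command below; a sketch, not a
tested recipe:)

  # Print the size (in MB) and name of the largest source tiff; set
  # GDAL_CACHEMAX near that value per the rule of thumb above.
  du -m */*.tif | sort -n | tail -1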
>
> Doug
>
>
> Doug Newcomb
> USFWS
> Raleigh, NC
> 919-856-4520 ext. 14 doug_newcomb at fws.gov
> ---------------------------------------------------------------------------------------------------------
> The opinions I express are my own and are not representative of the
> official policy of the U.S. Fish and Wildlife Service or Dept. of the
> Interior. Life is too short for undocumented, proprietary data formats.
>
> "Rahkonen Jukka" <Jukka.Rahkonen at mmmtike.fi> wrote on 12/07/2009
> 05:06 AM, to <gdal-dev at lists.osgeo.org> and <Doug_Newcomb at fws.gov>,
> subject "Re: CUDA PyCUDA and GDAL":
>
> > <Doug_Newcomb <at> fws.gov> writes:
> >
> > Hi Folks,
> > Here's the gdal command (gdal 1.6.2) I used to merge ~3500 1 meter
> > NAIP quarter quads (uncompressed GeoTIFF, in 3 UTM projections) into
> > one BigTIFF image in the USGS Albers projection. It took about 15
> > hours (on a 3 year old Intel Core2 Duo 64 bit CentOS 5.3 Linux box
> > with 6GB RAM) and created an uncompressed, tiled, BigTIFF file of
> > 485 GB. About 32 GB/hr.
> >
> > gdalwarp -t_srs "+proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=23.0
> > +lon_0=-96 +x_0=0 +y_0=0 +ellps=GRS80 +datum=NAD83 +units=m +no_defs"
> > -wo "SKIP_NOSOURCE" --config GDAL_CACHEMAX 500 -wm 5000
> > -co "TILED=YES" */*.tif /biggis/albers/nc_naip2008.tif
> >
> > In the above command:
> > -t_srs "+proj=aea ..." indicates the target projection;
> > -wo "SKIP_NOSOURCE" says don't write in areas for which the current
> > input file has no data;
> > --config GDAL_CACHEMAX 500 sets the cache memory to 500MB (set this
> > close to the maximum input file size);
> > -wm 5000 sets the warp memory to 5000MB;
> > -co "TILED=YES" creates a tiled tiff as output;
> > */*.tif uses all of the tiffs in all of the subdirectories as input
> > files (in this case there was one directory for each of the 3 UTM
> > zones); and
> > /biggis/albers/nc_naip2008.tif gives the output file name and
> > location.
> >
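
(A variation tying this to Jukka's VRT approach below: since
gdalbuildvrt requires a single source projection per VRT, one could
build one VRT per UTM-zone directory and then warp the three VRTs
together. The directory names here are hypothetical:)

  # Hypothetical layout: one directory per UTM zone, e.g. utm16/,
  # utm17/, utm18/. Each VRT then has a uniform SRS, and gdalwarp
  # takes the three VRTs as sources for the single Albers BigTIFF.
  for z in utm16 utm17 utm18; do
      gdalbuildvrt $z.vrt $z/*.tif
  done
  gdalwarp -t_srs "+proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=23.0 +lon_0=-96 +x_0=0 +y_0=0 +ellps=GRS80 +datum=NAD83 +units=m +no_defs" \
      -wo "SKIP_NOSOURCE" --config GDAL_CACHEMAX 500 -wm 5000 \
      -co "TILED=YES" utm16.vrt utm17.vrt utm18.vrt \
      /biggis/albers/nc_naip2008.tif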
> Hi Doug,
>
> I finally tried your parameters and they did work fine for me also. I
> had something like a hundred geotiffs, 400 MB each, and I was pushing
> them into a BigTIFF mosaic. I tried first with your *.tif selection and
> then again by using a virtual raster file as source, created from a
> MapServer tileindex shapefile with gdalbuildvrt. My Windows computer
> was handling about 20 GB/hour with cubic resampling (-rc) this time.
> The parameters
> -wo "SKIP_NOSOURCE" --config GDAL_CACHEMAX 500 -wm 5000
> seem to have a big influence on efficiency. I wonder if there are some
> rules of thumb for selecting values of GDAL_CACHEMAX and -wm. You said
> cachemax is good to be close to the maximum input file size; how about
> -wm?
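
(On the tileindex-to-VRT step Jukka mentions: a minimal, hypothetical
sketch with gdaltindex standing in for the MapServer tileindex tool,
and made-up file names; "location" is the default index field name:)

  # Build a tile index shapefile over the tiffs, then wrap it into a
  # single virtual mosaic that gdalwarp can read as one source.
  gdaltindex naip_index.shp *.tif
  gdalbuildvrt -tileindex location naip_mosaic.vrt naip_index.shp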
Folks,
I would note that large -wm values can be very counterproductive when
used in combination with SKIP_NOSOURCE. The problem is that the larger
the chunk size, the greater the chance that a big processing window will
intersect only a small amount of source data, in which case the whole
window ends up being processed - most of it without any real work to do.
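
So with sparse coverage it can be worth benchmarking a smaller warp
memory; an illustrative (untested) sketch, with hypothetical file names:

  # Smaller chunks give SKIP_NOSOURCE more windows that touch no
  # source data at all, which can then be skipped outright.
  gdalwarp -wo "SKIP_NOSOURCE" --config GDAL_CACHEMAX 500 -wm 500 \
      -co "TILED=YES" sparse_inputs.vrt mosaic.tif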
Best regards,
--
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up | Frank Warmerdam, warmerdam at pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush | Geospatial Programmer for Rent