[gdal-dev] Creating modified copies of a file

Even Rouault even.rouault at mines-paris.org
Thu Jul 7 18:34:02 EDT 2011


On Friday 08 July 2011 00:25:04, Cole, Derek wrote:
> Even, thanks for the help by the way. Hopefully I can return the favor some
> day, as it looks like I am going to be getting quite familiar with this.
> 
> I think the caching value was the issue perhaps, 

Yes, certainly. If you passed a very small value to GDALSetCacheMax() 
(smaller than the size of one block), you will run into very intensive cache 
thrashing, because there won't be enough memory to hold more than one block at 
a time. That can cause terrible performance due to a huge number of I/O 
accesses.
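
To make the arithmetic concrete, here is a minimal standalone sketch in plain
C++ (it does not call GDAL). The helper names blockBytes and blocksThatFit are
hypothetical, and the 2 bytes per sample is an assumption for 11-bit imagery
tiled as 1024x1024 blocks, as in the creation options quoted below.

```cpp
#include <cassert>
#include <cstdint>

// Size in bytes of one raster block (hypothetical helper, not a GDAL call).
inline std::uint64_t blockBytes(std::uint64_t width, std::uint64_t height,
                                std::uint64_t bytesPerSample) {
    return width * height * bytesPerSample;
}

// Number of whole blocks that fit in a cache of cacheBytes bytes.
inline std::uint64_t blocksThatFit(std::uint64_t cacheBytes,
                                   std::uint64_t oneBlockBytes) {
    assert(oneBlockBytes > 0);  // a zero-sized block would be a caller error
    return cacheBytes / oneBlockBytes;
}
```

With these assumptions one block is 2097152 bytes, so a cache limit of 40
(a number meant as megabytes but interpreted as bytes) holds zero blocks,
while 40 * 1024 * 1024 bytes holds twenty.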

And to answer the question from your previous email: if you set the 
GDAL_CACHEMAX configuration option, you must do it before the first time GDAL 
accesses it internally, so in practice, to be safe, before 
initializing GDAL with GDALAllRegister(). After the first access, the 
value is stored in an internal variable and is no longer read from the 
configuration option.
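
The read-once behaviour can be illustrated with a toy model in plain C++. This
is only a sketch of the behaviour described in this thread, not GDAL's actual
code: the class CacheMaxModel, its methods and its default value are all
hypothetical. It also mimics the heuristic mentioned in the quoted message
below, where small option values (< 100000) are multiplied by one million.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <string>

// Toy stand-in for a configuration option that is read only once.
class CacheMaxModel {
public:
    // Plays the role of setting the configuration option as a string.
    void setOption(const std::string& value) { option_ = value; }

    // The first call parses the option and latches the result;
    // later calls return the latched value and ignore option changes.
    std::int64_t cacheMaxBytes() {
        if (!latched_) {
            if (!option_.empty()) {
                std::int64_t v = std::strtoll(option_.c_str(), nullptr, 10);
                if (v < 100000) {
                    v *= 1000000;  // small values are taken as megabytes
                }
                bytes_ = v;
            }
            latched_ = true;
        }
        return bytes_;
    }

private:
    std::string option_;
    std::int64_t bytes_ = 40 * 1000000;  // placeholder default for this sketch
    bool latched_ = false;
};
```

In this model, setting the option to "512" before the first read yields
512000000 bytes, whereas setting it after the first read leaves the
placeholder default in place. That is the same reason the real option should
be set before GDALAllRegister().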

> and I just happened to get
> it working correctly in the C version. Oh well, at least if someone finds
> this thread, they can see both ways of doing it.
> 
> Derek
> ________________________________________
> From: gdal-dev-bounces at lists.osgeo.org [gdal-dev-bounces at lists.osgeo.org]
> on behalf of Cole, Derek [dcole at integrity-apps.com] Sent: Thursday, July
> 07, 2011 6:20 PM
> To: gdal-dev at lists.osgeo.org
> Subject: RE: [gdal-dev] Creating modified copies of a file
> 
> C:
> 
>         GDALDatasetH    hDataset, hOutDS;
>         const char     *pszSource = NULL, *pszDest = NULL, *pszFormat = "NITF";
>         char          **papszCreateOptions = NULL;
>         GDALProgressFunc    pfnProgress = GDALTermProgress;
>         GDALDriverH hDriver;
> 
>         hDriver = GDALGetDriverByName( pszFormat );
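>         /* copy into a named std::string: c_str() on the temporary would dangle */
>         std::string osSource = this->loadedFile.toStdString();
>         pszSource = osSource.c_str();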
>         pszDest = "testout.ntf";
>         hDataset = GDALOpenShared( pszSource, GA_ReadOnly );
> 
>     char **papszOptions = NULL;
> 
>     papszOptions = CSLSetNameValue( papszOptions, "BLOCKXSIZE", "1024" );
>     papszOptions = CSLSetNameValue( papszOptions, "BLOCKYSIZE", "1024" );
>     papszOptions = CSLSetNameValue( papszOptions, "ABPP", "11" );
> 
>     hOutDS = GDALCreateCopy( hDriver, pszDest, hDataset, FALSE,
> papszOptions, pfnProgress, NULL );
> 
> C++:
> 
>  GDALProgressFunc    pfnProgress = GDALTermProgress;
>     const char *pszFormat = "NITF";
>     GDALDataset *poNITFDS;
>     char ** papszMetadata;
>     GDALDriver *poNITFDriver;
>     GDALDataset  *poDataset;
>     const char *pszDstFilename = "test1.ntf";
> 
>     poDataset = (GDALDataset *) GDALOpenShared(
> this->loadedFile.toStdString().c_str(), GA_ReadOnly );
> 
>     poNITFDriver = (GDALDriver *)GDALGetDriverByName("NITF");
> 
>     // check for NULL before dereferencing the driver
>     if( poNITFDriver == NULL )
>         exit( 1 );
>     papszMetadata = poNITFDriver->GetMetadata();
> 
>     char **papszOptions = NULL;
> 
>     const char *srcProjection = poDataset->GetGCPProjection();
> 
>     papszOptions = CSLSetNameValue( papszOptions, "BLOCKXSIZE", "1024" );
>     papszOptions = CSLSetNameValue( papszOptions, "BLOCKYSIZE", "1024" );
>     papszOptions = CSLSetNameValue( papszOptions, "ABPP", "11" );
> 
>     poNITFDS = poNITFDriver->CreateCopy(pszDstFilename, poDataset, FALSE, 
> papszOptions, pfnProgress, NULL);
> 
> 
> Perhaps you can tell me. These are the direct copies of the two latest
> versions of the code I tried. It could very well be that if I tried the
> C++ version again with the cache storage values corrected, it would speed
> up. I may try that just to make sure.
> 
> Also, you may be right about the cache options. I noticed just now in the
> API declaration, it calls the variable nBytes, but down in the API it just
> calls it nNewSize, and I think I had read somewhere that the command line
> option was in MB, so that may be my bad. Is my assertion that it must come
> before registering drivers correct?
> 
> ________________________________________
> From: Even Rouault [even.rouault at mines-paris.org]
> Sent: Thursday, July 07, 2011 6:07 PM
> To: gdal-dev at lists.osgeo.org
> Cc: Cole, Derek
> Subject: Re: [gdal-dev] Creating modified copies of a file
> 
> On Thursday 07 July 2011 23:46:11, Cole, Derek wrote:
> > So that this may help anyone in the future - it seems like I may have
> > solved the problem for now! After looking through gdal_translate, it
> > appears that they are using the C API instead of C++ for the CreateCopy.
> > I switched my code to the C API instead of C++ just for consistency's
> > sake, and sure enough, my CreateCopy was sped up tremendously!
> 
> Really ??? That doesn't make any sense. The C function GDALCreateCopy() is
> just a one line wrapper over the C++ API. I'm 200% sure there must be
> another difference in the values of the arguments between your C and C++
> call. You should recheck...
> 
> > For the smaller test file I was playing with, the output file was not
> > viewable in my viewer until I made some adjustments, primarily, setting
> > the block size of the copy to be the same as the original. If you don't
> > set it explicitly you get row-wise blocks.
> 
> Yes, row-wise blocks are the default. Tiling must be explicitly set up
> with the BLOCKXSIZE and BLOCKYSIZE creation options.
> 
> > One other quirk - I was not able to use the GDALSetCacheMax() function ,
> 
> What do you mean by "not being able to use the GDALSetCacheMax()"? Are you
> sure you specified the value in bytes, as stated in the doc? I suspect
> you set a value in megabytes, because when the GDAL_CACHEMAX
> configuration option is read, some magic automatically multiplies
> small values (< 100000) by one million. That could explain why you managed
> to achieve the desired effect by using CPLSetConfigOption().
> 
> > and instead switched that to the CPLSetConfigOption() function, AND I had
> > to move it to be before I call GDALAllRegister() it appears, or I was
> > still getting cache thrashing warnings.
> > 
> > If I get some time, or if someone else does, it might be nice to narrow
> > down what exactly in the C++ API was causing the problem.
> > 
> > Derek
> > ________________________________________
> > From: gdal-dev-bounces at lists.osgeo.org [gdal-dev-bounces at lists.osgeo.org]
> > on behalf of Cole, Derek [dcole at integrity-apps.com] Sent: Thursday, July
> > 07, 2011 3:20 PM
> > To: gdal-dev at lists.osgeo.org
> > Subject: RE: [gdal-dev] Creating modified copies of a file
> > 
> > I actually just out of curiosity tried to open the file given by this
> > command:
> > 
> > gdal_translate dan_data.ntf dan_data_translate.ntf
> > 
> > And it turns out that dan_data_translate.ntf was actually a TIFF file
> > that I could view. I had to explicitly add the option -of NITF
> > 
> > in which case it still did the translate very fast, but I did get
> > warnings:
> > 
> > $ gdal_translate dan_data.ntf dan_data_translate.ntf -of NITF
> > Input file size is 8689, 7679
> > Warning 6: NITF only supports WGS84 geographic and UTM projections.
> > 
> > 0...10...20...30...40...50...60...70...80...90...100 - done.
> > ERROR 6: Apparently no space reserved for IGEOLO info in NITF file.
> > NITFWriteIGEOGLO() fails.
> > ERROR 6: NITF only supports WGS84 geographic and UTM projections.
> > 
> > 
> > Which jibes with what my code was giving me, and further,
> > dan_data_translate.ntf is not able to be opened using my viewer (which
> > uses GDAL to open the files I have been working with, of course, and does
> > open dan_data.ntf) because of a segmentation fault  in the RasterIO
> > method. I am able to open the gdal_translate file, and my CreateCopy()
> > file in another ELT, such as ENVI.
> > 
> > I will also make note that the original file is exactly 72.0 MB, and the
> > output file is 63.6MB, but both files say in gdalinfo they are not using
> > compression. This is the same for my CreateCopy() or gdal_translate. So
> > at least I have narrowed down that I am apparently getting the same
> > end-result as gdal_translate, albeit way slower. Finally, the output
> > file seems to have created row-blocks, instead of 1024x1024 blocks like
> > the original.
> > 
> > I will continue working to replicate gdal_translate exactly, even though
> > now I am not optimistic about getting a good result.
> > 
> > ________________________________________
> > From: gdal-dev-bounces at lists.osgeo.org [gdal-dev-bounces at lists.osgeo.org]
> > on behalf of Cole, Derek [dcole at integrity-apps.com] Sent: Thursday, July
> > 07, 2011 2:48 PM
> > To: gdal-dev at lists.osgeo.org
> > Subject: RE: [gdal-dev] Creating modified copies of a file
> > 
> > Yes, the source data is a NITF. Here are the first few lines of the
> > projection info in that source NITF:
> > 
> > Coordinate System is:
> > PROJCS["unnamed",
> >     GEOGCS["WGS 84",
> >         DATUM["WGS_1984",
> >             SPHEROID["WGS 84",6378137,298.257223563,
> >                 AUTHORITY["EPSG","7030"]],
> >             TOWGS84[0,0,0,0,0,0,0],
> >             AUTHORITY["EPSG","6326"]],
> >         PRIMEM["Greenwich",0,
> >             AUTHORITY["EPSG","8901"]],
> >         UNIT["degree",0.0174532925199433,
> >             AUTHORITY["EPSG","9108"]],
> >         AUTHORITY["EPSG","4326"]],
> >     PROJECTION["Transverse_Mercator"],
> >     PARAMETER["latitude_of_origin",0],
> >     PARAMETER["central_meridian",-75],
> >     PARAMETER["scale_factor",0.996],
> >     PARAMETER["false_easting",500000],
> >     PARAMETER["false_northing",0]]
> > 
> > I suppose I can go back through my code and change it to be exactly like
> > gdal_translate, which seems to use the C bindings instead of C++.  I did
> > end up changing my progress reporting function to the same one that is in
> > gdal_translate as well, which had no change. I will try the debugging
> > route before too much trouble, to see what I can find out.
> > 
> > Thanks for the help so far. I will report back what I find.
> > 
> > 
> > 
> > ________________________________________
> > From: Even Rouault [even.rouault at mines-paris.org]
> > Sent: Thursday, July 07, 2011 2:13 PM
> > To: gdal-dev at lists.osgeo.org
> > Cc: Cole, Derek
> > Subject: Re: [gdal-dev] Creating modified copies of a file
> > 
> > On Thursday 07 July 2011 19:52:34, Cole, Derek wrote:
> > > Well, I am about to give up on this I think and go a different route
> > > for creating a NITF from this data.
> > > 
> > > I tried a small test file, and it takes about 45s to CreateCopy() an
> > > 80 MB NITF.
> > 
> > Yes, that's very poor performance !
> > 
> > > here is the output I got after turning on CPL_DEBUG
> > > 
> > > GDAL: GDALOpen(/home/dcole/eraser/Images/dan_data.ntf, this=0x1ad7ec70) succeeds as NITF.
> > > GDAL: QuietDelete(test1.ntf) invoking Delete()
> > > GDAL: GDALOpen(test1.ntf, this=0x1ad81180) succeeds as NITF.
> > > GDAL: GDALDefaultOverviews::OverviewScan()
> > > GDAL: GDALClose(test1.ntf, this=0x1ad81180)
> > > Warning 6: NITF only supports WGS84 geographic and UTM projections.
> > > 
> > > 0GDAL: Potential thrashing on band 1 of
> > > /home/dcole/eraser/Images/dan_data.ntf.
> > > ...10...20...30...40...50...60...70...80...90...100 - done.
> > > ERROR 6: Apparently no space reserved for IGEOLO info in NITF file.
> > > NITFWriteIGEOGLO() fails.
> > > ERROR 6: NITF only supports WGS84 geographic and UTM projections.
> > > 
> > > GDAL: GDALDefaultOverviews::OverviewScan()
> > > ERROR 6: NITF only supports WGS84 geographic and UTM projections.
> > 
> > This error seems to indicate that your source dataset has a projection
> > incompatible with what NITF allows (UTM/WGS84 and LongLat/WGS84). But this
> > is weird because your source dataset seems to be a NITF itself... What
> > does gdalinfo return as the projection for
> > /home/dcole/eraser/Images/dan_data.ntf ?
> > 
> > But I don't believe it is related to performance. And I suppose you get
> > the same warnings when using gdal_translate.
> > 
> > > So to try to remedy this I changed the cachemax (programmatically)
> > > anywhere from 40MB to 3GB trying to see if any setting had any effect.
> > > It was about the same no matter what.
> > 
> > For an 80 MB file, playing with cachemax won't give any significant
> > speed-up.
> > 
> > > I tried doing a gdal_translate on this file, which only took about a
> > > second.
> > 
> > That's much more reasonable. Well, then that shows it's not an issue in
> > GDAL per se, but how you use it. Now, you "just" have to discover what
> > things you do which are different from gdal_translate.
> > 
> > > Looking through the source for that, it seems like I am doing
> > > approximately the same thing in relation to the CreateCopy(), so I am
> > > not sure what the problem is. However, my viewer which uses GDAL to
> > > read the original NITF file I am reading in will NOT read the
> > > translated NITF file that is generated.
> > 
> > I've just reviewed the code snippet you pasted yesterday and I see that
> > you use a custom progress function. Isn't there a sleep() or some slow
> > operation in it ?
> > 
> > For debugging performance issues, I start by just running the code under
> > a debugger, breaking and resuming execution regularly. In 90% of the
> > cases, you will easily see the bottleneck because the same function
> > comes up again and again. Otherwise you might need more advanced tools.
> > sysprof or kcachegrind under Linux might be useful to get timing
> > profiles.
> > 
> > > Any idea why the gdal_translate might be running faster than my
> > > CreateCopy, even though the parameters are just about identical?
> > > 
> > > Derek
> > > ________________________________________
> > > From: Even Rouault [even.rouault at mines-paris.org]
> > > Sent: Wednesday, July 06, 2011 5:32 PM
> > > To: gdal-dev at lists.osgeo.org
> > > Cc: Cole, Derek
> > > Subject: Re: [gdal-dev] Creating modified copies of a file
> > > 
> > > On Wednesday 06 July 2011 23:18:21, Cole, Derek wrote:
> > > > I have noticed that when attempting to do this, I am able to do a
> > > > gdalinfo on the source and destination files (even if the destination
> > > > file is not complete, it seems the header gets created, and gdalinfo
> > > > recognizes the file).
> > > > 
> > > > Is it possible that this is taking so long because my destination
> > > > file has no geocoords or projection information? It seems like
> > > > createcopy is not replicating all of the variables in the original
> > > > and I am having to manually set some things, such as Block size,
> > > > ABPP, etc etc.
> > > 
> > > This is expected. When doing CreateCopy() the target driver has no idea
> > > what the source driver is, so it will not recognize ABPP. As far as
> > > block sizes are concerned, it is up to the user to decide whether to
> > > keep or change the block dimensions of the source dataset.
> > > 
> > > I'm a bit surprised that CreateCopy() is *that* much slower than cp,
> > > but it is hard to know why without some serious profiling. It might
> > > be related to pixel or band interleaving, etc... There might be cache
> > > thrashing. You could perhaps run with CPL_DEBUG=ON and see if
> > > interesting info comes up.
> > > 
> > > > And the
> > > > coordinates and projection information is blank in the new file. Is
> > > > there perhaps some calculations being done to correct that issue, and
> > > > that's why the copy is taking forever?
> > > 
> > > No, not related at all.
> > > 
> > > > I thought based on the documentation
> > > > CreateCopy() would clone the file's raster, as well as size, type,
> > > > projection, geotransform, etc, but all of this is not showing up in
> > > > the partially created file at least.
> > > 
> > > You cannot reliably trust gdalinfo on a file still being written.
> > > Depending on the driver some information might just be written at the
> > > end of the process. A quick look at the CreateCopy() implementation of
> > > the NITF driver suggests that this is indeed the case since the
> > > geotransform is set after the imagery.
> > > > _______________________________________________
> > > > gdal-dev mailing list
> > > gdal-dev at lists.osgeo.org
> > > http://lists.osgeo.org/mailman/listinfo/gdal-dev
> > 
> > _______________________________________________
> > gdal-dev mailing list
> > gdal-dev at lists.osgeo.org
> > http://lists.osgeo.org/mailman/listinfo/gdal-dev
> > _______________________________________________
> > gdal-dev mailing list
> > gdal-dev at lists.osgeo.org
> > http://lists.osgeo.org/mailman/listinfo/gdal-dev
> > _______________________________________________
> > gdal-dev mailing list
> > gdal-dev at lists.osgeo.org
> > http://lists.osgeo.org/mailman/listinfo/gdal-dev
> 
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/gdal-dev
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/gdal-dev


More information about the gdal-dev mailing list