[gdal-dev] Creating modified copies of a file

Cole, Derek dcole at integrity-apps.com
Thu Jul 7 18:25:04 EDT 2011


Even, thanks for the help by the way. Hopefully I can return the favor some day, as it looks like I am going to be getting quite familiar with this.

I think the caching value was perhaps the issue, and I just happened to get it working correctly in the C version. Oh well, at least if someone finds this thread, they can see both ways of doing it.

Derek
________________________________________
From: gdal-dev-bounces at lists.osgeo.org [gdal-dev-bounces at lists.osgeo.org] on behalf of Cole, Derek [dcole at integrity-apps.com]
Sent: Thursday, July 07, 2011 6:20 PM
To: gdal-dev at lists.osgeo.org
Subject: RE: [gdal-dev] Creating modified copies of a file

C:

        GDALDatasetH    hDataset, hOutDS;
        const char              *pszSource=NULL, *pszDest=NULL, *pszFormat = "NITF";
        char                **papszCreateOptions = NULL;
        GDALProgressFunc    pfnProgress = GDALTermProgress;
        GDALDriverH hDriver;

        hDriver = GDALGetDriverByName( pszFormat );
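        /* Keep the std::string alive; c_str() on a temporary would leave pszSource dangling. */
        std::string osSource = this->loadedFile.toStdString();
        pszSource = osSource.c_str();
        pszDest = "testout.ntf";
        hDataset = GDALOpenShared( pszSource, GA_ReadOnly );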

    char **papszOptions = NULL;

    papszOptions = CSLSetNameValue( papszOptions, "BLOCKXSIZE", "1024" );
    papszOptions = CSLSetNameValue( papszOptions, "BLOCKYSIZE", "1024" );
    papszOptions = CSLSetNameValue( papszOptions, "ABPP", "11" );

    hOutDS = GDALCreateCopy( hDriver, pszDest, hDataset, FALSE, papszOptions, pfnProgress, NULL );

C++:

 GDALProgressFunc    pfnProgress = GDALTermProgress;
    const char *pszFormat = "NITF";
    GDALDataset *poNITFDS;
    char ** papszMetadata;
    GDALDriver *poNITFDriver;
    GDALDataset  *poDataset;
    const char *pszDstFilename = "test1.ntf";

    poDataset = (GDALDataset *) GDALOpenShared( this->loadedFile.toStdString().c_str(), GA_ReadOnly );

    poNITFDriver = (GDALDriver *)GDALGetDriverByName("NITF");

    // Check the driver before dereferencing it.
    if( poNITFDriver == NULL )
        exit( 1 );

    papszMetadata = poNITFDriver->GetMetadata();

    char **papszOptions = NULL;

    const char *srcProjection = poDataset->GetGCPProjection();

    papszOptions = CSLSetNameValue( papszOptions, "BLOCKXSIZE", "1024" );
    papszOptions = CSLSetNameValue( papszOptions, "BLOCKYSIZE", "1024" );
    papszOptions = CSLSetNameValue( papszOptions, "ABPP", "11" );

    poNITFDS = poNITFDriver->CreateCopy(pszDstFilename, poDataset, FALSE,  papszOptions, pfnProgress, NULL);


Perhaps you can tell me. These are the direct copies of the two latest versions of the code I tried. It could very well be that if I tried the C++ version again with the cache storage values corrected, it would speed up. I may try that just to make sure.

Also, you may be right about the cache options. I noticed just now in the API declaration, it calls the variable nBytes, but down in the API it just calls it nNewSize, and I think I had read somewhere that the command line option was in MB, so that may be my bad. Is my assertion that it must come before registering drivers correct?

________________________________________
From: Even Rouault [even.rouault at mines-paris.org]
Sent: Thursday, July 07, 2011 6:07 PM
To: gdal-dev at lists.osgeo.org
Cc: Cole, Derek
Subject: Re: [gdal-dev] Creating modified copies of a file

On Thursday 07 July 2011 23:46:11, Cole, Derek wrote:
> So that this may help anyone in the future - it seems like I may have
> solved the problem for now! After looking through gdal_translate, it
> appears that they are using the C API instead of C++ for the CreateCopy. I
> switched my code to the C API instead of C++ just for consistency's sake,
> and sure enough, my CreateCopy was sped up tremendously!

Really ??? That doesn't make any sense. The C function GDALCreateCopy() is
just a one line wrapper over the C++ API. I'm 200% sure there must be another
difference in the values of the arguments between your C and C++ call. You
should recheck...
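
To illustrate why a thin C wrapper cannot change the speed, here is a toy sketch of the pattern. ToyDriver, ToyDriverH and ToyCreateCopy are made-up names for illustration only and are not part of GDAL; the sketch only assumes what is said above, namely that the C entry point forwards to the C++ method in one line.

```cpp
#include <cstring>

// Toy stand-in for a C++ driver class (hypothetical, not GDAL's GDALDriver).
class ToyDriver
{
public:
    int nCalls = 0;  // counts how often the C++ method actually ran

    // Pretend copy: returns the length of the source name as a fake result.
    int CreateCopy(const char* pszDest, const char* pszSrc)
    {
        (void)pszDest;  // unused in this toy
        ++nCalls;
        return static_cast<int>(std::strlen(pszSrc));
    }
};

// Opaque handle, the way a C API typically exposes a C++ object.
typedef void* ToyDriverH;

// C-style entry point: a one-line forward to the C++ method.
int ToyCreateCopy(ToyDriverH hDriver, const char* pszDest, const char* pszSrc)
{
    return static_cast<ToyDriver*>(hDriver)->CreateCopy(pszDest, pszSrc);
}
```

Both routes end up in the same method with the same arguments, so any timing difference has to come from what is passed in or from global state (such as the cache size), not from the choice of API.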

>
> For the smaller test file I was playing with, the output file was not
> viewable in my viewer until I made some adjustments, primarily, setting
> the block size of the copy to be the same as the original. If you don't set
> it explicitly you get row-wise blocks.

Yes, row-wise blocks are the default. Tiling must be explicitly set up with
the BLOCKXSIZE and BLOCKYSIZE creation options.
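
As a rough worked example of what the two layouts mean for the 8689 x 7679 image discussed in this thread, the sketch below counts blocks and the bytes they occupy once edge blocks are padded to full size. CountBlocks and PaddedBandBytes are hypothetical helpers, not GDAL functions. It assumes a single band stored at 1 byte per sample and a row-wise block of one full image row; neither is stated in the thread, so treat the numbers as illustrative.

```cpp
#include <cstdint>

// Number of blocks needed to cover one band of nXSize x nYSize pixels.
// Partial blocks at the right and bottom edges count as whole blocks.
std::int64_t CountBlocks(int nXSize, int nYSize, int nBlockXSize, int nBlockYSize)
{
    const std::int64_t nBlocksX = (nXSize + nBlockXSize - 1) / nBlockXSize;
    const std::int64_t nBlocksY = (nYSize + nBlockYSize - 1) / nBlockYSize;
    return nBlocksX * nBlocksY;
}

// Bytes occupied by the padded blocks of one band (file header excluded).
std::int64_t PaddedBandBytes(int nXSize, int nYSize,
                             int nBlockXSize, int nBlockYSize,
                             int nBytesPerSample)
{
    return CountBlocks(nXSize, nYSize, nBlockXSize, nBlockYSize)
           * nBlockXSize * nBlockYSize * nBytesPerSample;
}
```

Under those assumptions, 1024 x 1024 tiling gives 9 x 8 = 72 blocks and 75,497,472 bytes (72.0 MiB), while row-wise blocks of 8689 x 1 give 7679 blocks and 66,722,831 bytes (about 63.6 MiB). That would be consistent with the 72.0 MB and 63.6 MB file sizes reported further down in the thread, without any compression being involved.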

>
> One other quirk - I was not able to use the GDALSetCacheMax() function,

What do you mean by "not being able to use the GDALSetCacheMax()"? Are you
sure you specified the value in bytes, as specified in the doc? I suspect you
must have set a value in megabytes, because when the GDAL_CACHEMAX
configuration option is read, small values (< 100000) are automatically
treated as megabytes and multiplied by one million. That could explain why
you manage to achieve the desired effect by using CPLSetConfigOption().
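
The unit mismatch can be sketched as follows. CacheMaxOptionToBytes is a hypothetical helper, not a GDAL function; it only models the rule described above for the GDAL_CACHEMAX option. The factor 1024 * 1024 is an assumption (the message says "one million"), so check the GDAL source for the exact multiplier.

```cpp
#include <cstdint>
#include <cstdlib>

// Hypothetical model of how a GDAL_CACHEMAX-style string becomes bytes:
// small numbers are taken as megabytes, large numbers as bytes.
std::int64_t CacheMaxOptionToBytes(const char* pszValue)
{
    const std::int64_t kSmallValueLimit = 100000;        // threshold quoted in the thread
    const std::int64_t kBytesPerMegabyte = 1024 * 1024;  // assumed factor, see lead-in

    std::int64_t nValue = std::strtoll(pszValue, nullptr, 10);
    if (nValue < kSmallValueLimit)
        nValue *= kBytesPerMegabyte;  // "512" is read as 512 MB
    return nValue;
}
```

With this rule, the option string "512" and the string "536870912" request the same cache size. GDALSetCacheMax() has no such conversion, so passing 512 to it would ask for a 512-byte cache, which would be one plausible cause of the thrashing warnings.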

> and instead switched that to the CPLSetConfigOption() function, AND I had
> to move it to be before I call GDALAllRegister() it appears, or I was
> still getting cache thrashing warnings.
>
> If I get some time, or if someone else does, it might be nice to narrow
> down what exactly in the C++ API was causing the problem.
>
> Derek
> ________________________________________
> From: gdal-dev-bounces at lists.osgeo.org [gdal-dev-bounces at lists.osgeo.org] on behalf of Cole, Derek [dcole at integrity-apps.com]
> Sent: Thursday, July 07, 2011 3:20 PM
> To: gdal-dev at lists.osgeo.org
> Subject: RE: [gdal-dev] Creating modified copies of a file
>
> I actually just out of curiosity tried to open the file given by this
> command:
>
> gdal_translate dan_data.ntf dan_data_translate.ntf
>
> And it turns out that dan_data_translate.ntf was actually a TIFF file that
> I could view. I had to explicitly add the option -of NITF
>
> in which case it still did the translate very fast, but I did get warnings:
>
> $ gdal_translate dan_data.ntf dan_data_translate.ntf -of NITF
> Input file size is 8689, 7679
> Warning 6: NITF only supports WGS84 geographic and UTM projections.
>
> 0...10...20...30...40...50...60...70...80...90...100 - done.
> ERROR 6: Apparently no space reserved for IGEOLO info in NITF file.
> NITFWriteIGEOGLO() fails.
> ERROR 6: NITF only supports WGS84 geographic and UTM projections.
>
>
> Which jibes with what my code was giving me, and further,
> dan_data_translate.ntf is not able to be opened using my viewer (which
> uses GDAL to open the files I have been working with, of course, and does
> open dan_data.ntf) because of a segmentation fault in the RasterIO
> method. I am able to open the gdal_translate file, and my CreateCopy()
> file in another ELT, such as ENVI.
>
> I will also make note that the original file is exactly 72.0 MB, and the
> output file is 63.6MB, but both files say in gdalinfo they are not using
> compression. This is the same for my CreateCopy() or gdal_translate. So at
> least I have narrowed down that I am apparently getting the same
> end-result as gdal_translate, albeit way slower. Finally, the output file
> seems to have created row-blocks, instead of 1024x1024 blocks like the
> original.
>
> I will continue working to replicate gdal_translate exactly, even though
> now I am not optimistic about getting a good result.
>
> ________________________________________
> From: gdal-dev-bounces at lists.osgeo.org [gdal-dev-bounces at lists.osgeo.org] on behalf of Cole, Derek [dcole at integrity-apps.com]
> Sent: Thursday, July 07, 2011 2:48 PM
> To: gdal-dev at lists.osgeo.org
> Subject: RE: [gdal-dev] Creating modified copies of a file
>
> Yes, the source data is a NITF. Here are the first few lines of the
> projection info in that source NITF:
>
> Coordinate System is:
> PROJCS["unnamed",
>     GEOGCS["WGS 84",
>         DATUM["WGS_1984",
>             SPHEROID["WGS 84",6378137,298.257223563,
>                 AUTHORITY["EPSG","7030"]],
>             TOWGS84[0,0,0,0,0,0,0],
>             AUTHORITY["EPSG","6326"]],
>         PRIMEM["Greenwich",0,
>             AUTHORITY["EPSG","8901"]],
>         UNIT["degree",0.0174532925199433,
>             AUTHORITY["EPSG","9108"]],
>         AUTHORITY["EPSG","4326"]],
>     PROJECTION["Transverse_Mercator"],
>     PARAMETER["latitude_of_origin",0],
>     PARAMETER["central_meridian",-75],
>     PARAMETER["scale_factor",0.996],
>     PARAMETER["false_easting",500000],
>     PARAMETER["false_northing",0]]
>
>
> I suppose I can go back through my code and change it to be exactly like
> gdal_translate, which seems to use the C bindings instead of C++.  I did
> end up changing my progress reporting function to the same one that is in
> gdal_translate as well, which had no change. I will try the debugging
> route before too much trouble, to see what I can find out.
>
> Thanks for the help so far. I will report back what I find.
>
>
>
> ________________________________________
> From: Even Rouault [even.rouault at mines-paris.org]
> Sent: Thursday, July 07, 2011 2:13 PM
> To: gdal-dev at lists.osgeo.org
> Cc: Cole, Derek
> Subject: Re: [gdal-dev] Creating modified copies of a file
>
> On Thursday 07 July 2011 19:52:34, Cole, Derek wrote:
> > Well, I am about to give up on this I think and go a different route for
> > creating a NITF from this data.
> >
> > I tried a small test file, and it takes about 45s to CreateCopy() an 80 MB
> > NITF.
>
> Yes, that's very poor performance !
>
> > here is the output I got after turning on CPL_DEBUG
> >
> > GDAL: GDALOpen(/home/dcole/eraser/Images/dan_data.ntf, this=0x1ad7ec70) succeeds as NITF.
> > GDAL: QuietDelete(test1.ntf) invoking Delete()
> > GDAL: GDALOpen(test1.ntf, this=0x1ad81180) succeeds as NITF.
> > GDAL: GDALDefaultOverviews::OverviewScan()
> > GDAL: GDALClose(test1.ntf, this=0x1ad81180)
> > Warning 6: NITF only supports WGS84 geographic and UTM projections.
> >
> > 0GDAL: Potential thrashing on band 1 of
> > /home/dcole/eraser/Images/dan_data.ntf.
> > ...10...20...30...40...50...60...70...80...90...100 - done.
> > ERROR 6: Apparently no space reserved for IGEOLO info in NITF file.
> > NITFWriteIGEOGLO() fails.
> > ERROR 6: NITF only supports WGS84 geographic and UTM projections.
> >
> > GDAL: GDALDefaultOverviews::OverviewScan()
> > ERROR 6: NITF only supports WGS84 geographic and UTM projections.
>
> This error seems to indicate that your source dataset has a projection
> incompatible with what NITF allows (UTM/WGS84 and LongLat/WGS84). But this is
> weird because your source dataset seems to be a NITF itself... What does
> gdalinfo return as the projection for /home/dcole/eraser/Images/dan_data.ntf?
>
> But I don't believe it is related to performance. And I suppose you get the
> same warnings when using gdal_translate.
>
> > So to try to remedy this I changed the cachemax (programmatically)
> > anywhere from 40MB to 3GB trying to see if any setting had any effect.
> > It was about the same no matter what.
>
> For an 80 MB file, playing with cachemax won't give any significant speed-up.
>
> > I tried doing a gdal_translate on this file, which only took about a
> > second.
>
> That's much more reasonable. Well, then that shows it's not an issue in
> GDAL per se, but how you use it. Now, you "just" have to discover what
> things you do which are different from gdal_translate.
>
> > Looking through the source for that, it seems like I am doing
> > approximately the same thing in relation to the CreateCopy(), so I am not
> > sure what the problem is. However, my viewer which uses GDAL to read the
> > original NITF file I am reading in will NOT read the translated NITF file
> > that is generated.
>
> I've just reviewed the code snippet you pasted yesterday and I see that you
> use a custom progress function. Isn't there a sleep() or some slow
> operation in it?
>
> For debugging performance issues, I start by just running the code under a
> debugger, breaking and resuming execution regularly. In 90% of cases,
> you will easily see the bottleneck because the same function comes up again
> and again. Otherwise you might need more advanced tools. sysprof or
> kcachegrind under Linux might be useful to get timing profiles.
>
> > Any idea why the gdal_translate might be running faster than my
> > CreateCopy, even though the parameters are just about identical?
> >
> > Derek
> > ________________________________________
> > From: Even Rouault [even.rouault at mines-paris.org]
> > Sent: Wednesday, July 06, 2011 5:32 PM
> > To: gdal-dev at lists.osgeo.org
> > Cc: Cole, Derek
> > Subject: Re: [gdal-dev] Creating modified copies of a file
> >
> > On Wednesday 06 July 2011 23:18:21, Cole, Derek wrote:
> > > I have noticed that when attempting to do this, I am able to do a
> > > gdalinfo on the source and destination files (even if the destination
> > > file is not complete, it seems the header gets created, and gdalinfo
> > > recognizes the file).
> > >
> > > Is it possible that this is taking so long because my destination file
> > > has no geocoords or projection information? It seems like CreateCopy() is
> > > not replicating all of the variables in the original and I am having to
> > > manually set some things, such as block size, ABPP, etc.
> >
> > This is expected. When doing CreateCopy() the target driver has no idea
> > what the source driver is, so it will not recognize ABPP. As far as block
> > sizes are concerned, it is up to the user to decide whether to keep
> > or change the block dimensions of the source dataset.
> >
> > I'm a bit surprised that CreateCopy() is *that* much slower than cp, but
> > it is hard to know why without the hard work of profiling. It might be
> > related to pixel or band interleaving, etc... There might be cache
> > thrashing. You could perhaps run with CPL_DEBUG=ON and see if interesting
> > info comes up.
> >
> > > And the
> > > coordinates and projection information are blank in the new file. Is
> > > there perhaps some calculation being done to correct that issue, and
> > > that's why the copy is taking forever?
> >
> > No, not related at all.
> >
> > > I thought based on the documentation
> > > CreateCopy() would clone the file's raster, as well as size, type,
> > > projection, geotransform, etc, but all of this is not showing up in the
> > > partially created file at least.
> >
> > You cannot reliably trust gdalinfo on a file still being written.
> > Depending on the driver some information might just be written at the
> > end of the process. A quick look at the CreateCopy() implementation of
> > the NITF driver suggests that this is indeed the case since the
> > geotransform is set after the imagery.
> > _______________________________________________
> > gdal-dev mailing list
> > gdal-dev at lists.osgeo.org
> > http://lists.osgeo.org/mailman/listinfo/gdal-dev

