[gdal-dev] Memory leaks when using SQL queries with shapefiles

Pierluigi Guasqui guasqui at actgate.com
Fri Jul 13 07:22:34 PDT 2012


Even,

I opened a ticket in GDAL Trac: http://trac.osgeo.org/gdal/ticket/4749.

Thank you,
Pierluigi

> On 13/07/2012 12:14, Even Rouault wrote:
>> Quoting Pierluigi Guasqui <guasqui at actgate.com>:
>>
>>> Hello,
>>>
>>> I am seeing a large memory allocation (with no subsequent memory
>>> release) when using SQL queries with shapefiles. The following sample
>>> code triggers the problem:
>>>
>>>       OGRRegisterAll();
>>>
>>>       // opening shapefile data source
>>>       const char *fname = "path/to/shapefile.shp";
>>>       OGRDataSource *poDS;
>>>       poDS = OGRSFDriverRegistrar::Open( fname, FALSE );
>>>
>>>       if( poDS == NULL )
>>>       {
>>>           fprintf( stderr, "%s: could not open input file\n", fname );
>>>           return -1;
>>>       }
>>>
>>>       // creating the spatial filter
>>>       OGRLineString regionPolygon;
>>>       OGRRawPoint *pointsList = new OGRRawPoint[5];
>>>       pointsList[0].x = -180; pointsList[0].y =  90;
>>>       pointsList[1].x =  180; pointsList[1].y =  90;
>>>       pointsList[2].x =  180; pointsList[2].y = -90;
>>>       pointsList[3].x = -180; pointsList[3].y = -90;
>>>       pointsList[4].x = -180; pointsList[4].y =  90;
>>>       regionPolygon.setPoints(5, pointsList);
>>>       delete[] pointsList;
>>>
>>>       // executing the query
>>>       const char *query = "SELECT * FROM table_name WHERE some_field "
>>>                           "LIKE '%bla%' ORDER BY some_other_field";
>>>       OGRLayer *rs = poDS->ExecuteSQL( query, &regionPolygon, NULL );
>>>
>>>       // releasing memory
>>>       poDS->ReleaseResultSet( rs );
>>>       OGRDataSource::DestroyDataSource( poDS );
>>>
>>> With a shapefile of 100,000 records, "ExecuteSQL" allocates over
>>> 300 MB of memory, and none of it is freed when "ReleaseResultSet" is
>>> called.
>>>
>>> The memory leak is triggered only when "ExecuteSQL" is called with a
>>> spatial filter (even a trivial one, as in the code above) and the SQL
>>> query contains both a 'WHERE' clause and an 'ORDER BY' clause. Is this
>>> a bug?
>>>
>>> I am using GDAL 1.9.1 on Windows.
>>>
>>> Any help would be really appreciated!
>> Pierluigi,
>>
>> I do not have a dev environment at the moment to investigate this, but
>> could you open a ticket in GDAL Trac with all the details so it does
>> not get lost?
>>
>> Are you positive that, all other things unchanged, the presence of the
>> spatial filter makes a difference? I'm not sure how you check on
>> Windows whether the memory is really released (if you put your code in
>> a while(1) {} loop, does the memory used by the process increase at
>> each iteration?).
>>
>> I also suspect that your spatial filter will not do what you would
>> (probably) expect from it. The geometry of the filter is a LINESTRING,
>> not a POLYGON. So it will only select geometries that intersect the
>> line, not geometries that are fully inside it.
>>
>> Best regards,
>>
>> Even
>
> Even,
>
> Thank you for looking into my problem and for your reply. Yes, I will
> open a ticket in GDAL Trac.
>
> I tried different test cases, so I am sure that if I leave everything
> else unchanged and just remove the spatial filter, I do not see any
> memory leak. So the spatial filter does seem to make the difference.
> I also tried the SQL query without the "WHERE" clause or without the
> "ORDER BY" clause and, again, I did not see any memory leak. So the
> combination of the "WHERE" clause *and* the "ORDER BY" clause also
> seems to make the difference.
>
> To check on Windows that memory is not released, I put a "Sleep()" call
> at the end of my code so I could inspect the process memory usage in
> Task Manager before the process quits. Following your suggestion, I
> also added a while(true) loop around my previous code:
>
>     OGRRegisterAll();
>     int loop_i = 1;
>     while ( true )
>     {
>         fprintf( stdout, "*** Loop %d... ", loop_i++ );
>         fflush( stdout );
>
>         // opening shapefile data source
>         const char *fname = "path/to/shapefile.shp";
>         OGRDataSource *poDS;
>         poDS = OGRSFDriverRegistrar::Open( fname, FALSE );
>
>         if( poDS == NULL )
>         {
>             fprintf( stderr, "%s: could not open input file\n", fname );
>             return -1;
>         }
>
>         // creating the spatial filter
>         OGRLineString regionPolygon;
>         OGRRawPoint *pointsList = new OGRRawPoint[5];
>         pointsList[0].x = -180; pointsList[0].y =  90;
>         pointsList[1].x =  180; pointsList[1].y =  90;
>         pointsList[2].x =  180; pointsList[2].y = -90;
>         pointsList[3].x = -180; pointsList[3].y = -90;
>         pointsList[4].x = -180; pointsList[4].y =  90;
>         regionPolygon.setPoints(5, pointsList);
>         delete[] pointsList;
>
>         // executing the query
>         const char *query = "SELECT * FROM table_name WHERE some_field "
>                             "LIKE '%bla%' ORDER BY some_other_field";
>         OGRLayer *rs = poDS->ExecuteSQL( query, &regionPolygon, NULL );
>
>         // releasing memory
>         poDS->ReleaseResultSet( rs );
>         OGRDataSource::DestroyDataSource( poDS );
>
>         fprintf( stdout, "done\n" );
>         fflush( stdout );
>     }
>
> and I could get my process to allocate almost 1 GB of memory in just
> 3 loops! So yes, memory keeps being allocated at each iteration without
> being released.
>
> Regarding the use of a LINESTRING filter instead of a POLYGON filter,
> yes, you are right: it will select only features that intersect my
> line, but this is what I expect in the code from which I extracted the
> example above. Thank you for the warning, though!
>
> Thank you!
>
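
The while(true) check quoted above can also be bounded and automated instead of being watched in Task Manager. The following is only a minimal sketch, not GDAL code: `sampleAcrossIterations`, `growsEveryIteration` and the `readMemory` callback are hypothetical names introduced for illustration. On Windows the callback could, for instance, wrap `GetProcessMemoryInfo` and return `WorkingSetSize`; here both the loop body and the memory reader are plain callables so the logic can be tried without GDAL.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Runs `body` a fixed number of times and records the value returned by
// `readMemory` after each run. Both callables are supplied by the caller
// (hypothetical helper, not part of GDAL).
std::vector<std::size_t> sampleAcrossIterations(
    int iterations,
    const std::function<void()>& body,
    const std::function<std::size_t()>& readMemory)
{
    std::vector<std::size_t> samples;
    if (iterations > 0)
        samples.reserve(static_cast<std::size_t>(iterations));
    for (int i = 0; i < iterations; ++i)
    {
        body();
        samples.push_back(readMemory());
    }
    return samples;
}

// Returns true when every sample exceeds the previous one by at least
// `minStep` bytes, which is the pattern a per-iteration leak would show.
bool growsEveryIteration(const std::vector<std::size_t>& samples,
                         std::size_t minStep)
{
    if (samples.size() < 2)
        return false;
    for (std::size_t i = 1; i < samples.size(); ++i)
    {
        if (samples[i] < samples[i - 1] + minStep)
            return false;
    }
    return true;
}
```

In the scenario from this thread, `body` would be the Open / ExecuteSQL / ReleaseResultSet / DestroyDataSource sequence, and `minStep` would be chosen well above normal allocator noise.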

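The LINESTRING versus POLYGON distinction discussed in the quoted messages can be illustrated without OGR. The following is a simplified sketch for point features only: `Pt`, `onRingBoundary`, `insideRing` and `worldRing` are hypothetical helpers, and actual OGR filtering involves more than this (an envelope comparison first, and as far as I understand the exact test depends on whether GEOS is available). A point strictly inside the closed ring does not touch the ring line itself, but it is contained in the polygon bounded by that ring. In OGR, the polygon form would presumably be built from an OGRLinearRing added to an OGRPolygon with addRing().

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { double x; double y; };

// True when p lies on one of the segments of the closed ring
// (the "intersects the line" case for a point feature).
bool onRingBoundary(const std::vector<Pt>& ring, const Pt& p)
{
    const double eps = 1e-9;
    for (std::size_t i = 0; i + 1 < ring.size(); ++i)
    {
        const Pt& a = ring[i];
        const Pt& b = ring[i + 1];
        // The cross product is zero when a, b and p are collinear.
        double cross = (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
        if (std::fabs(cross) > eps)
            continue;
        // The dot product range check keeps p between a and b.
        double dot = (p.x - a.x) * (b.x - a.x) + (p.y - a.y) * (b.y - a.y);
        double len2 = (b.x - a.x) * (b.x - a.x) + (b.y - a.y) * (b.y - a.y);
        if (dot >= -eps && dot <= len2 + eps)
            return true;
    }
    return false;
}

// Ray-casting test: true when p is strictly inside the area bounded by
// the ring (the "inside the polygon" case).
bool insideRing(const std::vector<Pt>& ring, const Pt& p)
{
    if (onRingBoundary(ring, p))
        return false;
    bool inside = false;
    for (std::size_t i = 0; i + 1 < ring.size(); ++i)
    {
        const Pt& a = ring[i];
        const Pt& b = ring[i + 1];
        // Does the segment straddle the horizontal line through p?
        if ((a.y > p.y) != (b.y > p.y))
        {
            double xAt = a.x + (p.y - a.y) * (b.x - a.x) / (b.y - a.y);
            if (p.x < xAt)
                inside = !inside;
        }
    }
    return inside;
}

// The five points used for the filter in the sample code of this thread.
std::vector<Pt> worldRing()
{
    return { {-180.0, 90.0}, {180.0, 90.0}, {180.0, -90.0},
             {-180.0, -90.0}, {-180.0, 90.0} };
}
```

With this ring, a point such as (0, 0) is inside the polygon but not on the line, while (180, 0) lies on the line.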

-- 
Pierluigi Guasqui
Applied Coherent Technology Corp.



