[gdal-dev] Memory leaks when using SQL queries with shapefiles
Pierluigi Guasqui
guasqui at actgate.com
Fri Jul 13 04:22:52 PDT 2012
On 13/07/2012 12:14, Even Rouault wrote:
> Quoting Pierluigi Guasqui <guasqui at actgate.com>:
>
>> Hello,
>>
>> I am experiencing a big memory allocation (with no subsequent memory
>> release) when using SQL queries with shapefiles. This is a sample code
>> that triggers this problem:
>>
>> OGRRegisterAll();
>>
>> // opening shapefile data source
>> const char *fname = "path/to/shapefile.shp";
>> OGRDataSource *poDS;
>> poDS = OGRSFDriverRegistrar::Open( fname, FALSE );
>>
>> if( poDS == NULL )
>> {
>> fprintf( stderr, "%s: could not open input file\n", fname );
>> return -1;
>> }
>>
>> // creating the spatial filter
>> OGRLineString regionPolygon;
>> OGRRawPoint *pointsList = new OGRRawPoint[5];
>> pointsList[0].x = -180; pointsList[0].y = 90;
>> pointsList[1].x = 180; pointsList[1].y = 90;
>> pointsList[2].x = 180; pointsList[2].y = -90;
>> pointsList[3].x = -180; pointsList[3].y = -90;
>> pointsList[4].x = -180; pointsList[4].y = 90;
>> regionPolygon.setPoints(5, pointsList);
>> delete[] pointsList;
>>
>> // executing the query
>> const char *query = "SELECT * FROM table_name WHERE some_field LIKE "
>>                     "'%bla%' ORDER BY some_other_field";
>> OGRLayer *rs = poDS->ExecuteSQL( query, &regionPolygon, NULL );
>>
>> // releasing memory
>> poDS-> ReleaseResultSet( rs );
>> OGRDataSource::DestroyDataSource( poDS );
>>
>> With a shapefile of 100,000 records, "ExecuteSQL" allocated over 300
>> MB of memory, and none of it was released by "ReleaseResultSet".
>>
>> The memory leak is triggered only when "ExecuteSQL" is called with a
>> spatial filter (even trivial as shown in the above code) and the SQL
>> query contains a 'WHERE' clause and an 'ORDER BY' clause. Is this a bug?
>>
>> I am using GDAL 1.9.1 under Windows platform.
>>
>> Any help would be really appreciated!
> Pierluigi,
>
> I do not have a dev environment at the moment to investigate that, but could
> you open a ticket in GDAL Trac with all the details so it does not get lost?
>
> Are you positive that, all other things unchanged, the presence of the spatial
> filter makes a difference ? I'm not sure how you check on Windows if the memory
> is really released (if you put your code in a while(1) {} loop, does the memory
> used by the process increase at each iteration ?).
>
> I also suspect that your spatial filter will not do what you would (probably)
> expect from it. The geometry of the filter is a LINESTRING, not a POLYGON. So it
> will only filter geometries that intersect the line, not geometries that are
> fully inside it.
>
> Best regards,
>
> Even
Even,
thank you for looking into my problem and for your reply. Yes, I will
open a ticket in GDAL Trac.
I tried different test cases, so I am sure that if I leave all other
things unchanged and just remove the spatial filter, I do not see any
memory leak. So the use of the spatial filter does make a difference.
I also tried the SQL query without the "WHERE" clause or without the
"ORDER BY" clause and, again, I did not see any memory leak. So the
combination of a "WHERE" clause *and* an "ORDER BY" clause also seems
to make a difference.
To check on Windows that memory is not released, I put a "Sleep()" call
at the end of my code so that I could check the process memory
allocation with Task Manager before the process quits. I have now also
added a while(true) loop around my previous code:
    OGRRegisterAll();
    int loop_i = 1;
    while ( true )
    {
        fprintf( stdout, "*** Loop %d... ", loop_i++ );
        fflush( stdout );

        // opening shapefile data source
        const char *fname = "path/to/shapefile.shp";
        OGRDataSource *poDS;
        poDS = OGRSFDriverRegistrar::Open( fname, FALSE );
        if( poDS == NULL )
        {
            fprintf( stderr, "%s: could not open input file\n", fname );
            return -1;
        }

        // creating the spatial filter
        OGRLineString regionPolygon;
        OGRRawPoint *pointsList = new OGRRawPoint[5];
        pointsList[0].x = -180; pointsList[0].y = 90;
        pointsList[1].x = 180; pointsList[1].y = 90;
        pointsList[2].x = 180; pointsList[2].y = -90;
        pointsList[3].x = -180; pointsList[3].y = -90;
        pointsList[4].x = -180; pointsList[4].y = 90;
        regionPolygon.setPoints(5, pointsList);
        delete[] pointsList;

        // executing the query
        const char *query = "SELECT * FROM table_name WHERE some_field "
                            "LIKE '%bla%' ORDER BY some_other_field";
        OGRLayer *rs = poDS->ExecuteSQL( query, &regionPolygon, NULL );

        // releasing memory
        poDS->ReleaseResultSet( rs );
        OGRDataSource::DestroyDataSource( poDS );

        fprintf( stdout, "done\n" );
        fflush( stdout );
    }
and I could get my process to allocate almost 1 GB of memory in just
3 loops! So yes, memory keeps being allocated at each iteration without
being released.
Regarding the use of a LINESTRING filter instead of a POLYGON filter,
yes, you are right: it will only select features that intersect my line,
but this is what I expect in the code from which I extracted the
example above. Thank you for the warning, though!
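
To make the line-versus-polygon distinction concrete, here is a minimal
standalone C++ sketch. It does not use OGR at all: Pt, onSegment,
touchesLine, insideOrOnPolygon and worldRing are hypothetical names made
up for this illustration, and single points stand in for features. What
OGR actually does may also depend on envelope tests and on whether GDAL
was built with GEOS, which this sketch does not model.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper type for illustration; this is not an OGR class.
struct Pt { double x, y; };

// True if p lies on the segment a-b (within a small tolerance).
bool onSegment(Pt p, Pt a, Pt b) {
    const double eps = 1e-9;
    double cross = (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
    if (std::fabs(cross) > eps) return false;   // not collinear with a-b
    double dot  = (p.x - a.x) * (b.x - a.x) + (p.y - a.y) * (b.y - a.y);
    double len2 = (b.x - a.x) * (b.x - a.x) + (b.y - a.y) * (b.y - a.y);
    return dot >= -eps && dot <= len2 + eps;    // between a and b
}

// Line semantics: the point matches only if it touches the line itself.
bool touchesLine(Pt p, const std::vector<Pt>& ring) {
    assert(ring.size() >= 2);
    for (std::size_t i = 0; i + 1 < ring.size(); ++i)
        if (onSegment(p, ring[i], ring[i + 1])) return true;
    return false;
}

// Polygon semantics: the point matches if it is on the boundary or in
// the interior (ray casting). The ring must be closed (first == last).
bool insideOrOnPolygon(Pt p, const std::vector<Pt>& ring) {
    assert(ring.size() >= 4);
    if (touchesLine(p, ring)) return true;
    bool inside = false;
    for (std::size_t i = 0; i + 1 < ring.size(); ++i) {
        Pt a = ring[i], b = ring[i + 1];
        if ((a.y > p.y) != (b.y > p.y)) {
            double xCross = a.x + (p.y - a.y) * (b.x - a.x) / (b.y - a.y);
            if (p.x < xCross) inside = !inside;
        }
    }
    return inside;
}

// Same five vertices as the filter in the sample code above.
std::vector<Pt> worldRing() {
    return { {-180.0, 90.0}, {180.0, 90.0}, {180.0, -90.0},
             {-180.0, -90.0}, {-180.0, 90.0} };
}
```

With these helpers, a point at (0, 0) is matched by insideOrOnPolygon
but not by touchesLine, while a point at (180, 0) on the right-hand edge
is matched by both.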
Thank you!
--
Pierluigi Guasqui
Applied Coherent Technology Corp.