[gdal-dev] Memory leaks when using SQL queries with shapefiles
Pierluigi Guasqui
guasqui at actgate.com
Fri Jul 13 07:22:34 PDT 2012
Even,
I opened a ticket in GDAL Trac: http://trac.osgeo.org/gdal/ticket/4749.
Thank you,
Pierluigi
> On 13/07/2012 12:14, Even Rouault wrote:
>> Quoting Pierluigi Guasqui <guasqui at actgate.com>:
>>
>>> Hello,
>>>
>>> I am seeing a large memory allocation that is never released when
>>> using SQL queries with shapefiles. This sample code triggers the
>>> problem:
>>>
>>> OGRRegisterAll();
>>>
>>> // opening shapefile data source
>>> const char *fname = "path/to/shapefile.shp";
>>> OGRDataSource *poDS;
>>> poDS = OGRSFDriverRegistrar::Open( fname, FALSE );
>>>
>>> if( poDS == NULL )
>>> {
>>> fprintf( stderr, "%s: could not open input file\n", fname );
>>> return -1;
>>> }
>>>
>>> // creating the spatial filter
>>> OGRLineString regionPolygon;
>>> OGRRawPoint *pointsList = new OGRRawPoint[5];
>>> pointsList[0].x = -180; pointsList[0].y = 90;
>>> pointsList[1].x = 180; pointsList[1].y = 90;
>>> pointsList[2].x = 180; pointsList[2].y = -90;
>>> pointsList[3].x = -180; pointsList[3].y = -90;
>>> pointsList[4].x = -180; pointsList[4].y = 90;
>>> regionPolygon.setPoints(5, pointsList);
>>> delete[] pointsList;
>>>
>>> // executing the query
>>> const char *query = "SELECT * FROM table_name WHERE some_field "
>>> "LIKE '%bla%' ORDER BY some_other_field";
>>> OGRLayer *rs = poDS->ExecuteSQL( query, &regionPolygon, NULL );
>>>
>>> // releasing memory
>>> poDS->ReleaseResultSet( rs );
>>> OGRDataSource::DestroyDataSource( poDS );
>>>
>>> With a 100,000-record shapefile, "ExecuteSQL" allocates over 300
>>> MB of memory, and none of it is freed when "ReleaseResultSet" is
>>> called.
>>>
>>> The memory leak is triggered only when "ExecuteSQL" is called with a
>>> spatial filter (even a trivial one, as in the code above) and the SQL
>>> query contains both a 'WHERE' clause and an 'ORDER BY' clause. Is
>>> this a bug?
>>>
>>> I am using GDAL 1.9.1 on Windows.
>>>
>>> Any help would be really appreciated!
>> Pierluigi,
>>
>> I do not have a dev environment at the moment to investigate this,
>> but could you open a ticket in GDAL Trac with all the details so it
>> does not get lost?
>>
>> Are you positive that, all other things unchanged, the presence of
>> the spatial filter makes a difference? I'm not sure how you check on
>> Windows whether the memory is really released (if you put your code
>> in a while(1) {} loop, does the memory used by the process increase
>> at each iteration?).
>>
>> I also suspect that your spatial filter will not do what you would
>> (probably) expect from it. The geometry of the filter is a
>> LINESTRING, not a POLYGON. So it will only select geometries that
>> intersect the line, not geometries that are fully inside it.
>>
>> Best regards,
>>
>> Even
>
> Even,
>
> thank you for looking into my problem and for your reply. Yes I will
> open a ticket in GDAL Trac.
>
> I tried different test cases, so I am sure that if I leave all other
> things unchanged and just remove the spatial filter, I do not see
> any memory leak. So the spatial filter does seem to make the
> difference. I also tried the SQL query without the "WHERE" clause or
> without the "ORDER BY" clause and, again, I did not see any memory
> leak. So the combination of the "WHERE" clause *and* the "ORDER BY"
> clause also seems to make the difference.
>
> To check on Windows that memory is not released, I put a "Sleep()"
> call at the end of my code so that I could inspect the process memory
> allocation in Task Manager before the process quits. I have now also
> added a while(true) loop around my previous code:
>
> OGRRegisterAll();
> int loop_i = 1;
> while ( true )
> {
> fprintf( stdout, "*** Loop %d... ", loop_i++ );
> fflush( stdout );
>
> // opening shapefile data source
> const char *fname = "path/to/shapefile.shp";
> OGRDataSource *poDS;
> poDS = OGRSFDriverRegistrar::Open( fname, FALSE );
>
> if( poDS == NULL )
> {
> fprintf( stderr, "%s: could not open input file\n", fname );
> return -1;
> }
>
> // creating the spatial filter
> OGRLineString regionPolygon;
> OGRRawPoint *pointsList = new OGRRawPoint[5];
> pointsList[0].x = -180; pointsList[0].y = 90;
> pointsList[1].x = 180; pointsList[1].y = 90;
> pointsList[2].x = 180; pointsList[2].y = -90;
> pointsList[3].x = -180; pointsList[3].y = -90;
> pointsList[4].x = -180; pointsList[4].y = 90;
> regionPolygon.setPoints(5, pointsList);
> delete[] pointsList;
>
> // executing the query
> const char *query = "SELECT * FROM table_name WHERE some_field "
> "LIKE '%bla%' ORDER BY some_other_field";
> OGRLayer *rs = poDS->ExecuteSQL( query, &regionPolygon, NULL );
>
> // releasing memory
> poDS->ReleaseResultSet( rs );
> OGRDataSource::DestroyDataSource( poDS );
>
> fprintf( stdout, "done\n" );
> fflush( stdout );
> }
>
> and my process allocated almost 1 GB of memory in just 3 loops! So
> yes, memory keeps being allocated at each iteration without being
> released.
>
> Regarding the use of a LINESTRING filter instead of a POLYGON filter,
> yes, you are right: it will only match features that intersect my
> line, but this is what I expect in the code from which I extracted
> the example above. Thank you for the warning, though!
>
> Thank you!
>
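To make the LINESTRING versus POLYGON point above concrete, here is a
minimal sketch in plain C++ with no GDAL dependency. It is only a toy:
the names Pt, touchesLine, insidePolygon and checkWorldBox are made up
for this illustration, and it does not claim to show how OGR evaluates
spatial filters internally. It only shows that a point in the interior
of the closed ring does not touch the ring when the ring is treated as
a line, while it is inside when the same ring bounds a polygon.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type for this sketch (not an OGR class).
struct Pt { double x, y; };

// True if p lies exactly on the segment a-b.
static bool onSegment(Pt a, Pt b, Pt p) {
    double cross = (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
    if (cross != 0.0) return false;  // not collinear with a-b
    return p.x >= std::min(a.x, b.x) && p.x <= std::max(a.x, b.x) &&
           p.y >= std::min(a.y, b.y) && p.y <= std::max(a.y, b.y);
}

// "Line" semantics: p matches only if it lies on one of the segments.
static bool touchesLine(const std::vector<Pt>& path, Pt p) {
    for (std::size_t i = 0; i + 1 < path.size(); ++i)
        if (onSegment(path[i], path[i + 1], p)) return true;
    return false;
}

// "Polygon" semantics: p matches if it is on the ring or in its interior
// (even-odd ray casting). The ring is expected to be closed.
static bool insidePolygon(const std::vector<Pt>& ring, Pt p) {
    if (touchesLine(ring, p)) return true;
    bool inside = false;
    for (std::size_t i = 0, j = ring.size() - 1; i < ring.size(); j = i++) {
        if ((ring[i].y > p.y) != (ring[j].y > p.y)) {
            double xCross = ring[j].x + (p.y - ring[j].y) *
                            (ring[i].x - ring[j].x) / (ring[i].y - ring[j].y);
            if (p.x < xCross) inside = !inside;
        }
    }
    return inside;
}

// Same five vertices as the filter in the sample code above.
static void checkWorldBox() {
    std::vector<Pt> ring = {
        {-180, 90}, {180, 90}, {180, -90}, {-180, -90}, {-180, 90}};
    Pt interior = {10, 20};
    assert(!touchesLine(ring, interior));   // the line misses interior points
    assert(insidePolygon(ring, interior));  // the polygon contains them
}
```

With the real API, the polygon variant would presumably be built from
an OGRLinearRing added to an OGRPolygon rather than from an
OGRLineString. That part is left out here because it cannot be
compiled without GDAL.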
--
Pierluigi Guasqui
Applied Coherent Technology Corp.