[gdal-dev] With Python bindings, should geometry objects created by ogr.Geometry() be explicitly destroyed?

Jason Roberts jason.roberts at duke.edu
Mon Sep 20 15:05:52 EDT 2010


Hi Even and Chris,

Thanks for the review and edits. Feel free to edit as needed to reflect the facts as seen by yourself and the GDAL team.

I replaced the reference to "C++" with "C" for numpy. Hopefully it reflects reality now. I work mostly at the Python level these days and sometimes gloss over the detail of whether the underlying stuff is C or C++. :-)

I am aware of the problem with numpy 1.4.1, where they introduced some advanced date/time support but it broke the ABI so they retracted the change and deferred it (I think to numpy 2.0). I found that 1.3.0 is not forward compatible with extension modules compiled against 1.4.1, at least on win32. I do not recall the exact problem--just diffed the 1.3.0 headers with 1.4.1, but could not immediately spot it. Unfortunately this caused a problem for my application. My app integrates with ArcGIS, and ArcGIS 10 shipped with numpy 1.3.0. I like to keep up with numpy versions but found that I could not compile the GDAL bindings against 1.4.1 and have them work on 1.3.0.

Of course, the numpy team may not prioritize this kind of compatibility; perhaps if I had compiled GDAL's bindings against 1.3.0 they would have worked fine with 1.4.1. If you know the details and care to update the page, please go for it...

Thanks again,

Jason

-----Original Message-----
From: Christopher Barker [mailto:Chris.Barker at noaa.gov] 
Sent: Monday, September 20, 2010 2:32 PM
To: Even Rouault
Cc: Jason Roberts; gdal-dev at lists.osgeo.org
Subject: Re: [gdal-dev] With Python bindings, should geometry objects created by ogr.Geometry() be explicitly destroyed?

Even Rouault wrote:
> The part where I'm not sure if about the Numpy ABI problems. I wasn't very 
> aware of problems related to it. From what I've seen in the code, it looks 
> like that the python bindings only use the C API/ABI of Numpy.

yup -- there is no C++ in numpy.

 > IMHO. However, I haven't changed anything in this section as I'm not
> 100% sure of my analysis.

aside from C rather than C++, it's all correct.

However, the numpy developers try to keep the ABI compatible as much as 
possible. A little while back a numpy version (I think 1.4.0) was 
released with binary incompatibility, somewhat by accident - it ended up 
getting retracted and an ABI compatible version released in its place. 
However, those folks that re-built something right in those few weeks 
hit the problem.

If you look at the sourceforge download site, you'll see a 1.4.1, but no 
1.4.0, for that reason.

So this is a potential issue, but not a common one.

I'm pretty sure that 1.5 is ABI backward compatible with 1.4.1 and 1.3.* 
at least, and maybe back a lot further.

2.0 may be incompatible, that's why it's called 2.0

-Chris





> Best regards,
> 
> Even
> 
> Le lundi 20 septembre 2010 18:29:26, Jason Roberts a écrit :
>> Even,
>>
>> Thank you very much for your comments. Based on them, I started a new
>> "Python Gotchas" page here http://trac.osgeo.org/gdal/wiki/PythonGotchas,
>> referenced from http://trac.osgeo.org/gdal/wiki/GdalOgrInPython. Please
>> review it to make sure I got everything right.
>>
>> Best regards,
>>
>> Jason
>>
>> -----Original Message-----
>> From: Even Rouault [mailto:even.rouault at mines-paris.org]
>> Sent: Saturday, September 18, 2010 5:52 AM
>> To: gdal-dev at lists.osgeo.org
>> Cc: Jason Roberts
>> Subject: Re: [gdal-dev] With Python bindings, should geometry objects
>> created by ogr.Geometry() be explicitly destroyed?
>>
>> Jason,
>>
>>> I have some Python code that uses OGR geometry objects internally,
>>> creating them like this:
>>>
>>>
>>>
>>> point = ogr.Geometry(ogr.wkbPoint)
>>>
>>>
>>>
>>> Does this code need to explicitly destroy these geometries, like the
>>> following, to avoid leaks, or can it simply allow them to go out of scope
>>> and have Python's reference counting and garbage collector clean them up?
>>>
>>>
>>>
>>> point.Destroy()
>> There's no reason to call Destroy(), at all. Native object gets destroyed
>> when Python object goes out of scope, or when they are assigned to None.
>> So replace foo.Destroy() by foo = None if you really want to control when
>> the underlying C++ object is destroyed. This is indeed generally only
>> interesting for I/O objects, like a GDALDataset or OGRDataSource, when you
>> want to control when the file handle is closed.
>>
>> ds = gdal.GetDriverByName('GTiff').Create('foo.tif', 10, 10, 1)
>> ds.GetRasterBand(1).Fill(255)
>> ds = None # Now, foo.tif is closed and I can safely copy it
>> shutil.copy('foo.tif', 'other.tif')
>>
>>> Does the situation change if I call feature.SetGeometry(point)? What
>>> about feature.SetGeometryDirectly(point)?
>> No, you don't need to call Destroy() either. SetGeometryDirectly() brings a
>> bit of subtelty though : it makes the (C++) feature object the owner of the
>> (C++) geometry object (whereas SetGeometry() just clones the passed
>> geometry). The consequence for Python side is then when feature goes out
>> of scope, C++ feature object dies and the C++ geometry object too, so the
>> Python point object just referes to a dead C++ object.
>>
>> So do *NOT* do that :
>>
>> point = ogr.Geometry(ogr.wkbPoint)
>> feature = ogr.Feature(layer_defn)
>> feature.SetGeometryDirectly(point)
>> feature = None # point becomes implicitely invalid at that point
>> print point.ExportToWkt() # here you die
>>
>> The issue is that the bindings here expose something that is probably
>> manageable for C++ programmers used to take into account object lifetime,
>> but in Python, this can have surprising effect...
>>
>>> I'm sorry for my ignorance here. I found a nice GDAL tutorial that seems
>>> to say they *should* be explicitly destroyed in certain circumstances
>>> (see
>>> http://www.gis.usu.edu/~chrisg/python/2009/lectures/ospy_slides2.pdf,
>>> page 12). But I have not really seen any other examples of this.
>> Destroy() was perhaps necessary with old-gen bindings, but I'm not even
>> sure of that... Perhaps this shouldn't have been exposed at all... But, as
>> mentionned in the slides, it is true that there are situations where you
>> shouldn't call Destroy() at all.
>>
>> Do *NOT* do this :
>>
>> feat = lyr.GetNextFeature()
>> geom = feat.GetGeometryRef()
>> geom.Destroy() # Arg! C++ feature still points at the underlying C++
>> geometry, but is is now dead
>> feat = None # this will destroy the C++ feature, that will try to destroy
>> the C++ geometry, but this one is already destroyed --> segfault !
>>
>>> If you follow gdal-dev, you probably notice that the "to destroy or not
>>> to destroy" question comes up periodically from Python programmers, as
>>> does the problem where someone gets a GDALDataset, gets a band, allows
>>> the GDALDataset to go out of scope, and then expects the band to still
>>> work. These are areas where the GDAL Python bindings to not work like
>>> other Python libraries, causing confusion among Python programmers, who
>>> generally expect reference counting that "just works" and that they only
>>> need to explicitly close something when some kind of I/O is involved.
>> Agreed.
>>
>> I wouldn't say that is a problem of reference counting, but more a problem
>> of ownership and lifetime of C++ objects, that are sometimes destroyed
>> when a Python object get out of scope, but shouldn't since another Python
>> object indirectly still need them to be alive.
>>
>> Do *NOT* do this :
>>
>> feat = lyr.GetNextFeature()
>> geom = feat.GetGeometryRef()
>> feat = None # This kills the C++ feature, and its C++ geometry
>> print(geom.ExportToWkt()) # segfault here. The C++ geometry no longer exist
>>
>> Byt you could do (and this is what the patch in ticket 3592 I mention below
>> does under the hood):
>>
>> feat = lyr.GetNextFeature()
>> geom = feat.GetGeometryRef()
>> geom.please_keep_alive_object_after_equal_sign = feat
>> feat = None # The underlying C++ object is still alive, since geom keeps
>> the python feature object alive
>> print(geom.ExportToWkt())
>>
>>
>> Similarly, do *NOT* do this :
>>
>> ds = gdal.Open('foo.tif')
>> band = ds.GetRasterBand(1)
>> ds = None
>> print(band.Checksum())
>>
>> But you could do :
>>
>> ds = gdal.Open('foo.tif')
>> band = ds.GetRasterBand(1)
>> band.please_keep_alive_object_after_equal_sign = ds
>> ds = None
>> print(band.Checksum())
>>
>>
>> Do *NOT* do this also :
>>
>> print(gdal.Open('foo.tif').GetRasterBand(1).Checksum())
>>
>>> It would be really great if there was FAQ that pointed out all of the
>>> ways that GDAL differs from typical Python libraries. The scenarios I
>>> mention above seem to be two of the most common ones. Would someone be
>>> willing to create it? I would but do not feel competent enough at this
>>> point.
>> If you start a wiki page collecting all the info you found in the mailing
>> list threads, that would be a beginning...
>>
>> Obviously, instead of documenting weird behaviour a better approach would
>> be to fix the code so it behaves as expected...
>>
>> There was an attempt I pursued in http://trac.osgeo.org/gdal/ticket/3592 ,
>> that would solve all the crashes in the above examples, but as you can see
>> in the ticket, it raises backward compatibility issues in some cases
>> (including the GDAL/OGR python autotest suite)... So the question is a bit
>> like : should we inflict pain to existing users (in a few scenarios) to
>> make the life of new users easier... ?
>>
>> But the patch doesn't solve all cases. Unfortunately, there would be still
>> ways of causing segfault from Python. You can break some subtle API
>> contracts, like adding a new field to a layer when features deriving from
>> this layer definition are still active. And it isn't obvious how we could
>> detect such situations to prevent them from happening (see
>> http://trac.osgeo.org/gdal/ticket/3552 )
>>
>> Do *NOT* do that :
>>
>> feature = lyr.GetNextFeature()
>> field_defn = ogr.FieldDefn("foo", ogr.OFTString)
>> lyr.CreateField(field_defn) # now, existing features deriving from this
>> layer are invalid
>> feature.DumpReadable() # segfault
>>
>>> Thanks,
>>>
>>> Jason
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/gdal-dev


-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov



More information about the gdal-dev mailing list