[gdal-dev] With Python bindings, should geometry objects created by ogr.Geometry() be explicitly destroyed?

Jason Roberts jason.roberts at duke.edu
Mon Sep 20 12:29:26 EDT 2010


Even,

Thank you very much for your comments. Based on them, I started a new "Python Gotchas" page here http://trac.osgeo.org/gdal/wiki/PythonGotchas, referenced from http://trac.osgeo.org/gdal/wiki/GdalOgrInPython. Please review it to make sure I got everything right.

Best regards,

Jason

-----Original Message-----
From: Even Rouault [mailto:even.rouault at mines-paris.org] 
Sent: Saturday, September 18, 2010 5:52 AM
To: gdal-dev at lists.osgeo.org
Cc: Jason Roberts
Subject: Re: [gdal-dev] With Python bindings, should geometry objects created by ogr.Geometry() be explicitly destroyed?

Jason,

> I have some Python code that uses OGR geometry objects internally, creating
> them like this:
> 
> 
> 
> point = ogr.Geometry(ogr.wkbPoint)
> 
> 
> 
> Does this code need to explicitly destroy these geometries, like the
> following, to avoid leaks, or can it simply allow them to go out of scope
> and have Python's reference counting and garbage collector clean them up?
> 
> 
> 
> point.Destroy()

There's no reason to call Destroy(), at all. Native object gets destroyed when 
Python object goes out of scope, or when they are assigned to None. So replace 
foo.Destroy() by foo = None if you really want to control when the underlying 
C++ object is destroyed. This is indeed generally only interesting for I/O 
objects, like a GDALDataset or OGRDataSource, when you want to control when 
the file handle is closed.

ds = gdal.GetDriverByName('GTiff').Create('foo.tif', 10, 10, 1)
ds.GetRasterBand(1).Fill(255)
ds = None # Now, foo.tif is closed and I can safely copy it
shutil.copy('foo.tif', 'other.tif')

> 
> 
> 
> Does the situation change if I call feature.SetGeometry(point)? What about
> feature.SetGeometryDirectly(point)?

No, you don't need to call Destroy() either. SetGeometryDirectly() brings a 
bit of subtelty though : it makes the (C++) feature object the owner of the 
(C++) geometry object (whereas SetGeometry() just clones the passed geometry). 
The consequence for Python side is then when feature goes out of scope, C++ 
feature object dies and the C++ geometry object too, so the Python point 
object just referes to a dead C++ object.

So do *NOT* do that :

point = ogr.Geometry(ogr.wkbPoint)
feature = ogr.Feature(layer_defn)
feature.SetGeometryDirectly(point)
feature = None # point becomes implicitely invalid at that point
print point.ExportToWkt() # here you die

The issue is that the bindings here expose something that is probably 
manageable for C++ programmers used to take into account object lifetime, but 
in Python, this can have surprising effect...

> 
> 
> 
> I'm sorry for my ignorance here. I found a nice GDAL tutorial that seems to
> say they *should* be explicitly destroyed in certain circumstances (see
> http://www.gis.usu.edu/~chrisg/python/2009/lectures/ospy_slides2.pdf, page
> 12). But I have not really seen any other examples of this.
> 

Destroy() was perhaps necessary with old-gen bindings, but I'm not even sure 
of that... Perhaps this shouldn't have been exposed at all... But, as 
mentionned in the slides, it is true that there are situations where you 
shouldn't call Destroy() at all.

Do *NOT* do this :

feat = lyr.GetNextFeature()
geom = feat.GetGeometryRef()
geom.Destroy() # Arg! C++ feature still points at the underlying C++ geometry, 
but is is now dead
feat = None # this will destroy the C++ feature, that will try to destroy the 
C++ geometry, but this one is already destroyed --> segfault !

> 
> 
> If you follow gdal-dev, you probably notice that the "to destroy or not to
> destroy" question comes up periodically from Python programmers, as does
> the problem where someone gets a GDALDataset, gets a band, allows the
> GDALDataset to go out of scope, and then expects the band to still work.
> These are areas where the GDAL Python bindings to not work like other
> Python libraries, causing confusion among Python programmers, who
> generally expect reference counting that "just works" and that they only
> need to explicitly close something when some kind of I/O is involved.
> 

Agreed.

I wouldn't say that is a problem of reference counting, but more a problem of 
ownership and lifetime of C++ objects, that are sometimes destroyed when a 
Python object get out of scope, but shouldn't since another Python object 
indirectly still need them to be alive.

Do *NOT* do this :

feat = lyr.GetNextFeature()
geom = feat.GetGeometryRef()
feat = None # This kills the C++ feature, and its C++ geometry
print(geom.ExportToWkt()) # segfault here. The C++ geometry no longer exist

Byt you could do (and this is what the patch in ticket 3592 I mention below 
does under the hood):

feat = lyr.GetNextFeature()
geom = feat.GetGeometryRef()
geom.please_keep_alive_object_after_equal_sign = feat
feat = None # The underlying C++ object is still alive, since geom keeps the 
python feature object alive
print(geom.ExportToWkt())


Similarly, do *NOT* do this :

ds = gdal.Open('foo.tif')
band = ds.GetRasterBand(1)
ds = None
print(band.Checksum())

But you could do :

ds = gdal.Open('foo.tif')
band = ds.GetRasterBand(1)
band.please_keep_alive_object_after_equal_sign = ds
ds = None
print(band.Checksum())


Do *NOT* do this also :

print(gdal.Open('foo.tif').GetRasterBand(1).Checksum())

> 
> 
> It would be really great if there was FAQ that pointed out all of the ways
> that GDAL differs from typical Python libraries. The scenarios I mention
> above seem to be two of the most common ones. Would someone be willing to
> create it? I would but do not feel competent enough at this point.
> 
> 

If you start a wiki page collecting all the info you found in the mailing list 
threads, that would be a beginning...

Obviously, instead of documenting weird behaviour a better approach would be 
to fix the code so it behaves as expected...

There was an attempt I pursued in http://trac.osgeo.org/gdal/ticket/3592 , 
that would solve all the crashes in the above examples, but as you can see in 
the ticket, it raises backward compatibility issues in some cases (including 
the GDAL/OGR python autotest suite)... So the question is a bit like : should 
we inflict pain to existing users (in a few scenarios) to make the life of new 
users easier... ?

But the patch doesn't solve all cases. Unfortunately, there would be still 
ways of causing segfault from Python. You can break some subtle API contracts, 
like adding a new field to a layer when features deriving from this layer 
definition are still active. And it isn't obvious how we could detect such 
situations to prevent them from happening (see 
http://trac.osgeo.org/gdal/ticket/3552 )

Do *NOT* do that :

feature = lyr.GetNextFeature()
field_defn = ogr.FieldDefn("foo", ogr.OFTString)
lyr.CreateField(field_defn) # now, existing features deriving from this layer 
are invalid
feature.DumpReadable() # segfault

> 
> Thanks,
> 
> Jason



More information about the gdal-dev mailing list