[gdal-dev] Append a shapefile to a postgis table using the GDAL/OGR Python interface

Even Rouault even.rouault at mines-paris.org
Wed Jul 6 13:35:05 EDT 2011


Le mercredi 06 juillet 2011 17:11:31, Luke Peterson a écrit :

> I'm reviving a nearly-year-old thread here, sort of. I'm trying to
> create cross-compatibility between MSSQL and PostGIS on some OGR-based
> code I wrote in Python for the MSSQL environment: I had been running
> into issues simply creating layers with OGR -- would run something
> like this:
> 
> name =
> serverDS.CopyLayer(shapeDS.GetLayerByIndex(0),table,options).GetName()
> 
> with serverDS being an open OGR datastore (destination) and shapeDS
> being an opened Shapefile (with an OGR-imposed FID).
> 
> This code works fine when serverDS is a MSSQLSpatial driver, but I get
> hit with the following error when I run with a PostgresSQL driver:
> 
> "You've inserted feature with an already set FID and that's perhaps
> the reason for the failure. If so, this can happen if you reuse the
> same feature object for sequential insertions. Indeed, since GDAL
> 1.8.0, the FID of an inserted feature is got from the server, so it is
> not a good ideato reuse it afterwards... All in all, try unsetting the
> FID with SetFID(-1) before calling CreateFeature()"
> 
> OK, clear enough. I took a look at the DataStore code here and here:
> 
> http://trac.osgeo.org/gdal/browser/branches/1.8/gdal/ogr/ogrsf_frmts/pg/ogr
> pgdatasource.cpp
> 
> http://trac.osgeo.org/gdal/browser/branches/1.8/gdal/ogr/ogrsf_frmts/generi
> c/ogrdatasource.cpp
> 
> I noticed that the PG driver has no PG-native code for CopyLayer() at
> all; the generic driver's CopyLayer() function doesn't set a null FID
> for CreateFeature(). So, it seems CopyLayer() is useless for PostGIS
> in my situation.

yes indeed, CopyLayer() isn't currently appropriate for your use case. It does 
currently a poDstFeature.SetFID(poSrcFeature.GetFID()), which is the 
equivalent of the -preserve_fid option of ogr2ogr. In the append use case, this 
is not desirable because you will get collisions with features already 
inserted with the FID. The solution will be easy to implement and consists in 
adding an option "PRESERVE_FID" to CopyLayer(). No need to create a 
specialized implementation for the PG driver, that issue could be found in 
other drivers that can use the FID of features set to CreateFeature(). Please 
file a ticket about that.

> 
> The code I'm going with right now uses CopyLayer() out of the box
> except if serverDS.GetDriver().GetName() == 'PostgreSQL', in which
> case I'm re-implementing most of the logic from CopyLayer() in Python
> except adding a SetFID(-1) call:
> 
>         newLayer =
> serverDS.CreateLayer(table,shapeDS.GetLayerByIndex(0).GetSpatialRef(),ogr.w
> kbUnknown,options) for x in
> xrange(shapeDS.GetLayerByIndex(0).GetLayerDefn().GetFieldCount()):
>            
> newLayer.CreateField(shapeDS.GetLayerByIndex(0).GetLayerDefn().GetFieldDef
> n(x))
> 
>         newLayer.StartTransaction()
>         for x in xrange(shapeDS.GetLayerByIndex(0).GetFeatureCount()):
>             newFeature = shapeDS.GetLayerByIndex(0).GetFeature(x)
>             newFeature.SetFID(-1)
>             newLayer.CreateFeature(newFeature)
>             if x % 128 == 0:
>                 newLayer.CommitTransaction()
>                 newLayer.StartTransaction()
>         newLayer.CommitTransaction()
> 

Several issues with your code :
	* it is risky to pass to CreateFeature() a feature from another layer. In 
cases the layer definition of the source and target layers are not strictly 
identical, the probability of having crashes is really high. Mismatch can 
easily happen when drivers don't support the exact data type of the source and 
choose a best fit (OFTString) for example. The safe procedure is to instantiate 
a new feature with the layer definition of the target layer and to use 
CopyFrom() to copy the content of the source feature to the target feature
	* major performance issue : GetFeature() is generally a slow operation 
(the shapefile driver being mostly the exception). It is better to iterate over 
the features with GetNextFeature().
	* minor/style performance issue : You could also affect  
shapeDS.GetLayerByIndex(0) to a variable to avoid a useless call to native 
code in the loop.

So I'd suggest something like :

    sourceLayer = shapeDS.GetLayerByIndex(0)
    sourceFeature = sourceLayer.GetNextFeature()
    newLayerDefn = newLayer.GetLayerDefn()
    x = 0
    while sourceFeature is not None:
        newFeature = ogr.Feature(newLayerDefn)
        newFeature.SetFrom(sourceFeature) # does not preserve source FID by 
default, what we want here
        newLayer.CreateFeature(newFeature)
        if x % 128 == 0:
            newLayer.CommitTransaction()
            newLayer.StartTransaction()

        sourceFeature = sourceLayer.GetNextFeature()
        x = x + 1

    newLayer.CommitTransaction()

Best regards,

Even


More information about the gdal-dev mailing list