[postgis-users] where to register to submit bug?

Mark Cave-Ayland mark.cave-ayland at siriusit.co.uk
Thu Jun 19 03:43:02 PDT 2008


Dave Fuhry wrote:
> Mark,
> 
>    I'm beginning to wonder if the stricter-EWKB-parsing patch applied
> in November was a mistake.
> 
>    I have an app which bulk-loads shapefiles (of varying quality),
> then "repairs" or NULLs geometries which are not isvalid().  I'm not
> finding a good way to bulk-load input data when the dataset has a
> record which causes:
> 
> ERROR:  geometry contains non-closed rings
> 
> COPY (shp2pgsql -D) is out, since COPY aborts on error.  From
> discussions on pgsql-dev, it is not clear whether COPY will support a
> "SKIP ERRORS" or "ERRORS TO error_table" clause anytime soon.  Even in
> that case, I would like a convenient way to keep the table's other
> (non-geometry) attributes.
> 
> For shp2pgsql's insert-statement mode, records are grouped into
> 250-record batches surrounded by BEGIN; ... END;, so an erroneous
> record will abort the 250 records in its batch.  Removing transactions
> entirely is no good for bulk-loading, since the database will be
> forced to commit every record to disk before processing the next.
> 
> Another option would be to move EWKB parsing logic to shp2pgsql so
> that shp2pgsql can decide how to handle erroneous geometries.  This
> option seems ugly and redundant to me, although I'll defer judgement.
> 
> Lastly, maybe some per-session option to allow postgis to import
> erroneous geometries is in order.  Then they can be corrected in a
> controlled fashion by isvalid() queries.  I would somewhat prefer the
> geometry processing functions (in the below example, st_simplify())
> to be robust in the face of questionable geometry anyway.  Thoughts?
> 
> Thanks,
> 
> Dave


Hi Dave,

That's an interesting one. The patch wasn't so much about allowing or 
rejecting invalid geometries as about making the behaviour consistent 
between WKT and WKB inputs.

This brings back the whole issue as to how strict we should be when 
accepting data. We could argue that the database should be quite strict 
as to which geometries are accepted, but then again we have the (rather 
expensive) IsValid() function which indicates whether a geometry meets 
the extra criteria for the GEOS functions.

I think at the end of the day it comes down to: what does the OGC spec 
say and what do other databases do? I'd prefer to stick to the letter of 
the spec wherever possible. We could potentially look at altering 
shp2pgsql to use the geometry parser so that erroneous geometries are 
written out to a separate shapefile if that helps. However, at the moment 
it's quite far down the TODO list unless anyone wants to sponsor a 
developer to work on it.
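
As a rough illustration of that kind of split (this is not shp2pgsql code, 
and none of it exists in PostGIS), here is a minimal Python sketch. It 
assumes each record is an (attributes, rings) pair whose rings are plain 
lists of (x, y) tuples, and it applies only the closed-ring test behind the 
"non-closed rings" error, not a full validity check. The names 
ring_is_closed and split_records are hypothetical.

```python
def ring_is_closed(ring):
    """Return True if the ring is non-empty and ends where it starts."""
    return len(ring) > 0 and ring[0] == ring[-1]


def split_records(records):
    """Split (attributes, rings) records into loadable and rejected lists.

    A record is rejected if any of its rings is not closed. Its attributes
    are kept alongside the rings so the row can be repaired and reloaded.
    """
    good, bad = [], []
    for attrs, rings in records:
        if all(ring_is_closed(r) for r in rings):
            good.append((attrs, rings))
        else:
            bad.append((attrs, rings))
    return good, bad


if __name__ == "__main__":
    # Hypothetical sample data: one closed ring, one ring left open.
    records = [
        ({"id": 1}, [[(0, 0), (1, 0), (1, 1), (0, 0)]]),
        ({"id": 2}, [[(0, 0), (1, 0), (1, 1), (0, 1)]]),
    ]
    good, bad = split_records(records)
    print("load:", [attrs["id"] for attrs, _ in good])
    print("reject:", [attrs["id"] for attrs, _ in bad])
```

The rejected list keeps the non-geometry attributes, so those rows could 
be written to a separate file and fixed up later rather than aborting the 
whole load.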


ATB,

Mark.

-- 
Mark Cave-Ayland
Sirius Corporation - The Open Source Experts
http://www.siriusit.co.uk
T: +44 870 608 0063
