[postgis-devel] Removing geometry parsing checks for data loading

Mark Cave-Ayland mark.cave-ayland at siriusit.co.uk
Fri Jul 4 02:23:42 PDT 2008


David Fuhry wrote:
> $ shp2pgsql -D poly_1.shp poly_1 | psql
> ...
> ERROR:  geometry contains non-closed rings
> CONTEXT:  COPY poly_1, line 389, column the_geom:
> "01060000000100000001030000000100000007000000F06044A5C0B53E41086D2D657C8C23413D5FFD71BFB53E4191925462..." 
> 
> ROLLBACK
> 
> What's the best way to drop these geometry-parsing checks (from
> lwgparse.c, I assume)?  Is commenting out the if (...) error(...); lines
> in popc() sufficient and safe?
> 
> Obviously as part of my import routine, I check that the geometries are
> isvalid() later.  For my uses, geometry errors are best resolved with
> custom logic /after/ the data is in the db.
> 
> If Mark and others have serious concerns about people naively using
> non-isvalid() geometries, then I suggest a configurable parameter to
> shp2pgsql (on by default) which adds an isvalid() check constraint to
> the table.  The call to AddGeometryColumn() already adds ndims,
> geometrytype, and srid checks to the table; why not the equally
> important isvalid() too?
> 
> Thanks,
> 
> Dave Fuhry


Hi Dave,

I think I may not have explained myself clearly enough here. When I 
fixed the WKB parser, I didn't add any additional calls into isvalid(). 
IsValid() is a particularly heavyweight function which is why currently 
this is left up to the user to decide whether they wish to use it or 
not. This is because it performs various intersection and repeated point 
tests which can involve lots of time-consuming matrix maths.

Now the patch I added to the 1.3 series was to make the WKT and WKB 
parsers use the same validation code. The only checks that the WK(T/B) 
parser makes for polygons are i) ensure that polygon rings have a 
minimum of 3 points and ii) ensure that polygon rings are closed (i.e. 
the first point equals the last point). Both of these do seem entirely 
sane checks to be making.

I've done a quick check of the SFS specification 
(http://www.opengis.org/docs/99-049.pdf) and the ESRI shapefile 
specification 
(http://www.esri.com/library/whitepapers/pdfs/shapefile.pdf) and both of 
these mandate that polygon rings *must* be closed. And to be honest, I 
can see that this makes sense - for example, think about how should 
things behave if you try to calculate the area of an open polygon, for 
example.

So in other words, whoever (or whatever) is generating your shapefiles 
is generating invalid shapefiles according to spec, and I'm not yet 
convinced that this is a good enough reason to start allowing geometries 
with open rings into the database.

If you really are stuck, I've attached a patch against 1.3 series that 
will turn off the check for WKB, but it will take a lot more convincing 
in order for this (or other relevant fix) to get into SVN.


ATB,

Mark.

-- 
Mark Cave-Ayland
Sirius Corporation - The Open Source Experts
http://www.siriusit.co.uk
T: +44 870 608 0063
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lwgparse.patch
Type: text/x-diff
Size: 441 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/postgis-devel/attachments/20080704/9895cc75/attachment.patch>


More information about the postgis-devel mailing list