[postgis-devel] Removing geometry parsing checks for data loading
Mark Cave-Ayland
mark.cave-ayland at siriusit.co.uk
Fri Jul 4 02:23:42 PDT 2008
David Fuhry wrote:
> $ shp2pgsql -D poly_1.shp poly_1 | psql
> ...
> ERROR: geometry contains non-closed rings
> CONTEXT: COPY poly_1, line 389, column the_geom:
> "01060000000100000001030000000100000007000000F06044A5C0B53E41086D2D657C8C23413D5FFD71BFB53E4191925462..."
>
> ROLLBACK
>
> What's the best way to drop these geometry-parsing checks (from
> lwgparse.c, I assume)? Is commenting out the if (...) error(...); lines
> in popc() sufficient and safe?
>
> Obviously as part of my import routine, I check that the geometries are
> isvalid() later. For my uses, geometry errors are best resolved with
> custom logic /after/ the data is in the db.
>
> If Mark and others have serious concerns about people naively using
> non-isvalid() geometries, then I suggest a configurable parameter to
> shp2pgsql (on by default) which adds an isvalid() check constraint to
> the table. The call to AddGeometryColumn() already adds ndims,
> geometrytype, and srid checks to the table; why not the equally
> important isvalid() too?
>
> Thanks,
>
> Dave Fuhry
Hi Dave,
I think I may not have explained myself clearly enough here. When I
fixed the WKB parser, I didn't add any additional calls into isvalid().
IsValid() is a particularly heavyweight function which is why currently
this is left up to the user to decide whether they wish to use it or
not. This is because it performs various intersection and repeated point
tests which can involve lots of time-consuming matrix maths.
Now the patch I added to the 1.3 series was to make the WKT and WKB
parsers use the same validation code. The only checks that the WK(T/B)
parser makes for polygons are i) ensure that polygon rings have a
minimum of 3 points and ii) ensure that polygon rings are closed (i.e.
the first point equals the last point). Both of these do seem entirely
sane checks to be making.
I've done a quick check of the SFS specification
(http://www.opengis.org/docs/99-049.pdf) and the ESRI shapefile
specification
(http://www.esri.com/library/whitepapers/pdfs/shapefile.pdf) and both of
these mandate that polygon rings *must* be closed. And to be honest, I
can see that this makes sense - for example, think about how should
things behave if you try to calculate the area of an open polygon, for
example.
So in other words, whoever (or whatever) is generating your shapefiles
is generating invalid shapefiles according to spec, and I'm not yet
convinced that this is a good enough reason to start allowing geometries
with open rings into the database.
If you really are stuck, I've attached a patch against 1.3 series that
will turn off the check for WKB, but it will take a lot more convincing
in order for this (or other relevant fix) to get into SVN.
ATB,
Mark.
--
Mark Cave-Ayland
Sirius Corporation - The Open Source Experts
http://www.siriusit.co.uk
T: +44 870 608 0063
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lwgparse.patch
Type: text/x-diff
Size: 441 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/postgis-devel/attachments/20080704/9895cc75/attachment.patch>
More information about the postgis-devel
mailing list