[postgis-users] where to register to submit bug?

Paul Ramsey pramsey at cleverelephant.ca
Fri Jun 20 01:24:03 PDT 2008


This is a well-chewed issue, and we have come down repeatedly in
favour of allowing invalid geometry to be loaded. Which, I suppose,
means we should loosen the parser to allow even non-closed rings.
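
To make the strict-versus-loose distinction concrete, here is a minimal
Python sketch. It is not the PostGIS parser: the names parse_polygon and
ring_is_closed, and the strict flag, are hypothetical and exist only for
illustration. A ring counts as closed when its first and last points
coincide.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Ring = List[Point]


def ring_is_closed(ring: Ring) -> bool:
    """A ring is closed when its first and last points coincide."""
    return len(ring) > 0 and ring[0] == ring[-1]


def parse_polygon(rings: List[Ring], strict: bool = True) -> List[Ring]:
    """Hypothetical loader, not the real PostGIS parser.

    strict=True rejects non-closed rings, mimicking the error in this thread.
    strict=False accepts them so they can be inspected or repaired later.
    """
    for ring in rings:
        if strict and not ring_is_closed(ring):
            raise ValueError("geometry contains non-closed rings")
    return rings


# A square whose closing point is missing.
open_square = [[(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]]

# The loose mode loads the record unchanged instead of aborting.
loaded = parse_polygon(open_square, strict=False)
```

With strict=True the same input raises ValueError, which is the behaviour
that aborts a whole COPY or insert batch.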

What *does* need to be done (issue for me, it's not that hard) is a
few additional hooks to the GEOS isvalid routines so people can easily
identify WHY and WHERE their geometries are invalid.
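
As a rough illustration of what a WHY-and-WHERE report could look like,
here is a small Python sketch. It does not call GEOS and covers only two
simple ring checks; RingProblem and diagnose_ring are hypothetical names,
not part of any existing API.

```python
from typing import List, NamedTuple, Optional, Tuple

Point = Tuple[float, float]


class RingProblem(NamedTuple):
    reason: str                 # WHY the ring is invalid
    location: Optional[Point]   # WHERE: the offending coordinate, if any


def diagnose_ring(ring: List[Point]) -> Optional[RingProblem]:
    """Return the first problem found in a ring, or None if these checks pass.

    Only two checks are made (point count and closure); a real validity
    test such as the GEOS one covers far more cases.
    """
    if len(ring) < 4:
        # A closed ring needs at least three distinct points plus the closing one.
        return RingProblem("too few points in ring", ring[0] if ring else None)
    if ring[0] != ring[-1]:
        # Report the last point, where the closing vertex is missing.
        return RingProblem("ring is not closed", ring[-1])
    return None
```

Returning the reason and the coordinate together lets a caller both log the
problem and zoom to it on a map.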

More difficult, and probably in need of some funding to get Martin's
attention, are some formalized cleaning routines that go beyond what
Buffer(0) can accomplish.

P.

On Thu, Jun 19, 2008 at 3:42 PM, Stephen Woodbridge
<woodbri at swoodbridge.com> wrote:
> Mark Cave-Ayland wrote:
>>
>> Dave Fuhry wrote:
>>>
>>> Mark,
>>>
>>>   I'm beginning to wonder if the stricter-EWKB-parsing patch applied
>>> in November was a mistake.
>>>
>>>   I have an app which bulk-loads shapefiles (of varying quality),
>>> then "repairs" or NULLs geometries which are not isvalid().  I'm not
>>> finding a good way to bulk-load input data when the dataset has a
>>> record which causes:
>>>
>>> ERROR:  geometry contains non-closed rings
>>>
>>> COPY (shp2pgsql -D) is out, since COPY aborts on error.  From
>>> discussions on pgsql-dev, it is not clear whether COPY will support a
>>> "SKIP ERRORS" or "ERRORS TO error_table" clause anytime soon.  Even in
>>> that case, I would like a convenient way to keep the table's other
>>> (non-geometry) attributes.
>>>
>>> For shp2pgsql's insert-statement mode, records are grouped into
>>> 250-record batches surrounded by BEGIN; ... END;, so an erroneous
>>> record will abort the 250 records in its batch.  Removing transactions
>>> entirely is no good for bulk-loading, since the database will be
>>> forced to commit every record to disk before processing the next.
>>>
>>> Another option would be to move EWKB parsing logic to shp2pgsql so
>>> that shp2pgsql can decide how to handle erroneous geometries.  This
>>> option seems ugly and redundant to me, although I'll defer judgement.
>>>
>>> Lastly, maybe some per-session option to allow postgis to import
>>> erroneous geometries is in order.  Then they can be corrected in a
>>> controlled fashion by isvalid() queries.  I would somewhat prefer
>>> that the geometry processing functions (in the below example,
>>> st_simplify()) be robust in the face of questionable geometry
>>> anyway.  Thoughts?
>>>
>>> Thanks,
>>>
>>> Dave
>>
>>
>> Hi Dave,
>>
>> That's an interesting one. The patch wasn't so much about allowing
>> valid/invalid geometries as about making the behaviour consistent
>> between WKT and WKB inputs.
>>
>> This brings back the whole issue as to how strict we should be when
>> accepting data. We could argue that the database should be quite strict as
>> to which geometries are accepted, but then again we have the (rather
>> expensive) IsValid() function which indicates whether a geometry meets the
>> extra criteria for the GEOS functions.
>>
>> I think at the end of the day it comes down to: what does the OGC spec say
>> and what do other databases do? I'd prefer to stick to the letter of the
>> spec wherever possible. We could potentially look at altering shp2pgsql to
>> use the geometry parser so that erroneous geometries are written out to a
>> separate shapefile if that helps. However at the moment it's quite far down
>> on the TODO list unless anyone wants to sponsor a developer to work on it.
>>
>>
>> ATB,
>>
>> Mark.
>>
>
> Similarly, some applications like UNM Mapserver CAN use geometries that are
> NOT IsValid(). While I mostly use shapefiles to load data, it would be bad
> to lose the ability to load geometries that are not good. I think
> having IsValid() is sufficient to sort out the good from the bad.
>
> My 2 cents,
>  -Steve
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>


