[postgis-users] where to register to submit bug?
Stephen Woodbridge
woodbri at swoodbridge.com
Thu Jun 19 06:42:27 PDT 2008
Mark Cave-Ayland wrote:
> Dave Fuhry wrote:
>> Mark,
>>
>> I'm beginning to wonder if the stricter-EWKB-parsing patch applied
>> in November was a mistake.
>>
>> I have an app which bulk-loads shapefiles (of varying quality),
>> then "repairs" or NULLs geometries which are not isvalid(). I'm not
>> finding a good way to bulk-load input data when the dataset has a
>> record which causes:
>>
>> ERROR: geometry contains non-closed rings
>>
>> COPY (shp2pgsql -D) is out, since COPY aborts on error. From
>> discussions on pgsql-dev, it is not clear whether COPY will support a
>> "SKIP ERRORS" or "ERRORS TO error_table" clause anytime soon. Even in
>> that case, I would like a convenient way to keep the table's other
>> (non-geometry) attributes.
>>
>> For shp2pgsql's insert-statement mode, records are grouped into
>> 250-record batches surrounded by BEGIN; ... END;, so an erroneous
>> record will abort the 250 records in its batch. Removing transactions
>> entirely is no good for bulk-loading, since the database will be
>> forced to commit every record to disk before processing the next.
>>
>> Another option would be to move EWKB parsing logic to shp2pgsql so
>> that shp2pgsql can decide how to handle erroneous geometries. This
>> option seems ugly and redundant to me, although I'll defer judgement.
>>
>> Lastly, maybe some per-session option to allow postgis to import
>> erroneous geometries is in order. Then they can be corrected in a
>> controlled fashion by isvalid() queries. I would somewhat prefer that
>> the geometry processing functions (in the below example,
>> st_simplify()) be robust in the face of questionable geometry
>> anyway. Thoughts?
>>
>> Thanks,
>>
>> Dave
>
>
> Hi Dave,
>
> That's an interesting one. The problem wasn't so much about
> allowing valid/invalid geometries as about making the behaviour
> consistent between WKT and WKB inputs.
>
> This brings back the whole issue as to how strict we should be when
> accepting data. We could argue that the database should be quite strict
> as to which geometries are accepted, but then again we have the (rather
> expensive) IsValid() function which indicates whether a geometry meets
> the extra criteria for the GEOS functions.
>
> I think at the end of the day it comes down to: what does the OGC spec
> say and what do other databases do? I'd prefer to stick to the letter of
> the spec wherever possible. We could potentially look at altering
> shp2pgsql to use the geometry parser so that erroneous geometries are
> written out to a separate shapefile if that helps. However at the moment
> it's quite far down on the TODO list unless anyone wants to sponsor a
> developer to work on it.
>
>
> ATB,
>
> Mark.
>
Similarly, some applications like UMN MapServer CAN use geometries that
are NOT IsValid(). While I mostly use shapefiles to load data, it would
be bad to lose the ability to load geometries that are not good. I
think having IsValid() is sufficient to sort out the good from the bad.
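
A rough sketch of that load-everything-then-sort idea, in Python. This is
an illustration only, not PostGIS code: ring_is_closed() and
partition_by_validity() are hypothetical helpers, and the closed-ring test
is a simplified stand-in for the much more thorough IsValid() check.

```python
# Sketch only: ring_is_closed() checks a single condition (the one behind
# "geometry contains non-closed rings"); a real validity test does far more.

def ring_is_closed(ring):
    """Return True if the ring has at least 4 points and first == last."""
    return len(ring) >= 4 and ring[0] == ring[-1]


def partition_by_validity(records):
    """Split (attributes, rings) records into good and bad lists.

    Nothing is dropped: bad records keep their attributes, so their
    geometry can be repaired or set to NULL afterwards.
    """
    good, bad = [], []
    for attrs, rings in records:
        if all(ring_is_closed(r) for r in rings):
            good.append((attrs, rings))
        else:
            bad.append((attrs, rings))
    return good, bad


# Hypothetical sample data: one closed ring, one ring missing its closing point.
records = [
    ({"id": 1}, [[(0, 0), (1, 0), (1, 1), (0, 0)]]),
    ({"id": 2}, [[(0, 0), (1, 0), (1, 1)]]),
]
good, bad = partition_by_validity(records)
print([a["id"] for a, _ in good], [a["id"] for a, _ in bad])
```

The point is that every record gets loaded with its attributes, and the
bad geometries are dealt with in a second pass rather than at load time.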
My 2 cents,
-Steve