[postgis-devel] SVN trunk parser modifications stage 2

Mark Cave-Ayland mark.cave-ayland at siriusit.co.uk
Wed Oct 29 04:00:17 PDT 2008


Chris Hodgson wrote:
> The point is, that if you allow things to be loaded using "0-level" no 
> checking, regardless of how it gets into the database there will be a 
> (reasonable) expectation to be able to use pg_dump to dump and reload 
> it. Thus if your default parsing level does any checks at all, it will 
> potentially fail to load data that had been loaded without checks.
> 
> If we can use the postgresql-variable based approach, all it would take 
> is a single
> 
> SET postgis.default_parser_check_level TO 0;
> 
> at the top of your dump file to fix this (which is still more work than 
> some would find acceptable). With the approach that requires using a 
> special function call to load without checks, it requires a lot more 
> work to fix pg_dump output into something that can be reloaded - and it 
> has to use full INSERTs, not COPY.
> 
> Although, I believe it is possible (perhaps even Mark's intention) for 
> the _in and _out functions to default to 0-level checking, while the 
> other parsing functions (geomfromwkt, etc.) default to "1". This way, 
> you have to specifically set the parse level in order to cause trouble 
> loading - just enough rope to hang yourself. I admit it seems a bit 
> inconsistent, but that seems to be the price of maintaining forwards and 
> backwards compatibility - forwards for being able to load less-valid 
> geometries, and backwards for people having the expectation that 
> inserted geometries should "work" with whatever functions they currently 
> work with.
> 
> Chris


Hi Chris,

I must admit I was thinking along similar lines: how do you handle the 
case where you wish to restore a pg_dump file containing invalid geometries?

However, having thought about it longer, I think that your second idea 
is actually better than anything I came up with. In other words, have 
the _in and _out functions accept anything, but force OGC checking on 
the ST_GeomFromText() functions.
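
To make the split concrete, here is a rough sketch in Python. It is purely illustrative: the function names, the two check levels and the toy WKT handling are all assumptions made for this example, not liblwgeom or PostGIS code. The only point is that both entry points share one parser and differ in the check level they pass.

```python
import re

# Hypothetical check levels, named here only for illustration.
CHECK_NONE = 0  # accept anything that parses syntactically
CHECK_OGC = 1   # additionally enforce basic structural rules


class GeometryParseError(ValueError):
    """Raised when the text cannot be parsed or fails the requested checks."""


def parse_wkt(wkt, check_level):
    """Toy parser: handles only LINESTRING and single-ring POLYGON text."""
    m = re.fullmatch(r"\s*(LINESTRING|POLYGON)\s*\(+([^()]*)\)+\s*", wkt, re.I)
    if m is None:
        raise GeometryParseError("unparseable WKT: %r" % wkt)
    kind = m.group(1).upper()
    points = [tuple(float(v) for v in p.split()) for p in m.group(2).split(",")]
    if check_level >= CHECK_OGC:
        if kind == "LINESTRING" and len(points) < 2:
            raise GeometryParseError("linestring needs at least 2 points")
        if kind == "POLYGON" and (len(points) < 4 or points[0] != points[-1]):
            raise GeometryParseError("polygon ring must be closed, with 4+ points")
    return kind, points


def geometry_in(text):
    """Stand-in for the type input function: no checks, so dumps always reload."""
    return parse_wkt(text, CHECK_NONE)


def geom_from_text(text):
    """Stand-in for ST_GeomFromText(): always applies the stricter checks."""
    return parse_wkt(text, CHECK_OGC)
```

With this arrangement an unclosed ring such as POLYGON((0 0, 1 0, 1 1)) passes through geometry_in() unchanged but is rejected by geom_from_text().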

If this were how we were to move forward then there is still the 
possibility of getting bad data into the database using the direct 
'01001001 ... '::geometry cast. I'm not too worried about this because:


i) an external tool would have to generate direct COPY-style input 
without using the GeomFromText() or GeomFromWKB() functions; while 
currently this is HEXEWKB we have never guaranteed that the internal 
format will not change between versions to support new features.

ii) any PostGIS tools such as shp2pgsql/pgsql2shp wishing to use the 
above formats can use the liblwgeom parser to verify the geometries as 
they are output, and halt on error unless a --force flag is specified. 
Hence people using the --force options will be well aware that what they 
are putting in their database may not be usable by the client.
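
The behaviour in (ii) could look roughly like the Python sketch below. Everything in it is hypothetical: export_rows(), the force argument and the ring check are invented for illustration and are not taken from shp2pgsql, pgsql2shp or liblwgeom.

```python
class InvalidGeometryError(ValueError):
    """Raised when a geometry fails verification and force was not requested."""


def ring_is_closed(ring):
    """Toy validity check: a ring needs 4+ points and must end where it starts."""
    return len(ring) >= 4 and ring[0] == ring[-1]


def export_rows(rows, is_valid, force=False):
    """Return the rows that would be written out.

    Halts at the first geometry that fails is_valid() unless force is True,
    in which case the bad geometry is passed through untouched.
    """
    written = []
    for rownum, geom in enumerate(rows, start=1):
        if not is_valid(geom) and not force:
            raise InvalidGeometryError(
                "row %d: invalid geometry (re-run with --force to load anyway)" % rownum
            )
        written.append(geom)
    return written
```

The default is the safe path, so bad data only reaches the database when the user has explicitly asked for it.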


I'm happy enough now to start work on this over the next 
couple of days, particularly with the grunt-work of moving 
shp2pgsql/pgsql2shp over to using liblwgeom. Please shout now if you see 
any obvious flaws in the above plan.


ATB,

Mark.

-- 
Mark Cave-Ayland
Sirius Corporation - The Open Source Experts
http://www.siriusit.co.uk
T: +44 870 608 0063


