[postgis-devel] [PostGIS] #944: typmod support for PostGIS geometry

Paragon Corporation lr at pcorp.us
Mon May 23 17:37:59 PDT 2011


Steve,
* Paragon Corporation (lr at pcorp.us) wrote:
>> The reason why it doesn't need fixing and is the way it should be is for
>> most use cases I rely on these catches.  Note most use cases, I need my
>> types to be exactly the same (and exactly the same I mean unconstrained
>> exactly the same in some cases).  For postgres to not enforce this I
would
>> consider too much ambiguous guessing of what my intent was.

>I'm not sure that I really see it as leading to any actually undefined
>or unintuitive behavior..
Here is hopefully a clearer example of what I mean by ambiguous behavior
using your now fixed bug:

Let's suppose I created all my wonderfull tables like this
test_parent geom geometry
  geom geometry(point)
  geom geometry(polygon)

And I realize, hey wait a minute -- I forgot to define a spatial reference
system which I want all my tables to have.
The beauty of inherited tables is I just need to set my change at the parent
table and my changes would drill down to 
all the children which in most cases (except for those nasty situations of
going from varchar(100) to varchar(300) or something like that), works
wonderfully.

But now PostgreSQL is faced with an ambiguous question? "Does Regina want to
change all
tables such that the geometry type is relaxed and srid is added or does she
simply want to change
the srid?"

When I do a command such as
	ALTER TABLE test_parent ALTER COLUMN geom TYPE geometry(26986) USING
ST_SetSRID(geom,26986);


In the new world what would I do?
 given PostgreSQL current (what you define as a buggy behavior):
Well I would have created my parent table with just geometry --
	My other child tables would have been created with
AddGeometryColumn... (perhaps with an additional switch we define that lets
PostGIS know I want the old constraint type behavior instead of typmod
behavior)

Then when I realized my mistake -- I would perhaps do this:
	ALTER TABLE test_parent ALTER COLUMN geom TYPE geometry(26986) USING
ST_SetSRID(geom,26986); 

or this
	SELECT UpdateGeometrySRID('test_parent', 26986);

NOTE THAT: both ways would work currently (but not with your fixed bug)
because check constraints are inheritable so if they were set at the parent
they would be
dropped if you drop from the parent level which is all UpdateGeometySRID
knows about and if I had all my things
as untyped -- PostGIS would either alter the geometry type of all tables or
add a constraint (dropping whatever SRID constraint existed)


>> So when I try add a table in an inheritance hierarchy it saves me when I
>> screwed up on defining a type.  So I consider the check a necessary evil.

> If you have a well-defined type+typmod on the parent table, it'd work
> the same way as you expect.  The only case where this would be different
> is if you had the same *type* but a less-constrainted typmod on the
> parent then you could have a more-constrainted typmod on the child.
NO it would not -- see example above.

>> Why is looking at constraints bad?  They are stored in the catalogs as
well
>> and not all that much harder to parse out especially since we name them
>> consistently (since these cases they were added with addgeometrycolumn)

> This is more-or-less akin to asking that you structure things like:
> NUMERIC mynum
> CHECK (floor(mynum) = mynum)

> and then asking PG to represent that column in the database catalogs as
> being 'NUMERIC(18,0)' or something like that.  We should not be using
> CHECK constraints as the basis of what the typmod on a column is.
> You're welcome to define the column as geometry and put whatever CHECK
> constraints you want on it, but don't go changing what that ends up
> looking like in the (more-or-less) catalog.

NO it is NOT,  for the basic reason that geometry constraints (srid, type,
dimension are special
constraints) PostGIS put in and have always been used as if they were pseudo
type modifiers.  Most everyone defines them with the tools we provide if
they care about third party tools doing the right thing, so we can predict
exactly what they should
look like (and if they don't look like that, then people can  run
populate_geometry_columns to make them look 
that way).  The above you have is not predictable and that is why it's not
used because if you look at the
whole scope of what is allowed in check constraints there are an infinite
number, but in our case we have only 3 (possibly six if you
consider older geometries were defined with deprecated functions that we
care about and they are named the same).


>  What happens when you drop
> and recreate that constraint for some reason, and someone else looks at
> the column in the middle of that operation?

I fail to see how this is different from doing an ALTER column operation?
If the change requires updating the data, the table records would be locked
during update in ALTER column
In check contraint case (if it changes the underlying data I would drop the
constraint, update my data, and readd the constraint
- the data would never be in an inconsistent state, if it doesn't change it,
it would do only a read lock and it wouldn't matter
if you were going by old or new constraint).

Besides really how often does this even happen for it to be a concern?

> That doesn't make too much sense  to me for PostgreSQL to do that (unless
I
> misunderstood what you meant).   Take the example of ST_Intersection where
> what comes out can be who the heck knows except you.

> That's more-or-less exactly the case which would be nice to handle, but
> I don't see it happens.  It's not like we don't know when returning from
> ST_intersection what the geometry of the result is.  If we didn't, how
> in the world could we sensibly return *anything*?  
Huh -- in many cases we don't know what type it is (it could be a
multipolygon when two polygons intersect etc).  So the best we can do is
ST_Intersection(....)::geometry(26986,2);
Which is a tad bit better than just knowing it's a geometry. In some cases
its mixed types.

However if I did something like this in a view

CREATE VIEW ...
	AS SELECT (ST_Dump(geom)).geom::geometry(POLYGON,26986,2)
		FROM bunch_of_mpolygons_polygons;

See the difference -- the user knows what's possible and PostgreSQL does not
without taking PostGIS 301.

> We don't know going
> in, of course, and right now PG *has* to know going in, but who knows,
> maybe that'll change some day.

> We already have *some* support for this kind of notion, through the
> ANYARRAY type.
Again not really to my knowledge -- you can use ANYARRAY as an input
argument, but not as an output type result.

So I can do somefunc(ANYARRAY) -> float  but it would never be
somefunc(ANYARRAY) -> ANYARRAY I punched in

Which is exactly what we have in PostGIS anyway

somefunc(any geomety) -> float or anygeometry which you can cast as you see
fit.
So actually in fact we have more flexibility with PostGIS since we would be
returning the same base type.
similar to 

always returing varchar instead of varchar(60)

What we don't have is typmod input checking which as I recall seeing in the
change log  I think Tom Lane is working on which would be useful for our
linestring only functions and so forth.

>> In fact pretty many of the functions we have are like that.  E.g
>> ST_Transform -- what comes out is the same geometry type that went in but
we
>> can't say ST_Transform returns POINT or whatever.

> Right, exactly, we have lots of situations like this and we have to try
> to cast them all today to what the caller *thinks* the answer is.  If
> the caller ends up trying to force a cast which doesn't match what
> st_intersection ends up returning, they're likely to get a suprising
> error..
Yap which I happen to think is WONDERFUL.  Think about it. I define a 
nice view which I predict should always return geometry(POINT,26986,2)

I run my view, it breaks, -- I know some EVIL person put something in my
table that they shouldn't 
have -- I go hunt down EVIL person and shoot them.

Or I could be wrong -- in which case I get a phone call from a troubled user
that things are not loading.
I look Ah view failed.  FIX, APOLOGIZE.  Works itself out wonderfully -- I
don't have some weird code that is only 
half working giving logical errors that are hard to troubleshoot and track
down.  I have a big swinging error that is screaming for help. I can deal
with that.  I can't deal with faint logical errors. 

My philosophy - when things don't work the way you require them to, they
should just BLOW UP in your face so you know and so you can TERMINATE WITH
EXTREME PREJUDICE :). 
Not any whimpering error where you got to go check your logs and the app
just happily ignores it.

>> For some we could like ST_Centroid.

> Not really..  There's a column issues here, based on the docs:
> the SRID matches that which goes in and couldn't be defined for the
> entire function, and it can return an empty geometrycollection.  Now, I
> don't really see why an empty geometrycollection should be returned, as
> I would think NULL or an error would be more appropriate, but if we want
> to exactly match what the current behavior is we'd have to deal with
> that.

> In PostGIS 2.0 we support typed empties so you could cast an empty
geometry collection validly to 
an empty point. I'll be able to try that once Paul makes his typmod change
instead of talking
about it in theoretical terms HINT HINT.

  So with that said -- I stand by ST_Centroid could always return a POINT or
NULL
in PostGIS 2.0 world at any rate, but we have the dangling issue that SRID
is not constant and dimension is possibly not constant
either which is hard to get around.

But in many cases if I get back a type I don't expect 
as my favorite Apocaplyse Now quote says - it should  "Be Terminated with
extreme prejudice." 

Typmod allows me to, if you will, have the equivalent of programming asserts
in my views.  I consider that a very good thing.


> Or that it always returns a specific type regardless of what went in.  

> Well, this we can do to some extent...  I would have expected that you
> could specify the typmod on a return type for a function, but I have to
> admit that I don't ever recall trying to.

The more I look at the functions we have in place, the less I think this
feature would actually be useful.
Because we need to be able to define wacky things like type, dimension, and
srid (that are not exact -- e.g. has to be same as arg 2
or has to be POLYGON when if this that)
I really think it's best that the user has to deal with it on a case by case
basis.  If they don't know enough
well, they'll learn or live with not being able to use views in QGIS. Which
is pretty much what we have today anyway
except instead of putting my assertions in a table that I need to remember
to update everytime I make a change to my view
and can get out of synch with my view, I define it as part of the definition
of my view (just one place to edit)
which is my golden dream and I think many people's as well.


Thanks,
Regina
http://www.postgis.us






More information about the postgis-devel mailing list