[postgis] ST_TRANSFORM: proposal for postgis projectiontranformation

Wed Dec 19 11:10:27 PST 2001

Frank Warmerdam wrote:
> First, I would like to suggest calling GDAL/OGR functions for doing the
> WKT->Proj.4 translation as an option.  This would avoid the need to have
> the proj4text column populated in advance at the cost of loading a
> substantially larger shared library (libgdal.1.1.so) in addition to the
> PROJ.4 one.  I am not sure how big an issue that is.  Note, there are
> C linkage entry points for OGRSpatialReference, and OGRCoordinateTransform.
> Once benefit of "buying into" OGR is that we can also piggyback on it's
> knowledge of how to translate between other coordinate system representations
> like EPSG codes,OGC XML, ESRI .prj format and so forth.  Of course, it might
> just be better to use an OGR based tool to help populate the SRID table
> and keep the direct dependencies of PostGIS minimal.

I hear what you're saying, and to a large extent I agree.  But, I dont 
really want to stick the entire GDAL distribution into
PostGIS/PostgreSQL.
On the other hand, having GDAL/OGR support would add a lot of
functionality 
and benefits to the larger Open Source GIS community.  
It would be nice to say to GDAL/OGR "Hay, I've got this projection, data
in
this location, and this new projection.  What do I do?"  Or, even
better,
"Here's some geometry and 2 srids, give me a new geometry."
I'm all for having GDAL/ORG doing working for me.  If all I have to
maintain
is the transform_geom(GEOMETRY,proj4text_source, proj4text_destination)
function, I'd be a happy boy.  Even better would be
tranform_geom(GEOMETRY,
PJ *proj_source, PJ *proj_dest).

> Second, an issue that isn't really adequately addressed in general by
> the SF_TRANSFORM(shape,newSRID) mechanism is that there could be more than
> one transformation that can take you from the source coordinate system
> to the destination one.  This isn't a huge issue in the short term, but
> will be an issue in the longer term with the need to match up with the
> OGC Coordinate Transformations model for coordinate transforms.
> 
> Examples of coordinate system pairs with multiple transformations is stuff
> like should the NTv1 or NTv2 datum shift tables be used.  What locally
> optimized 3 parameter transform should be used for a mapping between two
> datums.

Yes, these are things I dont really want to worry about - it would be 
best if I could just ask GDAL/OGR what to do.  But, one could still do
this
under the proposed system by creating 2 specific SRIDs, doing a NULL 
transform from the generic projection to the specific projection, then
doing
a real transform from the specific source SRID to the specific
destination
SRID, and another NULL transform from the specific destination SRID to
generic SRID.

ie. 
	Create 2 new rows in the spatial_ref_sys table with the
	   specific projection information (ie. what 3 or 7 parameter
	   datum shift to use).

	setSRID(
		TRANSFORM(
			setSRID(geometry, <SRID_specific_source>)
			,<SRID_specific_destination>
			)
		,<SRID_generic_desination>
		)

Yes, its ugly and difficult to understand, but it should work.

What are other system using to account for these multiple transformation
problems?

> Third, I would be willing to produce a table mapping all easily translatable
> 2D EPSG coordinate systems to WKT and PROJ.4 definitions.  Poke me when the
> need for these becomes pressing if I haven't already produced.

Okay.  Someone (maybe Paul Ramsey) has a big list of WKT projections and
their
corresponding EPSG codes.  I think proj4 ships with a big list of EPSG
codes
and proj4 strings.

> Fourth, I would advise against direct serialization of projPJ structures.
> The PROJ.4 definition string is pretty much the serialized form, and shouldn't
> be terribly expensive to translate back into a structure.  However, for
> reprojecting whole tables, it would be really desirable to be able to
> cache the projPJ handles for a little while.  Is there no way to cache them
> in a hash table and clean them up later when a transaction is complete?

Postgresql has a high degree of isolation between action that take place
between rows. The easiest way convey partial results is to use cacheable
functions.  I believe functions are cached only during the execution of 
a single statement (its subject to change in future releases - some
people
are advocating caching over an entire transaction).

A function is cacheable if it has no side effects, and the result is
always the
same if the function's arguments are the same.

Here's an example of "cheating" around the row isolation. In the example
below, I dont want to re-parse the WKT spheroid definition for every
row.

SELECT
length_spheroid(geometry_column,'SPHEROID["GRS_1980",6378137,298.257222101]')
 from geometry_table;

This example looks like it would parse the WKT for every row.  In fact,
it doesnt.

The length_spheroid() function is defined as:

CREATE FUNCTION length_spheroid(GEOMETRY,SPHEROID)
   RETURNS FLOAT8
   AS
'/data1/Refractions/Projects/PostGIS/work_dave/postgis/libpostgis.so.0.6','length_ellipsoid'
             LANGUAGE 'c'  with (isstrict);

Notice that the 2nd argument is of type SPHEROID not "text".

So, when postgresql sees a call to the length_spheroid(geometry, text)
is looks for
a function with types length_spheroid(geometry, text), and cannot find
it.  It finds the 
function length_spheroid(geometry, spheroid) and says to itself "can I
convert text to
spheroid?".  It then looks for a function SPHEROID(TEXT) - and finds
it.  

The WKT string is then converted to the datatype "SPHEROID" and that
conversion is cached.  
length_spheroid(geometry,spheroid) is called.

On the next row, it will used the cached version of SPHEROID(TEXT).

At least thats what I think it does.

In our case, transform(geometry,integer), is much more complex.
Eventually, we want to convert this to tranform(geometry, PJ*,PJ*) but
there
is no automatic way of changing the number of arguments to a function.

I'm probably going to have tranform(geometry,integer) be another
function 
(language: plpsql or sql) that will do something simple like:

   transform(geometry, integer) :-
	transform($1, get_proj4_from_srid( getSRID($1) ),
get_proj4_from_srid($2) )

the getSRID(geometry) will (probably) always produce the same SRID. 
That means
that the 1st get_proj4_from_srid() will only get called once since it
could cache
its results.  The 2nd get_proj4_from_srid() would, likewise, only get
called once.

This leaves us with what does get_proj4_from_srid() return.  If it
returns a 
string, tranform(geom,text,text) would have to parse the proj4
definition string
for each row.
On the other hand, if we create a new type (say PROJ4_PJ) thats a
serialized form
of the PJ* structure, we can have get_proj4_from_srid() return a
PROJ4_PJ and
have tranform(geometry, PROJ4_PJ, PROJ4_PJ) de-serialize it (instead of
parsing a string).  

There maybe other fancy-pants ways of storing partial results (that dont
move around
in memory) during a single SQL statement, but they would be pretty
magical and involve
using the postgresql shared memory heap.  Not something I really want to
do.

Hope I'm making sense,

dave

------------------------ Yahoo! Groups Sponsor ---------------------~-->
Call any Phone in the World from your PC with CrystalVoice
-LOW rates world-wide - $0.039/min in U.S.
FREE trial. Click here.
http://us.click.yahoo.com/Ib1xVB/IxbDAA/ySSFAA/PhFolB/TM
---------------------------------------------------------------------~->

To unsubscribe from this group, send an email to:
postgis-unsubscribe at yahoogroups.com

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/