[Gdal-dev] Cyclic projection and error propagation

Frank Warmerdam warmerdam at pobox.com
Mon Mar 22 11:34:12 EST 2004


Matt Wilkie wrote:
>   Repeated Projection and Error Propagation
> 
>     * How much error is involved in projection?
>     * How many projection cycles are "safe"?
>     * Which software packages have the lowest error propagation rate?
>     * Are the answers different for raster and vector data?
> 
> For example, one of our data providers stores their data natively in UTM. 
> However they only distribute data in geographic decimal degrees. So when 
> they send data to us they project it first. Now of course when we 
> receive the data the first thing we do is reproject it into Albers to 
> match the rest of our data.
> 
> So before we do any analysis at all, the data has been projected twice,
> passing through three different coordinate systems. Chances are that when
> *our* upstream provider was compiling the data, they started with data
> from an existing source, which was likely in a different coordinate system.

Matt,

I am sure there are formal descriptions of the information loss that can
occur depending on the characteristics of the reprojection/resampling
steps, but I don't have a strong enough theoretical background to point
them out.

However, the general answer is that with every reprojection/resampling
step there will be some data degradation, and the "damage" will be
affected by a number of factors, including the resampling kernel used
and the information density of the image.

The higher the information density of your data, the more damage will
result from a resampling step.  So, for instance, if you had DEM data
stored at 100 m resolution, and it resulted from actually going out in
the field and sampling every 100 m of rough terrain with a GPS, then you
would suffer a substantial loss of information reprojecting at the same
resolution into another projection.  However, if you work with a 10 m
DEM derived by some method from samples taken every 100 m, then a
reprojection step will cause relatively little data loss.  Of course, if
your data area was also quite uniform, then the data loss would be
relatively small.

The same issues would presumably apply to spectral image data.
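
If you wanted to get a feel for the damage on your own data, one crude
check is to round-trip a raster through another projection and back onto
its original grid, and look at how much the values change.  The sketch
below is purely illustrative: it assumes a GDAL Python binding recent
enough to provide gdal.Warp, and the filenames and the Albers EPSG code
are only placeholders.

from osgeo import gdal
import numpy as np

src = gdal.Open("dem_utm.tif")                # hypothetical 100 m DEM
gt = src.GetGeoTransform()
xmin, ymax = gt[0], gt[3]
xmax = xmin + gt[1] * src.RasterXSize
ymin = ymax + gt[5] * src.RasterYSize

# Warp to an Albers projection (EPSG:3005 used purely as an example),
# then back onto exactly the original grid, with cubic resampling.
albers = gdal.Warp("/vsimem/albers.tif", src,
                   dstSRS="EPSG:3005", resampleAlg="cubic")
back = gdal.Warp("/vsimem/back.tif", albers,
                 dstSRS=src.GetProjectionRef(),
                 outputBounds=(xmin, ymin, xmax, ymax),
                 width=src.RasterXSize, height=src.RasterYSize,
                 resampleAlg="cubic")

orig = src.GetRasterBand(1).ReadAsArray().astype(float)
rt = back.GetRasterBand(1).ReadAsArray().astype(float)
print("mean absolute change after one round trip:", np.abs(rt - orig).mean())

Running that on a rough DEM and on a smooth one should show the
dependence on information density fairly directly.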

When taking information density and loss into account, you should also
only worry about the information that matters to you.  If you have
airphotos with 100 cm resolution that show quite a bit of detail of
trees and other fine features, but you only want to use the data for
rough mapping of transportation networks, then your features of interest
effectively have a resolution of a couple of meters, and some smearing
of tree features wouldn't matter much.

So, in general:
  o try to minimize the number of resampling steps (see the sketch after
    this list).
  o consider working internally at a higher resolution than the source
    data.
  o don't worry too much if the source data is effectively low resolution
    anyway (as might be the case with many DEMs produced in remote areas).
  o don't worry too much if the source data is already much higher
    resolution than you need for your purposes.
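
As a sketch of the first two points (hypothetical filenames again, and
EPSG:3005 standing in for whatever Albers definition you actually use):
if you could get hold of the provider's original UTM raster, you could
warp it to Albers in a single pass, optionally onto a grid finer than
the source, rather than letting it go UTM -> geographic -> Albers in
separate resampling passes.

from osgeo import gdal

# One resampling step from the native UTM data straight to Albers, onto
# a grid finer than a 100 m source, instead of two separate warps.
gdal.Warp("data_albers.tif", "data_utm.tif",
          dstSRS="EPSG:3005",
          xRes=50.0, yRes=50.0,
          resampleAlg="cubic")

Each extra warp resamples the pixels again, so collapsing the two
reprojections into one avoids a round of degradation.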

Note, my comments are mostly based on common sense, and not a strong
theoretical background, or even much real experience in production
environments.  So take it mostly as an "in my humble opinion".

>     * Proj4 <http://remotesensing.org/proj/> mailing list (appears to
>       have been spammed into oblivion)

The PROJ.4 list is active. It is mostly the archive that has been spammed
into oblivion as far as I can tell.  However, I didn't see your message there.

Best regards,
-- 
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up   | Frank Warmerdam, warmerdam at pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush    | Geospatial Programmer for Rent



