[gdal-dev] How do GCP's in a VRT file work?

Wed Nov 29 07:37:11 PST 2023

Joe Lovick via gdal-dev <gdal-dev at lists.osgeo.org> writes:

> Could someone explain how GCP's placed into a VRT file work, i was
> expecting a linear/bi-linear, or polynomial fit between the points for
> the simplest EPSG:4326 projection.however that is not what i see,
> adjusting a single point, effects the projection of all corner points.

(I am a newbie at VRT but find imagery registration interesting.)

Before I started reading docs, I had a different expectation from you,
which is that the input and output were conformal projections, and that
there was a single translate/rotate/scale transformation estimated,
probably via least squares, from all GCPs.

I looked at:

  https://gdal.org/drivers/raster/vrt.html

where it says

    GeoTransform: This element contains a six value affine
    geotransformation for the dataset, mapping between pixel/line
    coordinates and georeferenced coordinates. Typically
    (geotransform[0], geotransform[3]) will be the (easting, northing)
    of the upper-left corner of the raster, geotransform[1] the
    horizontal resolution in geospatial coordinates/pixel, and
    geotransform[5] the vertical resolution in geospatial
    coordinates/pixel, as a negative value if the image is north-up
    oriented. See Affine GeoTransform for more details about that
    mapping.

        <GeoTransform>440720.0,  60,  0.0,  3751320.0,  0.0, -60.0</GeoTransform>

    GCPList: This element contains a list of Ground Control Points for
    the dataset, mapping between pixel/line coordinates and
    georeferenced coordinates. The Projection attribute should contain
    the SRS of the georeferenced coordinates in the same format as the
    SRS element. The dataAxisToSRSAxisMapping attribute is the same as
    in the SRS element.

        <GCPList Projection="EPSG:4326">
            <GCP Id="1" Info="a" Pixel="0.5" Line="0.5" X="0.0" Y="0.0" Z="0.0" />
            <GCP Id="2" Info="b" Pixel="13.5" Line="23.5" X="1.0" Y="2.0" Z="0.0" />
        </GCPList>

which indicates that there are different scales for image-up anad
image-right, and leaving me confused about [2] and [4].  As I was
writing this, it came to me that they are the off-diagonal rotation
terms.I think this is a translation (0/3) and a matrix multiply from
pixel coords.

map_coords = [0  + [image_east, image_north] * [ 1 2
              3]                                 3 4]

is now my understanding.  So 2 and 3 are zero if the image's easting is
equal to the x axis of the projected coordinate system.

Thus, after reading, I still expect that GCPList is processed via least
squares to find a GeoTransform translation and
scale/rotation/slightly-more matrix.

The basic point is that this process is starting from a belief that the
imagery is consistent and needs only translation, scaling, and rotation.
That's not quite true, as a general matrix multiply is looser than a
rotation and scale, having 4 variables vs 2.