[gdal-dev] downsampling geotiff with a low-pass filter
Kay F. Jahnke
_kfj at yahoo.com
Wed Nov 21 01:22:36 PST 2018
On 16.11.18 13:13, jratike80 wrote:
>
> If a low-pass filter is needed before passing data to downsampling, I wonder
> if it could be done by utilizing a VRT with KernelFilteredSource
> https://www.gdal.org/gdal_vrttut.html. May well be that it is not relevant
> at all, my understanding of mathematics coming from agricultural background.
Jukka, thanks for your hint. Actually this is a very good idea, and I
have tried applying it to my data, but I have some issues, so I thought
I'd report back with my findings.
Using convolution for a low-pass filter is not the most efficient way,
but it's not hard to set up and does work okay. To get a suitable
filter, an easy approach is to use 'binomial filters'. They can be
derived from Pascal's triangle (see
https://en.wikipedia.org/wiki/Pascal%27s_triangle) by taking a row and
dividing it by the sum of all numbers in that row. You want to pick a row
with an odd number of entries, like 1 2 1 or 1 4 6 4 1, so (counting from
one) any odd-numbered row will do. The rule of thumb is that 1 2 1
approximates a Gaussian with a variance of 0.5 (standard deviation ~0.7),
and with every two rows you go down, the variance increases by another 0.5,
so 1 4 6 4 1 corresponds to a standard deviation of 1. Another rule of
thumb says that to remove a 'significant' portion of the high-frequency
content of a signal, you want a standard deviation of about 1. In fact the
method is used to create 'image pyramids', where an image is
low-pass-filtered and then scaled down, which is pretty much what I need
to do with my DEM data. I won't go into the mathematical details; instead
I'll describe precisely what I did for my concrete use case of scaling
down from 2 m to 5 m resolution, and if anyone has good suggestions or
criticism, please come forward!
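For anyone who wants to check those numbers, here is a minimal Python
sketch (numpy assumed; the helper name is just for illustration) that
builds the normalized kernels and computes their standard deviations:

import numpy as np

def binomial_kernel(n_coefs):
    # build a row of Pascal's triangle by repeated convolution with (1 1)
    row = np.array([1.0])
    for _ in range(n_coefs - 1):
        row = np.convolve(row, [1.0, 1.0])
    return row / row.sum()            # normalize so the coefficients sum to 1

for n in (3, 5, 7):
    k = binomial_kernel(n)
    x = np.arange(n) - n // 2         # offsets relative to the centre tap
    sigma = np.sqrt(np.sum(k * x**2)) # standard deviation of the kernel
    print(n, k, round(sigma, 3))
# prints sigma ~0.707 for 1 2 1, 1.0 for 1 4 6 4 1, ~1.225 for 1 6 15 20 15 6 1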
1. Finding the right convolution kernel
Since all of this is a bit rule-of-thumb, I decided to use a binomial
filter with a variance of 1.5 (standard deviation ~1.2). So I picked the
seventh row of Pascal's triangle:
1 6 15 20 15 6 1
which I divided by 64, the sum of these numbers, obtaining
.015625 .09375 .234375 .3125 .234375 .09375 .015625
This would have been right for the second variant of a kernel-filtered
source in the VRT tutorial, but I don't have GDAL 2.3 yet and I don't want
to mess with my system, so I formed the outer product of the row with
itself and came up with this kernel in the VRT file:
<Kernel normalized="1">
<Size>7</Size>
<Coefs>
  0.000244141 0.00146484 0.00366211 0.00488281 0.00366211 0.00146484 0.000244141
  0.00146484 0.00878906 0.0219727 0.0292969 0.0219727 0.00878906 0.00146484
  0.00366211 0.0219727 0.0549316 0.0732422 0.0549316 0.0219727 0.00366211
  0.00488281 0.0292969 0.0732422 0.0976562 0.0732422 0.0292969 0.00488281
  0.00366211 0.0219727 0.0549316 0.0732422 0.0549316 0.0219727 0.00366211
  0.00146484 0.00878906 0.0219727 0.0292969 0.0219727 0.00878906 0.00146484
  0.000244141 0.00146484 0.00366211 0.00488281 0.00366211 0.00146484 0.000244141
</Coefs>
</Kernel>
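In case anyone wants to reproduce those 49 coefficients without typing
them by hand, here is a minimal Python sketch (numpy assumed) that
generates them as the outer product of the 1D row with itself:

import numpy as np

row = np.array([1, 6, 15, 20, 15, 6, 1], dtype=float)
row /= row.sum()                 # the normalized 1D kernel from above
kernel2d = np.outer(row, row)    # separable 2D kernel: k[i, j] = row[i] * row[j]
print(' '.join('%g' % c for c in kernel2d.ravel()))   # paste into <Coefs>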
This does work on the original data; my reading of the documentation,
where it says the kernel is 'not applied to sub-sampled or over-sampled
data', was probably a misunderstanding on my part - I think it refers to
the VRT file's source files, not to its output, which is what I want to
subsample.
2. Subsampling
I decided to perform the subsampling with -r cubicspline which, since the
prefilter is missing, does not interpolate precisely but only adds a
little extra smoothing - which I don't mind, since I only want contour
lines. So, with the VRT file at hand, I did
gdalwarp -tr 5 5 -r cubicspline in.vrt out.tif
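The same warp can presumably also be scripted through the GDAL Python
bindings; a minimal sketch (osgeo assumed, file names as above):

from osgeo import gdal

gdal.UseExceptions()
gdal.Warp('out.tif', 'in.vrt',
          xRes=5, yRes=5,               # 5 m target resolution, as with -tr 5 5
          resampleAlg='cubicspline')    # same as -r cubicspline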
3. Looking at the result
The resulting data are indeed nicely low-pass-filtered, but I do have a
problem here: My source data are regional DEMs, and such data are often
produced so that they end at the border of whatever administrative entity
they cover. To be concrete, I was working on the 2 m DEM of the Valle
d'Aosta, which is cut into 914 little tiles; the tiles on the border are
cut off at the boundary of the autonomous region, the remainder of each
tile being filled with a no-data value. gdalwarping the kernel-filtered
VRT seems to result, internally, in the no-data part being replaced with
zero, and then zeros and valid values are filtered alike. The result is a
'gray zone' around the borders, where the convolution picks up both zeros
from the 'unknown territory' and valid data. This is a problem, because it
means that a margin of six pixels is now wrong (it has been 'tainted' with
the zeroes outside). I'd have to cut that margin off - but I have no easy
way of doing so.
The most straightforward solution I have come up with is flattening the
data, feeding them to an image processing program, doing several erosions,
and using the result as a mask - a rough sketch of that masking step
follows below. That's 'okay' for a one-off, but really, I'd like to see
better treatment of no-data values.
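In Python, the masking step could look roughly like this (numpy, scipy and
the GDAL bindings assumed; the file names and the erosion radius of 3
pixels for a 7x7 kernel are just placeholders):

import numpy as np
from osgeo import gdal
from scipy.ndimage import binary_erosion

# read the filtered band and its no-data value
ds = gdal.Open('filtered.tif')
band = ds.GetRasterBand(1)
nodata = band.GetNoDataValue()
data = band.ReadAsArray().astype(float)

# shrink the valid-data mask by the kernel radius and push the
# 'tainted' margin back to no-data
valid = data != nodata
shrunk = binary_erosion(valid, iterations=3)
data[~shrunk] = nodata

# write the masked copy
out = gdal.GetDriverByName('GTiff').CreateCopy('masked.tif', ds)
out.GetRasterBand(1).WriteArray(data)
out.FlushCache()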
I wouldn't mind if the results around the edges were slightly less precise
than further inside. I think the kernel-filtered source could be
interpreted like this:
If the target point coincides with only no-data values in the source,
produce a no-data value.
Otherwise, if its source pixels (those used in the convolution) are partly
no-data, take zero instead of no-data, but re-normalize the kernel over
the valid pixels.
So in 1D, if we have a kernel of (1/4 1/2 1/4) and participating source
data of (no-data 3 4), we'd use a kernel of (1/3 2/3 1/3) - the
re-normalization divides by the sum of the weights over the valid pixels
(3/4), which amounts to first multiplying by four and then dividing by
three. We get 0 * 1/3 + 3 * 2/3 + 4 * 1/3 = 3.33 as the result,
effectively ignoring the no-data value but 'pulling up' the result.
Maybe this can be done anyway, but so far I haven't found parameters to
do so.
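Here is a minimal 1D Python sketch (numpy assumed; the function name and
the -9999 no-data value are just placeholders) of the behaviour I have in
mind - weights of no-data samples are dropped and the remaining weights
are re-normalized:

import numpy as np

def nodata_aware_convolve(data, kernel, nodata):
    half = len(kernel) // 2
    out = np.full(len(data), nodata, dtype=float)
    for i in range(len(data)):
        acc, wsum = 0.0, 0.0
        for k in range(-half, half + 1):
            j = i + k
            if 0 <= j < len(data) and data[j] != nodata:
                acc += kernel[k + half] * data[j]
                wsum += kernel[k + half]
        if wsum > 0:                  # at least one valid neighbour
            out[i] = acc / wsum       # re-normalize by the valid weight
    return out

# the example from above: kernel (1/4 1/2 1/4), data (no-data 3 4)
print(nodata_aware_convolve(np.array([-9999., 3., 4.]),
                            [0.25, 0.5, 0.25], -9999.))
# the centre value is the 10/3 (~3.33) from the example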
The downsampling step seems to do such calculations automatically - the
margin of the valid data does not look suspicious when downsampling the
original data without first applying the low-pass. So the problematic
step is the application of the kernel near the margin of the valid data
- maybe this is already fixed in GDAL 2.3?
So much for my use of a kernel-filtered source in a VRT file, comments
are of course welcome.
Kay