[gdal-dev] downsampling geotiff with a low-pass filter
Kay F. Jahnke
_kfj at yahoo.com
Sun Nov 18 02:49:02 PST 2018
On 17.11.18 13:46, Even Rouault wrote:
>
> Thanks a lot for the in-depth analysis.
You liked that? I'll give you more, and I hope I can get my points
across. It will be another long post.
First, I'm not really a DSP *expert*, but I'm doing my best to share my
limited understanding. It so happens, though, that I know a lot about b-splines:
https://bitbucket.org/kfj/vspline
>> The prefilter is a high-pass filter which
>> 'sharpens' the original signal by just so much that applying the
>> reconstruction filter will produce the original signal. The problem with
>> prefiltering is that it's done with an IIR filter and has theoretically
>> infinite support (usually it's enough to have a handful of support
>> either end, forgive my sloppy formulation), which makes margin treatment
>> and handling of no-data value a dicey issue. This is probably one reason
>> why prefiltering tends to be omitted.
>
> So there would be a need for high-pass prefilter after a low-pass one, before
> applying the resampling function ?
> How would that high-pass prefilter be implemented? Do you have
> references for that?
> Does that high-pass prefilter apply only to cubic bspline, or to other
> methods as well? Currently the code logic is common for Bicubic (Catmull-
> Rom), Bicubic-BSpline and Lanczos. They only differ by the kernel radius and
> the weighting function applied.
I see that this is confusing, so I'll try and clarify.
1. low-pass filtering
Before downsampling, high frequencies have to be removed from the signal
if they are present. This is a simple fact of digital signal processing.
You have to look at the signal, see if high frequencies are present, and
remove them from the signal before proceeding. This is done with a
low-pass filter and can only be omitted if the signal has no relevant
high-frequency content. And, as I said, if you have a good low-pass
filter, you may apply it anyway: if there are no high frequencies in the
signal, the low-pass filter should have no effect.
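As a quick, hedged illustration of that last point, assuming SciPy is
available (the filter and the test signal are arbitrary choices of mine,
not anything GDAL does):

    import numpy as np
    from scipy.signal import firwin, filtfilt

    kernel = firwin(31, 0.25)                   # a low-pass at 1/4 of Nyquist
    t = np.arange(1000)
    smooth = np.sin(2 * np.pi * t / 50)         # no high-frequency content
    filtered = filtfilt(kernel, [1.0], smooth)  # apply the low-pass anyway
    print(np.max(np.abs(filtered - smooth)))    # small, but not exactly zero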
This is the theory. In reality, though, low-pass filters are never truly
perfect, so applying them on principle will always degrade the
low-frequency content as well. An ideal low-pass filter would completely
remove all frequencies above a given limit (this part of the spectrum is
called the stop band) and pass all other frequencies unchanged (this is
called the pass band) - with no in-between (the in-between is called the
transition band). Real filters differ from this ideal, so usually the
transition band is not zero-length, stop-band attenuation is not
perfect, and there is some attenuation in the pass band as well.
So how can we approach an ideal low-pass? One approach to digital
filtering is convolution. It's well understood, easy to implement and
has some advantages, especially for specific classes of kernels. But
even with quite large kernels, it is hard to get near-ideal filter
characteristics. Much more effective than convolution is the use of
'IIR' filters. They tend to give very strong stop-band attenuation and
short transition bands. But they are hard to design for a given set of
requirements - there are many approaches, but it's all DSP black magic,
and most of it is beyond my horizon, so here we'd have to call for help
from the real experts.
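To give both routes a concrete shape, here is a minimal SciPy sketch;
the downsampling factor, the kernel length and the Butterworth design
are illustrative choices of mine, not recommendations:

    import numpy as np
    from scipy.signal import firwin, butter, sosfiltfilt

    factor = 4                         # intended downsampling factor
    cutoff = 1.0 / factor              # pass band ends at the new Nyquist

    # convolution route: windowed-sinc kernel; longer kernels give a
    # narrower transition band and better stop-band attenuation
    kernel = firwin(31, cutoff)
    signal = np.random.rand(1000)
    clean_fir = np.convolve(signal, kernel, mode='same')

    # IIR route: an 8th-order Butterworth; applied forward and backward
    # (sosfiltfilt) so the result has no phase shift
    sos = butter(8, cutoff, btype='low', output='sos')
    clean_iir = sosfiltfilt(sos, signal)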
Let's assume we have settled on a specific filter and applied it to the
signal - or we have found the signal has no significant high-frequency
content so we have omitted the filter. Let's call what we have now the
'clean' signal. We are now ready for the next step.
It's important to acknowledge that this initial low-pass filter is
needed. gdal does not apply a low-pass filter before downsampling and
requires users to know what they are doing: they have to apply the
filter themselves. Most users, however, will be unaware of this fact and
rely on gdal to 'do things right' - but it doesn't.
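To illustrate what 'applying the filter themselves' might look like
today, here is a hedged sketch using the GDAL Python bindings and SciPy;
the file names and the sigma value are placeholders, and the sigma would
have to be matched to the intended scale change:

    from osgeo import gdal
    from scipy.ndimage import gaussian_filter

    src = gdal.Open('input.tif')                # placeholder file name
    data = src.GetRasterBand(1).ReadAsArray().astype(float)

    # low-pass before downsampling; sigma = 1.5 is illustrative only
    smoothed = gaussian_filter(data, sigma=1.5)

    drv = gdal.GetDriverByName('GTiff')
    dst = drv.Create('smoothed.tif', src.RasterXSize, src.RasterYSize,
                     1, gdal.GDT_Float32)
    dst.SetGeoTransform(src.GetGeoTransform())
    dst.SetProjection(src.GetProjection())
    dst.GetRasterBand(1).WriteArray(smoothed)
    dst.FlushCache()
    # 'smoothed.tif' can then be handed to gdalwarp / gdal_translate
    # for the actual downsampling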
2. downsampling the 'clean' signal
There are simple cases of downsampling, where we needn't think much
about how to proceed. These cases occur when the downsampling can be
done by simply picking every other, every third etc. sample from the
clean signal. Doing so is certain not to 'miss' any part of the signal,
since the only things we could miss here are the high frequencies, which
are by definition absent from the clean signal. So this subsampling is
safe. This special type of subsampling is called decimation, and if one
can apply it, it's great because it's simple and mathematically correct.
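As a hedged sketch of what decimation looks like in code, assuming SciPy
(the factor of 4 is illustrative):

    import numpy as np
    from scipy.signal import decimate

    x = np.random.rand(1000)          # stand-in for a 'raw' 1D signal

    # decimate() low-pass filters first, then keeps every 4th sample
    y = decimate(x, 4, ftype='fir', zero_phase=True)

    # if the signal is already 'clean', picking samples is all we need:
    # y = x[::4]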
Most of the time, though, things are not so easy: the subsampling is by
a non-integer factor, or the points at which we want to pick up the
subsamples aren't precisely at the locations where the samples of the
input signal are located (in gdal terms, ULLR for input and output may
differ). So what needs to be done? We need to *interpolate*.
Interpolation is a guess about how the signal looks where we don't have
data. We only have data at the sampling locations; in between, we're
reduced to guesswork, but we can make informed guesses guided by
mathematical criteria, accepting that each way of guessing has its own
advantages and disadvantages. gdal offers quite a few interpolation
methods, starting with the very fast but signal-degrading
nearest-neighbour 'interpolation' and proceeding via linear
interpolation to more sophisticated techniques like b-splines.
What do we expect of an interpolator? It should reproduce the input when
it is evaluated at the sampling locations. This is called the
'interpolation criterion'. In between, it's informed guesswork, so it
depends on the specific interpolator. Obviously, nearest-neighbour
'interpolation' fulfills the interpolation criterion, as does linear
interpolation. b-splines also do, but you can't simply apply the
b-spline evaluation to the clean signal itself. b-spline mathematics
require that the input signal be 'prefiltered' to obtain a set of
'coefficients' which are then fed to the evaluation process. This
prefilter is a high-pass filter which has to be applied to the input
signal before the b-spline can be evaluated.
Let's stick with the cubic b-spline for now, and do a sample evaluation
at an integral position (so, at a sample location). Let the clean signal
be ... 0 , 1 , 0 ... and we'll evaluate right where the '1' is.
The b-spline reconstruction kernel (or, the set of weights at integral
positions) is 1/6, 2/3, 1/6. If we convolve, we get
0 * 1/6 + 1 * 2/3 + 0 * 1/6 = 2/3
You can see that the interpolation criterion is *not* fulfilled, because
we'd expect to see '1' as the result. Why is this? because we haven't
prefiltered. I'll not go into prefiltering now, but a *prefiltered*
signal at this position is
-0.464285714, 1.732142857, -0.464285714
and, voila, we get
(-0.464285714) * 1/6 + 1.732142857 * 2/3 + (-0.464285714) * 1/6 = 1
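For readers who want to reproduce this, here is a minimal sketch using
SciPy's b-spline prefilter (the generous zero padding just keeps
boundary handling away from the centre):

    import numpy as np
    from scipy.ndimage import spline_filter1d

    signal = np.zeros(21)
    signal[10] = 1.0               # ... 0, 1, 0 ... with generous padding

    # prefilter: turn the samples into cubic b-spline coefficients
    coeffs = spline_filter1d(signal, order=3, mode='mirror')
    print(coeffs[9:12])            # roughly -0.464  1.732  -0.464

    weights = np.array([1/6, 2/3, 1/6])
    print(weights @ signal[9:12])  # 2/3 - criterion violated
    print(weights @ coeffs[9:12])  # 1.0 - criterion fulfilled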
When the b-spline is used for interpolation, i.e. when it's evaluated at
locations which are not the sample locations, the maths are less simple:
the set of weights to apply to the coefficients has to be calculated,
which is what the gdal code you have pointed me to does (in
GWKBSpline4Values; it produces four weights - the three weights in my
example above are a special case for evaluation at integral positions).
But applying this set of weights to the unmodified clean signal will, as
in my little example, *not* reproduce the clean signal. Only forming the
weighted sum from the b-spline coefficients will do so. Using b-spline
evaluation on the input signal without prefiltering is, strictly
speaking, wrong.
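SciPy's map_coordinates makes the difference easy to see. I'm not
claiming SciPy's weighting is numerically identical to gdal's, only that
the prefilter on/off distinction is the same; the test signal and
positions are arbitrary:

    import numpy as np
    from scipy.ndimage import map_coordinates

    x = np.arange(16, dtype=float)
    signal = np.sin(0.4 * x)                 # some smooth test signal
    pos = np.array([5.0, 5.25, 5.5])         # evaluation positions

    # prefilter=True: coefficients are computed first (correct evaluation)
    good = map_coordinates(signal, [pos], order=3, mode='mirror',
                           prefilter=True)
    # prefilter=False: the weights act directly on the raw samples
    soft = map_coordinates(signal, [pos], order=3, mode='mirror',
                           prefilter=False)

    print(good[0], signal[5])   # equal: interpolation criterion fulfilled
    print(soft[0], signal[5])   # slightly smoothed: criterion violated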
So what happens if you do it nevertheless? Obviously you lose some of
the signal. And what you lose is some high-frequency content. The
prefiltering boosts high-frequency content to compensate for the effect,
and the boosting is fine-tuned to *precisely* counteract it, so that the
interpolation criterion is fulfilled. If we don't apply the prefilter (a
high-pass filter) before using the b-spline evaluation formula (a
low-pass filter), we effectively low-pass-filter the signal.
Back to our processing chain. We have the 'clean' signal which has been
stripped of high frequency content in step 1 above. Now we want to use
b-spline interpolation on it to get our target signal. So before we can
use b-spline evaluation, we need to prefilter the clean signal. This is
something we do to the clean signal, and it has to be done, because
otherwise we can't correctly interpolate. And this is what is missing in
gdal, where the prefiltering is not done. What gdal does instead is
apply a slight low-pass filter rather than interpolating precisely.
This low-pass has nothing to do with the low-pass in step 1; it simply
happens to be yet another effect which is also a low-pass. The effect is
mild and oftentimes won't be easy to spot - and oftentimes it's even
desired. It can even mask the problems arising from omitting step 1 above.
3. correct procedure
So, to finish this off, if you want to do everything right, how do you
do it?
- first apply a low-pass filter to the signal to get the 'clean' signal
- then interpolate over the clean signal
And, if you want to use b-spline interpolation over the clean signal,
you have to 'prefilter' it before using the b-spline evaluation formula.
Other interpolators do not require this extra step, but usually their
results are not as good.
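Putting the two steps together, here is a minimal sketch of the correct
procedure, assuming SciPy; the function name, the zoom factor and the
sigma are illustrative, and the sigma still has to be matched to the
scale change as discussed above:

    import numpy as np
    from scipy.ndimage import gaussian_filter, zoom

    def downsample(img, factor, sigma):
        # step 1: low-pass to get the 'clean' signal
        clean = gaussian_filter(img, sigma=sigma)
        # step 2: cubic b-spline interpolation; zoom() prefilters the
        # clean signal into coefficients before evaluating the spline
        return zoom(clean, factor, order=3, prefilter=True)

    small = downsample(np.random.rand(512, 512), 0.25, sigma=1.5)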
>>
>> 2.
>> I'd encourage GDAL to consider
>> adopting this method to avoid the worst of downsampling errors that come
>> with '-r average'.
>
> I somehow remember people mentioning this in the past too. (ah and "GDAL"
> would be an abstract concept, if not embodied by its contributors, like you
> potentially ;-) )
>
I'd like to help, but rather by explaining things. I'd not really like
to get involved in coding, as I have a lot of work on my hands already.
I do owe the gdal project a debt, though. I am creating extremely
detailed GPS maps of the Piedmont from government data using free
software only, and if it weren't for QGIS, gdal and GRASS, my work would
be very difficult indeed.
>>
>> 3.
>>
>> It's also possible to apply a low-pass filter no matter
>> if there is high-frequency content: if there is no high-frequency
>> content, a good low-pass filter has no effect.
>
> Would probably be faster from a computation point of view to just apply the
> filter than prior checking if it is needed. One point to have in mind is that
> for memory reasons, and also because users also want to process big images by
> chunks, GDAL operates by chunks and not on the whole image, so "infinite
> support" is at best limited to the source window being passed to the function.
Jukka's hint to use kernel-filtered sources in a VRT file does do the
trick and shows the way to go, but requires an intermediate file of the
same size as the initial input. The maths is already there. Doing the
step-1 low-pass this way is a reasonable choice, even if IIR filters may
perform better. If the process is done in chunks, you need to cut
overlapping chunks. The overlap has to be large enough to give support
for the calculation of the pixels at the edge of the actual tiles. When
using convolution, the size of the overlap has to be half the kernel
size, which is not too much. With IIR filters, you can choose how much
error you find acceptable and select the overlap size accordingly.
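Here is a hedged 1D sketch of the chunked approach for a convolution
kernel, assuming SciPy; the chunk size is arbitrary, and a real
implementation would of course work on 2D tiles:

    import numpy as np
    from scipy.ndimage import convolve1d

    def filter_in_chunks(x, kernel, chunk=256):
        x = np.asarray(x, dtype=float)
        half = len(kernel) // 2             # overlap: half the kernel size
        out = np.empty(len(x))
        for start in range(0, len(x), chunk):
            stop = min(start + chunk, len(x))
            lo = max(start - half, 0)       # read the chunk with margins
            hi = min(stop + half, len(x))
            piece = convolve1d(x[lo:hi], kernel, mode='mirror')
            out[start:stop] = piece[start - lo : start - lo + (stop - start)]
        return out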
>
>> What has to be understood here is that the filter has to match the
>> desired scale change. The kernel has to be produced by a process taking
>> the scale change into account, which can be done for a gaussian by
>> choosing an appropriate standard deviation.
>
> For a down-to-earth approach, any recommended formulas to apply? Does it
> depend only on the scaling factor?
Theoretically, it does depend only on the scaling factor. But as I have
outlined above, digital filters are not perfect. They have transition
bands, pass-band and stop-band ripple - so, in reality, it depends. I
know this is a frustrating answer, but this is a fact of life. In the
real world, you have to compromise and find a solution which is 'good
enough' for what you have in mind.
When using gaussians for low-pass filtering, there is a rule of thumb,
but just now I can't seem to find it, sorry. I'll try to dig it up; it
gives a standard deviation which will remove a 'significant portion' of
the frequencies above a given limit. Given that standard deviation, you
can truncate the gaussian and then normalize it. But there are many
different low-pass filters, each with its own special characteristics,
and gaussians aren't necessarily the ones best suited for all tasks.
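The 'truncate and normalize' step itself is straightforward; a minimal
sketch (the sigma value is only a placeholder, since the rule of thumb
for choosing it is exactly what's missing above):

    import numpy as np

    def gaussian_kernel(sigma, truncate=3.0):
        # cut the gaussian off at 'truncate' standard deviations
        radius = int(truncate * sigma + 0.5)
        x = np.arange(-radius, radius + 1)
        k = np.exp(-0.5 * (x / sigma) ** 2)
        return k / k.sum()            # normalize: weights sum to 1

    print(gaussian_kernel(1.5))       # sigma = 1.5 is just a placeholder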
Don't you have a DSP specialist in your community? It would be nice to
have someone else join the discussion. My DSP knowledge comes mainly
from image processing, so I may be missing fine points specific to geodata.
>
>>
>> Still with me? Thanks for your patience! I hope I have understood all
>> the source code and response posts correctly, don't hesitate to correct me.
>
> While we are at assessing correctness of the implementation, one point that
> has sometimes been raised is the potential impact of the "pixel-is-area"
> versus "pixel-is-point" convention. Typically an imaging sensor will reflect
> the number of photons collected over an area in a range of wave-length, so the
> value of the pixel applies to the whole area it covers. Other datasets might
> be the result of a variable evaluated at the center of the pixel. GDAL
> currently computes the distances to the pixel centre when computing the
> weights of the resampling function, so it would seem to me that it is the
> natural way for "pixel-is-point" datasets, and I'm wondering how good it is
> for "pixel-is-area" ones.
We could fill a whole new thread discussing this ;)
First, the sensor. When it comes to images, you are right: the sensor
has an area. But it's a smallish area, definitely smaller than the area
a pixel 'occupies' later on (because the sensor chip has other stuff on
its surface), the incoming signal is already low-pass filtered by the
lambda filter, and the light then goes through a color filter to
produce a bayer pattern which only yields 'pixel' values after a very
complex process of demosaicking. All of this is so complex that it's
more of an art than a science. But if you accept the set of pixels as a
starting point, the pixel-as-area metaphor is just as usable as the
pixel-as-point metaphor. If you use pixel-as-area and you subsample,
picking precisely the area which a target pixel would occupy in the
source image is a reasonable choice and much better than simply
averaging over all pixels-as-points which are nearer than a given
maximum distance. Now what you do with the data in this area is a
different matter. Averaging over the area is one possibility, but you
might do a weighted sum to get better results. Again, it's a matter of
what you want to achieve and what errors you can accept. There is no
silver bullet. It's a bit like the duality of light. Is it a wave or a
particle?
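As a minimal sketch of the plain 'average over the area' variant, for
the simple case of an integer factor and aligned grids (everything here,
including the factor, is illustrative):

    import numpy as np

    def area_average(img, factor):
        # treat each source pixel as an area that contributes fully to
        # exactly one target pixel; average each factor x factor block
        h = img.shape[0] - img.shape[0] % factor
        w = img.shape[1] - img.shape[1] % factor
        blocks = img[:h, :w].reshape(h // factor, factor,
                                     w // factor, factor)
        return blocks.mean(axis=(1, 3))

    small = area_average(np.random.rand(512, 512), 4)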
And when it comes to DEM data derived from LIDAR scans, which is what I
am currently working on, the pixel-as-area approach is perfectly valid:
the LIDAR scan produces a point cloud, which is submitted to binning, so
that all measurements that happen to fall in a given small rectangular
area are put together and averaged. Interpreting such a bin as something
which has an area over which the signal is uniform is, again, an
understandable choice, but it's not necessary. You might, for example,
use weights on the point measurements when binning them. Or use a median
filter, or a specific quartile, and so on, to get data for a specific
purpose. It's important that you understand what you're doing and pick
your tools accordingly. And this is much easier said than done.
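For the binning itself, a hedged sketch with SciPy; the grid size and
the random 'point cloud' are placeholders, and swapping 'mean' for
'median' or a quantile is exactly the kind of choice discussed above:

    import numpy as np
    from scipy.stats import binned_statistic_2d

    rng = np.random.default_rng(0)
    x = rng.uniform(0, 100, 10000)         # point-cloud coordinates
    y = rng.uniform(0, 100, 10000)
    z = rng.normal(500, 10, 10000)         # heights

    # bin into a 100 x 100 grid and aggregate per bin
    dem_mean = binned_statistic_2d(x, y, z, statistic='mean',
                                   bins=100).statistic
    dem_median = binned_statistic_2d(x, y, z, statistic='median',
                                     bins=100).statistic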
On top of that, when using other people's data, you often don't know
precisely how they have already 'messed' with the data. Pixel-as-area is
one interpretation which is often quite close to a good choice. But it's
important to understand that 'area' interpolation is, like every other
method, a digital filter and produces certain errors. If those errors
are acceptable for what you want to achieve, it's okay to use it.
So what's the upshot? If you are writing software for a large audience
of people who mostly don't understand the fine points, you want to
shield them from the worst mistakes by making those mistakes hard to
make. An automatic low-pass filter before downsampling is just such a
shield. Using area interpolation instead of the average over
contributing pixels would be another one. So having code in place to
perform these tasks is good, making them the default is sensible, and
allowing users to overrule the defaults is courteous to advanced users,
whom you don't want to put into a straitjacket of what you think is
'right'. And when b-splines are used without prefiltering the data, this
should be said in the documentation, because then, strictly speaking,
it's no longer an interpolation but produces a mild blur.
Kay