[gdal-dev] downsampling geotiff with a low-pass filter
Kay F. Jahnke
_kfj at yahoo.com
Sun Nov 18 02:49:02 PST 2018
On 17.11.18 13:46, Even Rouault wrote:
>
> Thanks a lot for the in-depth analysis.
You liked that? I'll give you more, and I hope I can get my points
across. It will be another long post.
First, I'm not really a DSP *expert*, but I'm doing my best to share my
limited understanding. It so happens, though, that I know a lot about b-splines:
https://bitbucket.org/kfj/vspline
>> The prefilter is a high-pass filter which
>> 'sharpens' the original signal by just so much that applying the
>> reconstruction filter will produce the original signal. The problem with
>> prefiltering is that it's done with an IIR filter and has theoretically
>> infinite support (usually it's enough to have a handful of support
>> either end, forgive my sloppy formulation), which makes margin treatment
>> and handling of no-data value a dicey issue. This is probably one reason
>> why prefiltering tends to be omitted.
>
> So there would be a need for high-pass prefilter after a low-pass one, before
> applying the resampling function ?
> How would that high-pass prefilter be implemented? Do you have
> references for that?
> Does that high-pass prefilter apply only to cubic bspline, or to other
> methods as well? Currently the code logic is common for Bicubic (Catmull-
> Rom), Bicubic-BSpline and Lanczos. They only differ by the kernel radius and
> the weighting function applied.
I see that this is confusing, so I'll try and clarify.
1. low-pass filtering
Before downsampling, high frequencies have to be removed from the signal
if they are present. This is a simple fact of digital signal processing.
You have to look at the signal, see if high frequencies are present, and
remove them from the signal before proceeding. This is done with a
low-pass filter and can only be omitted if the signal has no relevant
high-frequency content. And, as I said, if you have a good low-pass
filter, you may apply it anyway: if there are no high frequencies in the
signal, the low-pass filter should have no effect.
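As a quick, hedged illustration of that last point, assuming SciPy is
available (the filter and the test signal are arbitrary choices of mine,
not anything GDAL does):

    import numpy as np
    from scipy.signal import firwin, filtfilt

    kernel = firwin(31, 0.25)                   # a low-pass at 1/4 of Nyquist
    t = np.arange(1000)
    smooth = np.sin(2 * np.pi * t / 50)         # no high-frequency content
    filtered = filtfilt(kernel, [1.0], smooth)  # apply the low-pass anyway
    print(np.max(np.abs(filtered - smooth)))    # small, but not exactly zero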
This is the theory. In reality, though, low-pass filters are never truly
perfect, so applying them on principle will always degrade the
low-frequency content as well. An ideal low-pass filter would completely
remove all frequencies above a given limit (this part of the spectrum is
called the stop band) and pass all other frequencies unchanged (this is
called the pass band) - with no in-between (the in-between is called the
transition band). Real filters differ from this ideal, so usually the
transition band is not zero-length, stop-band attenuation is not
perfect, and there is some attenuation in the pass band as well.
So how can we approach an ideal low-pass? One approach to digital
filtering is convolution. It's well understood, easy to implement and
has some advantages, especially for specific classes of kernels. But
even with quite large kernels, it is hard to get near-ideal filter
characteristics. Much more effective than convolution is the use of
'IIR' filters. They tend to give very strong stop-band attenuation and
short transition bands. But they are hard to design for a given set of
requirements - there are many approaches, but it's all DSP black magic,
and most of it is beyond my horizon, so here we'd have to call for help
from the real experts.
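To give both routes a concrete shape, here is a minimal SciPy sketch;
the downsampling factor, the kernel length and the Butterworth design
are illustrative choices of mine, not recommendations:

    import numpy as np
    from scipy.signal import firwin, butter, sosfiltfilt

    factor = 4                         # intended downsampling factor
    cutoff = 1.0 / factor              # pass band ends at the new Nyquist

    # convolution route: windowed-sinc kernel; longer kernels give a
    # narrower transition band and better stop-band attenuation
    kernel = firwin(31, cutoff)
    signal = np.random.rand(1000)
    clean_fir = np.convolve(signal, kernel, mode='same')

    # IIR route: an 8th-order Butterworth; applied forward and backward
    # (sosfiltfilt) so the result has no phase shift
    sos = butter(8, cutoff, btype='low', output='sos')
    clean_iir = sosfiltfilt(sos, signal)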
Let's assume we have settled on a specific filter and applied it to the
signal - or we have found the signal has no significant high-frequency
content so we have omitted the filter. Let's call what we have now the
'clean' signal. We are now ready for the next step.
It's important to acknowledge that this initial low-pass filter is
needed. gdal does not apply a low-pass filter before downsampling and
requires users to know what they are doing: they have to apply the
filter themselves. Most users, however, will be unaware of this fact and
rely on gdal to 'do things right' - but it doesn't.
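To illustrate what 'applying the filter themselves' might look like
today, here is a hedged sketch using the GDAL Python bindings and SciPy;
the file names and the sigma value are placeholders, and the sigma would
have to be matched to the intended scale change:

    from osgeo import gdal
    from scipy.ndimage import gaussian_filter

    src = gdal.Open('input.tif')                # placeholder file name
    data = src.GetRasterBand(1).ReadAsArray().astype(float)

    # low-pass before downsampling; sigma = 1.5 is illustrative only
    smoothed = gaussian_filter(data, sigma=1.5)

    drv = gdal.GetDriverByName('GTiff')
    dst = drv.Create('smoothed.tif', src.RasterXSize, src.RasterYSize,
                     1, gdal.GDT_Float32)
    dst.SetGeoTransform(src.GetGeoTransform())
    dst.SetProjection(src.GetProjection())
    dst.GetRasterBand(1).WriteArray(smoothed)
    dst.FlushCache()
    # 'smoothed.tif' can then be handed to gdalwarp / gdal_translate
    # for the actual downsampling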
2. downsampling the 'clean' signal
There are simple cases of downsampling, where we needn't think much
about how to proceed. These cases occur when the downsampling can be
done by simply picking every other, every third etc. sample from the
clean signal. Doing so is certain not to 'miss' any part of the signal,
since the only things we could miss here are the high frequencies, which
are by definition absent from the clean signal. So this subsampling is
safe. This special type of subsampling is called decimation, and if one
can apply it, it's great because it's simple and mathematically correct.
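As a hedged sketch of what decimation looks like in code, assuming SciPy
(the factor of 4 is illustrative):

    import numpy as np
    from scipy.signal import decimate

    x = np.random.rand(1000)          # stand-in for a 'raw' 1D signal

    # decimate() low-pass filters first, then keeps every 4th sample
    y = decimate(x, 4, ftype='fir', zero_phase=True)

    # if the signal is already 'clean', picking samples is all we need:
    # y = x[::4]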
Most of the time, though, things are not so easy: the subsampling is by
a non-integer factor, or the points at which we want to pick up the
subsamples aren't precisely at the locations where the samples of the
input signal are located (in gdal terms, ULLR for input and output may
differ). So what needs to be done? We need to *interpolate*.
Interpolation is a guess about how the signal looks where we don't have
data. We only have data at the sampling locations; in between, we're
reduced to guesswork, but we can make informed guesses guided by
mathematical criteria, accepting that each way of guessing has its own
advantages and disadvantages. gdal offers quite a few interpolation
methods, starting with the very fast but signal-degrading
nearest-neighbour 'interpolation' and proceeding via linear
interpolation to more sophisticated techniques like b-splines.
What do we expect of an interpolator? It should reproduce the input when
it is evaluated at the sampling locations. This is called the
'interpolation criterion'. In between, it's informed guesswork, so it
depends on the specific interpolator. Obviously, nearest-neighbour
'interpolation' fulfills the interpolation criterion, as does linear
interpolation. b-splines also do, but you can't simply apply the
b-spline evaluation to the clean signal itself. b-spline mathematics
require that the input signal be 'prefiltered' to obtain a set of
'coefficients' which are then fed to the evaluation process. This
prefilter is a high-pass filter which has to be applied to the input
signal before the b-spline can be evaluated.
Let's stick with the cubic b-spline for now, and do a sample evaluation
at an integral position (so, at a sample location). Let the clean signal
be ... 0 , 1 , 0 ... and we'll evaluate right where the '1' is.
The b-spline reconstruction kernel (or, the set of weights at integral
positions) is 1/6, 2/3, 1/6. If we convolve, we get
0 * 1/6 + 1 * 2/3 + 0 * 1/6 = 2/3
You can see that the interpolation criterion is *not* fulfilled, because
we'd expect to see '1' as the result. Why is this? because we haven't
prefiltered. I'll not go into prefiltering now, but a *prefiltered*
signal at this position is
-0.464285714, 1.732142857, -0.464285714
and, voila, we get
(-0.464285714) * 1/6 + 1.732142857 * 2/3 + (-0.464285714) * 1/6 = 1
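For readers who want to reproduce this, here is a minimal sketch using
SciPy's b-spline prefilter (the generous zero padding just keeps
boundary handling away from the centre):

    import numpy as np
    from scipy.ndimage import spline_filter1d

    signal = np.zeros(21)
    signal[10] = 1.0               # ... 0, 1, 0 ... with generous padding

    # prefilter: turn the samples into cubic b-spline coefficients
    coeffs = spline_filter1d(signal, order=3, mode='mirror')
    print(coeffs[9:12])            # roughly -0.464  1.732  -0.464

    weights = np.array([1/6, 2/3, 1/6])
    print(weights @ signal[9:12])  # 2/3 - criterion violated
    print(weights @ coeffs[9:12])  # 1.0 - criterion fulfilled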
When the b-spline is used for interpolation, i.e. when it's evaluated at
locations which are not the sample locations, the maths are less simple:
the set of weights to apply to the coefficients has to be calculated,
which is what the gdal code you have pointed me to does (in
GWKBSpline4Values; it produces four weights - the three weights in my
example above are a special case for evaluation at integral positions).
But applying this set of weights to the unmodified clean signal will, as
in my little example, *not* reproduce the clean signal. Only forming the
weighted sum from the b-spline coefficients will do so. Using b-spline
evaluation on the input signal without prefiltering is, strictly
speaking, wrong.
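SciPy's map_coordinates makes the difference easy to see. I'm not
claiming SciPy's weighting is numerically identical to gdal's, only that
the prefilter on/off distinction is the same; the test signal and
positions are arbitrary:

    import numpy as np
    from scipy.ndimage import map_coordinates

    x = np.arange(16, dtype=float)
    signal = np.sin(0.4 * x)                 # some smooth test signal
    pos = np.array([5.0, 5.25, 5.5])         # evaluation positions

    # prefilter=True: coefficients are computed first (correct evaluation)
    good = map_coordinates(signal, [pos], order=3, mode='mirror',
                           prefilter=True)
    # prefilter=False: the weights act directly on the raw samples
    soft = map_coordinates(signal, [pos], order=3, mode='mirror',
                           prefilter=False)

    print(good[0], signal[5])   # equal: interpolation criterion fulfilled
    print(soft[0], signal[5])   # slightly smoothed: criterion violated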
So what happens if you do it nevertheless? Obviously you lose some of
the signal. And what you lose is some high-frequency content. The
prefiltering boosts high-frequency content to compensate for the effect,
and the boosting is fine-tuned to *precisely* counteract it, so that the
interpolation criterion is fulfilled. If we don't apply the prefilter (a
high-pass filter) before using the b-spline evaluation formula (a
low-pass filter), we effectively low-pass-filter the signal.
Back to our processing chain. We have the 'clean' signal which has been
stripped of high frequency content in step 1 above. Now we want to use
b-spline interpolation on it to get our target signal. So before we can
use b-spline evaluation, we need to prefilter the clean signal. This is
something we do to the clean signal, and it has to be done, because
otherwise we can't correctly interpolate. And this is what is missing in
gdal, where the prefiltering is not done. What gdal does instead is
apply a slight low-pass filter rather than interpolating precisely.
This low-pass has nothing to do with the low-pass in step 1; it simply
happens to be yet another effect which is also a low-pass. The effect is
mild and oftentimes won't be easy to spot - and oftentimes it's even
desired. It can even mask the problems arising from omitting step 1 above.
3. correct procedure
So, to finish this off, if you want to do everything right, how do you
do it?
- first apply a low-pass filter to the signal to get the 'clean' signal
- then interpolate over the clean signal
And, if you want to use b-spline interpolation over the clean signal,
you have to 'prefilter' it before using the b-spline evaluation formula.
Other interpolators do not require this extra step, but usually their
results are not as good.
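Putting the two steps together, here is a minimal sketch of the correct
procedure, assuming SciPy; the function name, the zoom factor and the
sigma are illustrative, and the sigma still has to be matched to the
scale change as discussed above:

    import numpy as np
    from scipy.ndimage import gaussian_filter, zoom

    def downsample(img, factor, sigma):
        # step 1: low-pass to get the 'clean' signal
        clean = gaussian_filter(img, sigma=sigma)
        # step 2: cubic b-spline interpolation; zoom() prefilters the
        # clean signal into coefficients before evaluating the spline
        return zoom(clean, factor, order=3, prefilter=True)

    small = downsample(np.random.rand(512, 512), 0.25, sigma=1.5)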
>>
>> 2.
>> I'd encourage GDAL to consider
>> adopting this method to avoid the worst of downsampling errors that come
>> with '-r average'.
>
> I somehow remember people mentioning this in the past too. (ah and "GDAL"
> would be an abstract concept, if not embodied by its contributors, like you
> potentially ;-) )
>
I'd like to help, but rather by explaining things. I'd not really like
to get involved in coding, as I have a lot of work on my hands already.
I do owe the gdal project a debt, though. I am creating extremely
detailed GPS maps of the Piedmont from government data using free
software only, and if it weren't for QGIS, gdal and GRASS, my work would
be very difficult indeed.
>>
>> 3.
>>
>> It's also possible to apply a low-pass filter no matter
>> if there is high-frequency content: if there is no high-frequency
>> content, a good low-pass filter has no effect.
>
> Would probably be faster from a computation point of view to just apply the
> filter than prior checking if it is needed. One point to have in mind is that
> for memory reasons, and also because users also want to process big images by
> chunks, GDAL operates by chunks and not on the whole image, so "infinite
> support" is at best limited to the source window being passed to the function.
Jukka's hint to use kernel-filtered sources in a VRT file does do the
trick and shows the way to go, but requires an intermediate file of the
same size as the initial input. The maths is already there. Doing the
step-1 low-pass this way is a reasonable choice, even if IIR filters may
perform better. If the process is done in chunks, you need to cut
overlapping chunks. The overlap has to be large enough to give support
for the calculation of the pixels at the edge of the actual tiles. When
using convolution, the size of the overlap has to be half the kernel
size, which is not too much. With IIR filters, you can choose how much
error you find acceptable and select the overlap size accordingly.
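Here is a hedged 1D sketch of the chunked approach for a convolution
kernel, assuming SciPy; the chunk size is arbitrary, and a real
implementation would of course work on 2D tiles:

    import numpy as np
    from scipy.ndimage import convolve1d

    def filter_in_chunks(x, kernel, chunk=256):
        x = np.asarray(x, dtype=float)
        half = len(kernel) // 2             # overlap: half the kernel size
        out = np.empty(len(x))
        for start in range(0, len(x), chunk):
            stop = min(start + chunk, len(x))
            lo = max(start - half, 0)       # read the chunk with margins
            hi = min(stop + half, len(x))
            piece = convolve1d(x[lo:hi], kernel, mode='mirror')
            out[start:stop] = piece[start - lo : start - lo + (stop - start)]
        return out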
>
>> What has to be understood here is that the filter has to match the
>> desired scale change. The kernel has to be produced by a process taking
>> the scale change into account, which can be done for a gaussian by
>> choosing an appropriate standard deviation.
>
> For a down-to-earth approach, any recommended formulas to apply? Does it
> depend only on the scaling factor?
Theoretically, it does depend only on the scaling factor. But as I have
outlined above, digital filters are not perfect. They have transition
bands, pass-band and stop-band ripple - so, in reality, it depends. I
know this is a frustrating answer, but this is a fact of life. In the
real world, you have to compromise and find a solution which is 'good
enough' for what you have in mind.
When using gaussians for low-pass filtering, there is a rule of thumb,
but just now I can't seem to find it, sorry. I'll try to dig it up; it
gives a standard deviation which will remove a 'significant portion' of
the frequencies above a given limit. Given that standard deviation, you
can truncate the gaussian and then normalize it. But there are many
different low-pass filters, each with its own special characteristics,
and gaussians aren't necessarily the ones best suited for all tasks.
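The 'truncate and normalize' step itself is straightforward; a minimal
sketch (the sigma value is only a placeholder, since the rule of thumb
for choosing it is exactly what's missing above):

    import numpy as np

    def gaussian_kernel(sigma, truncate=3.0):
        # cut the gaussian off at 'truncate' standard deviations
        radius = int(truncate * sigma + 0.5)
        x = np.arange(-radius, radius + 1)
        k = np.exp(-0.5 * (x / sigma) ** 2)
        return k / k.sum()            # normalize: weights sum to 1

    print(gaussian_kernel(1.5))       # sigma = 1.5 is just a placeholder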
Don't you have a DSP specialist in your community? It would be nice to
have someone else join the discussion. My DSP knowledge comes mainly
from image processing, so I may be missing fine points specific to geodata.
>
>>
>> Still with me? Thanks for your patience! I hope I have understood all
>> the source code and response posts correctly, don't hesitate to correct me.
>
> While we are at assessing correctness of the implementation, one point that
> has sometimes been raised is the potential impact of the "pixel-is-area"
> versus "pixel-is-point" convention. Typically an imaging sensor will reflect
> the number of photons collected over an area in a range of wave-length, so the
> value of the pixel applies to the whole area it covers. Other datasets might
> be the result of a variable evaluated at the center of the pixel. GDAL
> currently computes the distances to the pixel centre when computing the
> weights of the resampling function, so it would seem to me that it is the
> natural way for "pixel-is-point" datasets, and I'm wondering how good it is
> for "pixel-is-area" ones.
We could fill a whole new thread discussing this ;)
First, the sensor. When it comes to images, you are right: the sensor
has an area. But it's a smallish area, definitely smaller than the area
a pixel 'occupies' later on (because the sensor chip has other stuff on
its surface), the incoming signal is already low-pass filtered by the
lambda filter, and the light then goes through a color filter to
produce a bayer pattern which only yields 'pixel' values after a very
complex process of demosaicking. All of this is so complex that it's
more of an art than a science. But if you accept the set of pixels as a
starting point, the pixel-as-area metaphor is just as usable as the
pixel-as-point metaphor. If you use pixel-as-area and you subsample,
picking precisely the area which a target pixel would occupy in the
source image is a reasonable choice and much better than simply
averaging over all pixels-as-points which are nearer than a given
maximum distance. Now what you do with the data in this area is a
different matter. Averaging over the area is one possibility, but you
might do a weighted sum to get better results. Again, it's a matter of
what you want to achieve and what errors you can accept. There is no
silver bullet. It's a bit like the duality of light. Is it a wave or a
particle?
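As a minimal sketch of the plain 'average over the area' variant, for
the simple case of an integer factor and aligned grids (everything here,
including the factor, is illustrative):

    import numpy as np

    def area_average(img, factor):
        # treat each source pixel as an area that contributes fully to
        # exactly one target pixel; average each factor x factor block
        h = img.shape[0] - img.shape[0] % factor
        w = img.shape[1] - img.shape[1] % factor
        blocks = img[:h, :w].reshape(h // factor, factor,
                                     w // factor, factor)
        return blocks.mean(axis=(1, 3))

    small = area_average(np.random.rand(512, 512), 4)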
And when it comes to DEM data derived from LIDAR scans, which is what I
am currently working on, the pixel-as-area approach is perfectly valid:
the LIDAR scan produces a point cloud, which is submitted to binning, so
that all measurements that happen to fall in a given small rectangular
area are put together and averaged. Interpreting such a bin as something
which has an area over which the signal is uniform is, again, an
understandable choice, but it's not necessary. You might, for example,
use weights on the point measurements when binning them. Or use a median
filter, or a specific quartile, and so on, to get data for a specific
purpose. It's important that you understand what you're doing and pick
your tools accordingly. And this is much easier said than done.
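For the binning itself, a hedged sketch with SciPy; the grid size and
the random 'point cloud' are placeholders, and swapping 'mean' for
'median' or a quantile is exactly the kind of choice discussed above:

    import numpy as np
    from scipy.stats import binned_statistic_2d

    rng = np.random.default_rng(0)
    x = rng.uniform(0, 100, 10000)         # point-cloud coordinates
    y = rng.uniform(0, 100, 10000)
    z = rng.normal(500, 10, 10000)         # heights

    # bin into a 100 x 100 grid and aggregate per bin
    dem_mean = binned_statistic_2d(x, y, z, statistic='mean',
                                   bins=100).statistic
    dem_median = binned_statistic_2d(x, y, z, statistic='median',
                                     bins=100).statistic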
On top of that, when using other people's data, you often don't know
precisely how they have already 'messed' with the data. Pixel-as-area is
one interpretation which is often quite close to a good choice. But it's
important to understand that 'area' interpolation is, like every other
method, a digital filter and produces certain errors. If those errors
are acceptable for what you want to achieve, it's okay to use it.
So what's the upshot? If you are writing software for a large audience
of people who mostly don't understand the fine points, you want to
shield them from the worst mistakes by making those mistakes hard to
make. An automatic low-pass filter before downsampling is just such a
shield. Using area interpolation instead of the average over
contributing pixels would be another one. So having code in place to
perform these tasks is good, making them the default is sensible, and
allowing users to overrule the defaults is courteous to advanced users,
whom you don't want to put into a straitjacket of what you think is
'right'. And when b-splines are used without prefiltering the data, this
should be said in the documentation, because then, strictly speaking,
it's no longer an interpolation but produces a mild blur.
Kay