LESSON

Introducing kernels

Transcript

Let’s look again at the averaging process that we introduced a segment or two ago. Here we have our original image of the Mona Lisa and here we have the version of the Mona Lisa where we have applied an averaging over a 21 by 21 pixel input window. When we perform the average over a square window, it can lead to an artefact in the image called ringing. It introduces very faint vertical and horizontal lines and we can see a few above her lips and above and below her eyes.

This problem arises because the pixels in the input window are different distances away from the centre of the window. In particular, this pixel here is a distance of 2 away from the centre of the window, whereas this pixel here is a distance of 2√2 away from the centre. That is a value of almost 3 pixels away from the centre. The result is that the average is unduly influenced by pixel values that are a long way from the centre. We say that this operation is not isotropic; that means that it has different results in different directions. It is not symmetrical.
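As a rough illustration of these distances, here is a minimal Python sketch. It assumes a 5 by 5 window (half width 2), which is the size implied by the distances 2 and 2√2; the variable names are made up for this example.

```python
import math

half_width = 2  # assumed 5 by 5 window: offsets run from -2 to +2

# Distance of every pixel in the window from the centre pixel.
distances = [[math.hypot(u, v) for u in range(-half_width, half_width + 1)]
             for v in range(-half_width, half_width + 1)]

edge_distance = distances[half_width][-1]  # middle of the right-hand edge
corner_distance = distances[0][0]          # top-left corner

print(edge_distance, corner_distance)
```

The corner pixel is about 41% further from the centre than the edge pixel, yet a plain average gives both the same influence.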

Ideally what we would like to do is to extract a circular region of pixels; we would like to take all of the pixels that fall underneath this circular disk. But the problem with this is that the disk doesn’t fully cover all of the pixels; some of the pixels around the edge — particularly the ones in the corner — are only partially covered by the disk. So that involves us taking a fraction of a pixel. Let’s look at a way we can do this.

And the way we do it is to apply a weighting. Let’s zoom in on this circular region that we have created and it’s a circle with a radius of 2.5 pixels. And we want to take all of this pixel because it is fully covered by the disk. But we only want to take a fraction of this pixel; we want it to have less significance or less weight in the average that we are going to compute.

So what we can do is to compute a set of weights. For pixels that are fully covered by the disk, the weight is equal to 1. But if the disk only partly covers the pixel then the weight is less than one, and we can see that there are some pixels that are 98% covered by the disk and in the corners we can see that some pixels are only 14% covered by the disk. This matrix then represents the influence that the different pixels within the square window should have on the resulting average. How do we use this weighting matrix?
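One way such a weighting matrix could be estimated is by sub-sampling each pixel and counting how many sample points fall inside the disk. The sketch below is only an illustration, not the method used to produce the figures in the lesson. It assumes a 5 by 5 window and a disk of radius 2.5 pixels, which is the size that reproduces the 98% and 14% values quoted above; the function name `disk_weights` and its parameters are hypothetical.

```python
def disk_weights(half_width=2, radius=2.5, samples=50):
    """Estimate the fraction of each pixel covered by a disk centred on the window."""
    size = 2 * half_width + 1
    weights = []
    for row in range(size):
        weight_row = []
        for col in range(size):
            inside = 0
            for i in range(samples):
                for j in range(samples):
                    # Sample point inside this unit pixel, measured from the window centre.
                    x = col - half_width - 0.5 + (j + 0.5) / samples
                    y = row - half_width - 0.5 + (i + 0.5) / samples
                    if x * x + y * y <= radius * radius:
                        inside += 1
            weight_row.append(inside / samples ** 2)
        weights.append(weight_row)
    return weights

K = disk_weights()
for weight_row in K:
    print(" ".join(f"{w:.2f}" for w in weight_row))
```

The centre pixels come out with a weight of 1, the middle pixel of each edge with roughly 0.98 and the corners with roughly 0.14. More sample points per pixel give a closer estimate at the cost of more computation.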

What we are going to do is to take all of the pixels from the input image and here is an example of a set of pixel values. And we are going to multiply all of the pixels within the window by the weighting matrix. This is an element-wise multiplication, sometimes called a Hadamard product. So we take corresponding elements from these two matrices, multiply them together, and put the result into the corresponding element in the output matrix. So over here we have the pixel values from the original image, but we have multiplied them by the weighting matrix so that the ones in the corners have got a lower value and are less able to influence the resulting average. Now we will compute the sum over this product, that is, the sum of the element-wise product of all of the pixels in the input window, which we denote by W, and all of the values in the weighting matrix. And I am going to now refer to that weighting matrix as a kernel.

Kernel is a term that is commonly used in image processing. We are going to compute the sum of the element-wise multiplication of these two small square images, and the result is what we place into the output image. So our function f(W) is equal to the sum of the element-wise multiplication of the input window and the kernel.
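The function f(W) can be sketched in a few lines of Python. This is a minimal illustration with made-up numbers; the 3 by 3 window, the kernel values and the name `weighted_sum` are all assumptions chosen to keep the arithmetic easy to follow.

```python
def weighted_sum(window, kernel):
    """f(W): sum of the element-wise product of the window W and the kernel K."""
    return sum(w * k
               for window_row, kernel_row in zip(window, kernel)
               for w, k in zip(window_row, kernel_row))

# Hypothetical pixel values in the input window.
W = [[10, 20, 30],
     [40, 50, 60],
     [70, 80, 90]]

# Hypothetical kernel that gives the corner pixels half the weight of the others.
K = [[0.5, 1.0, 0.5],
     [1.0, 1.0, 1.0],
     [0.5, 1.0, 0.5]]

result = weighted_sum(W, K)
print(result)  # 350.0
```

Each corner pixel contributes only half of its value, so the result is 350 rather than the plain sum of 450.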

Consider now that I have an input image where all of the pixels are uniform and equal to 1. If I apply this kernel to that image, the result will be that every pixel in the output image has a value of 19.6, which is the sum of all of the elements of the kernel. The input image values are all 1; the output image values are all 19.6. We can get around this by applying a scale factor, and the scale factor is determined from the sum of all of the elements within the kernel. If we divide the kernel by the sum of all of its elements we have a normalised kernel, and if we apply this to an image where all of the input pixels have got a value of 1, the output image will have all of its pixels equal to 1. So typically we try to compute a kernel that has got an overall scale factor of 1, that is, one whose elements sum to 1.
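The effect of normalisation can be checked with a small Python sketch. The 3 by 3 kernel here is a made-up stand-in for the disk kernel (its elements sum to 7 rather than 19.6), and the function names are hypothetical.

```python
def weighted_sum(window, kernel):
    """Sum of the element-wise product of a window and a kernel."""
    return sum(w * k
               for window_row, kernel_row in zip(window, kernel)
               for w, k in zip(window_row, kernel_row))

def normalise(kernel):
    """Divide every element by the sum of all elements, so the kernel sums to 1."""
    total = sum(sum(row) for row in kernel)
    return [[k / total for k in row] for row in kernel]

# Hypothetical un-normalised kernel; its elements sum to 7.
K = [[0.5, 1.0, 0.5],
     [1.0, 1.0, 1.0],
     [0.5, 1.0, 0.5]]

uniform_window = [[1.0] * 3 for _ in range(3)]  # every input pixel equals 1

raw_response = weighted_sum(uniform_window, K)
normalised_response = weighted_sum(uniform_window, normalise(K))
print(raw_response, normalised_response)
```

Before normalisation the uniform image is scaled up by the kernel sum; after normalisation its brightness is preserved, up to floating-point rounding.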

So the simple averaging that we looked at right at the beginning of this lecture can also be thought of as a kernel. In fact, it is a very simple kernel where all of the elements in the window are equal to 1. So here is the example for a 7 by 7 window: a 7 by 7 matrix of ones, which we divide by 49 so that the overall scale factor is equal to 1. So this is a fairly uniform way of dealing with spatial operations; they can all be represented by a kernel matrix. In the MATLAB toolbox we use a function called iconv. We pass in the image, in this case the Mona Lisa image. The second argument is the kernel matrix; in this case it is a 7 by 7 matrix full of ones, divided by 49. The function ones is a built-in MATLAB function.
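To make the sliding-window idea concrete, here is a minimal Python sketch of applying a 7 by 7 averaging kernel to a tiny test image. It is not the toolbox's iconv function: the name `apply_kernel` is hypothetical, the 9 by 9 image is made up, and as a simplifying assumption the output is only computed where the window fits entirely inside the image.

```python
def apply_kernel(image, kernel):
    """Weighted sum over every window of the image that the kernel fits inside."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    output = []
    for r in range(len(image) - k_rows + 1):
        output_row = []
        for c in range(len(image[0]) - k_cols + 1):
            total = 0.0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            output_row.append(total)
        output.append(output_row)
    return output

n = 7
box_kernel = [[1.0 / (n * n)] * n for _ in range(n)]  # matrix of ones divided by 49

# Hypothetical 9 by 9 test image: black except for one bright pixel of value 49.
image = [[0.0] * 9 for _ in range(9)]
image[4][4] = 49.0

smoothed = apply_kernel(image, box_kernel)
for smoothed_row in smoothed:
    print([round(v, 3) for v in smoothed_row])
```

Every 7 by 7 window in this small image contains the bright pixel, so each of the 3 by 3 output values is its brightness spread evenly over 49 pixels, which is 1.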

Another kernel that is commonly used in image processing is the Gaussian kernel, named after the very famous German mathematician Carl Friedrich Gauss. You have perhaps encountered the one dimensional version of this function before; it is the normal distribution that we see in statistics. This function has got a simple and elegant analytic form and if we plot it in three dimensions it has the appearance of a gentle hill. If, instead, we represent this function as an image where brightness is proportional to the height of the function shown over on the left, we have an image that looks something like this. It is bright in the middle and we can see that the intensity falls off uniformly in all directions. This is an isotropic function and the weight decreases as we move away from the centre. In the MATLAB toolbox, we can compute this kernel using the function kgauss. The first argument is the standard deviation, the symbol σ on the left, and the second argument is the half width of the kernel. In this case, the half width is 15, so the full width is 2 × 15 + 1; that is, a kernel that is 31 by 31 pixels.
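A Gaussian kernel of this kind can be sketched in Python by sampling the function on a square grid. This assumes the analytic form referred to is the standard two-dimensional Gaussian, G(u, v) = exp(-(u² + v²) / (2σ²)) / (2πσ²). The function `gaussian_kernel` is a hypothetical stand-in written for this example, and it may differ in detail from what kgauss does.

```python
import math

def gaussian_kernel(sigma, half_width):
    """Sample G(u, v) = exp(-(u^2 + v^2) / (2 sigma^2)) / (2 pi sigma^2) on a grid."""
    coords = range(-half_width, half_width + 1)
    scale = 1.0 / (2.0 * math.pi * sigma ** 2)
    return [[scale * math.exp(-(u * u + v * v) / (2.0 * sigma ** 2)) for u in coords]
            for v in coords]

K = gaussian_kernel(sigma=5, half_width=15)

size = len(K)                        # 2 * 15 + 1 = 31
centre = K[15][15]                   # the peak, at the centre of the window
total = sum(sum(row) for row in K)   # close to, but slightly below, 1

print(size, centre, total)
```

The weight depends only on the distance u² + v² from the centre, which is what makes the kernel isotropic. The sampled values sum to slightly less than 1 because the tails beyond the window are cut off, so in practice one might divide by the sum to normalise the kernel exactly.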

An important parameter of the kernel is the standard deviation, the symbol σ, which appears twice in the analytic expression. It controls the width of the kernel, so for a σ of 2 we see that there is a very tall, pointy peak; for a σ of 5 it is much broader; for a σ of 10 it is broader still. The area under the Gaussian function is always one, so as it gets wider it becomes less tall. An important consideration when we use the Gaussian for image processing is what size the kernel should be. The kernel needs to be big enough to hold the bulk of the Gaussian, and that clearly is going to depend on the standard deviation. So a pretty good rule of thumb is to make the half width of the kernel equal to three times the standard deviation.
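The rule of thumb can be checked numerically. The sketch below estimates how much of a continuous two-dimensional Gaussian falls inside a square window of a given half width, using the error function from Python's standard library. Treating the Gaussian as continuous rather than sampled at pixels is a simplifying assumption, and the function name is hypothetical.

```python
import math

def captured_fraction(sigma, half_width):
    """Fraction of a 2-D Gaussian's volume inside a square window of this half width."""
    # Along one axis, the fraction within +/- half_width is erf(h / (sigma * sqrt(2))).
    one_axis = math.erf(half_width / (sigma * math.sqrt(2)))
    # The 2-D Gaussian is separable, so the two axes multiply.
    return one_axis ** 2

sigma = 5
for multiple in (1, 2, 3):
    fraction = captured_fraction(sigma, multiple * sigma)
    print(f"half width = {multiple} sigma: {fraction:.3f}")
```

With a half width of one standard deviation less than half of the Gaussian is captured, whereas at three standard deviations more than 99% of it lies inside the window, which is why the rule of thumb works well.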

Let’s create one of these Gaussian kernels; I am going to put it into a variable called K. It is the kgauss function and I want it to have a standard deviation of 5 and I want it to be in a window with a half width of 15. So that will actually be a 31 by 31 kernel. There we go. We have created a 31 by 31 matrix in the workspace which contains the Gaussian kernel. Now I can display this as an image.

And there we see it is essentially a disk, it is bright in the middle, it has got a high weighting to values in the centre of the window and lower weighting for pixels that are further away from the centre of the window.

Now I can create a new figure. And I can look at it in a different way; I can look at it in a three dimensional way by creating a MATLAB surface plot. And there we see the Gaussian window, with its very distinctive shape: tall in the middle and with decreasing values as we go away from the centre.

Let’s apply this kernel to a standard test image, so we will load the Mona Lisa image. I will create a new image called S for smooth and I will correlate it with the kernel. There we have created the output image. I can display that and we can see that the image has been smoothed; we might also say it looks a bit blurry. Looks like it has lost some of its resolution. What we’ve done is smooth out the very fine scale detail and left just the coarse detail in the image.

And you can see that for σ equals 2, we have a relatively small kernel size; for σ equals 5, we have to use a much larger kernel.

Taking an average of pixels in a box leads to artefacts such as ringing which we can remedy by taking a weighted average of all the pixels in the box surrounding the input pixel. The set of weights is referred to as a kernel. A common kernel used for image smoothing is the Gaussian kernel.

Professor Peter Corke

Professor of Robotic Vision at QUT and Director of the Australian Centre for Robotic Vision (ACRV). Peter is also a Fellow of the IEEE, a senior Fellow of the Higher Education Academy, and on the editorial board of several robotics research journals.

Skill level

This content assumes an understanding of high school level mathematics; for example, trigonometry, algebra, calculus, physics (optics) and experience with MATLAB command line and programming, for example workspace, variables, arrays, types, functions and classes.
