Convolution

Pascal Wallisch

The purpose of this chapter is to familiarize you with the convolution operation. You will use this operation in the context of receptive fields in the early visual system as input response filters whose convolution with an input image approximates certain aspects of your perception. Specifically, you will reproduce the Mach band illusion and explore the Gabor filter as a model for the receptive field of a simple cell in the primary visual cortex.

Keywords

convolution; difference of Gaussians function; classical receptive field; Mach band illusion; excitatory; inhibitory

16.2 Background

A convolution is the mathematical operation used to find the output y(t) of a linear time-invariant system from some input x(t) using the impulse response function of the system h(t), where h(t) is defined as the output of a system to a unit impulse input. It is defined as the following integral:

$$y(t) = (h * x)(t) = \int_{-\infty}^{\infty} h(\tau)\,x(t-\tau)\,d\tau \tag{16.1}$$

This can be graphically interpreted as follows. The function h(τ) is plotted on the τ-axis, as is the flipped and shifted function x(t−τ), where the shift t is fixed. These two signals are multiplied, and the signed area under the curve of the resulting function is found to obtain y(t). This operation is then repeated for every value of t in the domain of y. It turns out that it doesn’t matter which function is flipped and shifted since h * x=x * h.
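
The flip-and-shift description can be checked numerically on a short discrete signal. The following is a minimal sketch, not part of the chapter's scripts: the vectors h and x hold made-up values, and the explicit double loop is written for clarity rather than speed. It compares the hand-computed sum with the built-in conv function and confirms that swapping the arguments does not change the result.

```matlab
% Sketch: discrete 1D convolution as a flip-and-shift sum.
% h and x are arbitrary illustrative vectors, not data from this chapter.
h = [1 2 3];        % impulse response
x = [1 0 -1 2];     % input signal

ny = numel(h) + numel(x) - 1;   % length of the full convolution
y_manual = zeros(1, ny);
for t = 1:ny
    for tau = 1:numel(h)
        k = t - tau + 1;                    % index into the flipped, shifted x
        if k >= 1 && k <= numel(x)          % skip terms that fall outside x
            y_manual(t) = y_manual(t) + h(tau)*x(k);
        end
    end
end

y_builtin = conv(h, x);
disp(y_manual)                              % 1 2 2 0 1 6
disp(isequal(y_manual, y_builtin))          % 1 (true)
disp(isequal(conv(h, x), conv(x, h)))       % 1 (true): h * x = x * h
```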

You can also define a convolution for data in two dimensions:

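$$y(k,t) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} h(\kappa,\tau)\,x(k-\kappa,\,t-\tau)\,d\kappa\,d\tau \tag{16.2}$$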

Basically, you take a convolution in one dimension to establish the k dependence of the result y and then use that output (which is a function of k, t and τ) to perform another convolution in the second dimension. This second convolution provides the t dependence of the result, y. It is important that you understand how to apply this to a two-dimensional data function because in this chapter you will be working with two-dimensional images. In the MATLAB® software, since you are working with discrete datasets, the integral becomes a summation, so the definition for convolution in 2D at every point becomes

$$y(k,t) = \sum_{i}\sum_{j} h(i,j)\,x(k-i,\,t-j) \tag{16.3}$$

Again, this is easier to understand pictorially. What you are doing in this algorithm is taking the dataset x, which is a matrix; rotating it by 180 degrees; overlaying it at each point in the matrix h that describes the response filter; multiplying each point with the underlying point; and summing these points to produce a new point at that position. You do this for every position to get a new matrix that will represent the convolution of h and x.
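
The rotate, overlay, multiply, and sum procedure can be tried on a single output point. This is a minimal sketch with arbitrary matrices; the names hr and y_manual are introduced here only for illustration. Because convolution is commutative, it does not matter which of the two matrices is rotated. As an assumption of this sketch, the smaller filter h is the one rotated and laid over a patch of x.

```matlab
% Sketch: one output point of a 2D convolution, computed "by hand".
% x and h are arbitrary illustrative matrices, not data from this chapter.
x = magic(4);                 % 4x4 input matrix
h = [1 2 3; 4 5 6; 7 8 9];    % 3x3 response filter (deliberately asymmetric)

hr = rot90(h, 2);             % rotate the filter by 180 degrees

% Overlay the rotated filter on the top-left 3x3 patch of x,
% multiply point by point, and sum the products
y_manual = sum(sum(hr .* x(1:3, 1:3)));

% conv2 with 'valid' keeps only positions where the filter fits entirely
% inside x, so its first element corresponds to that same patch
y = conv2(x, h, 'valid');
disp(y_manual == y(1,1))      % 1 (true)
```

Repeating the overlay for every patch position fills in the rest of the output matrix.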

16.2.1 The Visual System and Receptive Fields

In this section we discuss in general the anatomy of the visual system and the input response functions that explain how different areas of the brain involved in this system might “perceive” a visual stimulus.

Light information from the outside world is carried by photons that enter the eyes and trigger a biochemical cascade in the rods and cones of the retina. This cascade causes channels to close, which leads to a decrease in the release of neurotransmitter onto bipolar cells. In general, there are two fundamental varieties of bipolar cells: on-bipolar cells become depolarized in response to light, and off-bipolar cells become hyperpolarized in response to light. The bipolar cells then project to the ganglion cells, which are the output cells of the retina. The response to light in this main pathway is also influenced by both the horizontal and amacrine cells in the retina. There are many types of retinal ganglion cells that respond to different visual stimuli.

A stimulus in the visual field will elicit a cell’s response (above the background firing rate) only if it lies within a localized region of visual space, denoted by the cell’s classical receptive field. In general, the ganglion cells have a center-surround receptive field due to the types of cells that interact to send information to these neurons. That is, the receptive field is essentially two concentric circles, with the center having an excitatory increase (+) in neuronal activity in response to light stimulus and the surround having an inhibitory decrease (−) in neuronal activity in response to light stimulus, or vice versa. The response function of the ganglion cells can then be modeled using a Mexican hat function, also sometimes called a difference of Gaussians function.

In the main visual pathway, the ganglion cells send their axons to the lateral geniculate nucleus (LGN) in the thalamus, which is in charge of regulating information flow to the cortex. These cells also are thought to have receptive fields with a center-surround architecture. LGN cells project to the primary visual cortex (V1). In V1, simple cells are thought to receive information from LGN neurons in such a way that they respond to bars of light at certain orientations and spatial frequencies. This can similarly be described as a Gabor function—a two-dimensional Gaussian filter whose amplitude is modulated by a sinusoidal function along an axis at a given orientation. Thus, different simple cells in V1 respond to bars of light at specific orientations with specific widths (this represents spatial frequency; see Dayan and Abbott, 2001). These and other cells from V1 project to many other areas in the cortex thought to represent motion, depth, face recognition, and other fascinating visual features and perceptions.
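
To make the verbal description of the Gabor function concrete, here is a minimal sketch of one common parameterization: an isotropic Gaussian envelope multiplied by a cosine along an axis at angle theta. All parameter names and values (N, sigma, lambda, theta, phi) are hypothetical choices for illustration and are not taken from this chapter or from Dayan and Abbott.

```matlab
% Sketch of a Gabor function: Gaussian envelope times a sinusoidal carrier.
% All parameter names and values are hypothetical choices for illustration.
N      = 21;       % filter is N x N pixels (odd, so there is a center pixel)
sigma  = 4;        % standard deviation of the Gaussian envelope (pixels)
lambda = 8;        % wavelength of the sinusoid (pixels per cycle)
theta  = pi/4;     % orientation of the modulation axis (radians)
phi    = 0;        % phase of the sinusoid (radians)

[X,Y] = meshgrid((1:N) - ceil(N/2));      % coordinates centered on zero
Xr = X*cos(theta) + Y*sin(theta);         % position along the modulation axis

% Gaussian envelope multiplied by the oriented sinusoid
G = exp(-(X.^2 + Y.^2)/(2*sigma^2)) .* cos(2*pi*Xr/lambda + phi);

figure; imagesc(G); colormap(bone); colorbar; axis square
title('Gabor filter sketch')
```

Changing theta rotates the stripes, and changing lambda alters their width, which corresponds to the orientation and spatial frequency preferences described above.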

16.2.2 The Mach Band Illusion

Using your knowledge of the receptive fields or the response functions of the visual areas can help you understand why certain optical illusions work. The Mach band illusion is a perceptual illusion seen when viewing an image that ramps from black to white. Dark and light bands appear on the image where the brightness ramp meets the black and white plateaus, respectively. These bands are named after Ernst Mach, an Austrian physicist who first studied them in the 1860s. They can be explained with the center-surround receptive fields of the ganglion or LGN cells (Ratliff, 1965; Sekuler and Blake, 2002); we will use this model in this chapter, although alternative explanations exist (for example, see Lotto, Williams, and Purves, 1999).

The illusion is demonstrated in Figure 16.1. At the initiation of the stimulus brightness ramp, a dark band, darker than the dark plateau to the left, is usually perceived. At the termination of the brightness ramp, a light band is perceived brighter than the light plateau to its right. Figure 16.1 shows the center-surround receptive fields of sample neurons, represented by concentric circles, superimposed on the stimulus image. The center disk is excitatory, and the surrounding annulus is inhibitory, as indicated by the plus and minus signs. When the receptive field of a neuron is positioned completely within the areas of uniform brightness, the center receives nearly the same stimulation as the surround; thus, the excitation and inhibition are in balance. A receptive field aligned with the dark Mach band has more of its surround in a brighter area than the center, and the increased inhibition to the neuron results in the perception of that area as darker. Conversely, the excitation to a neuron whose receptive field is aligned with the bright Mach band is increased, since more of its center is in a brighter area than the surround. The decreased inhibition to such a neuron results in a stronger response than that of the neuron whose receptive field lies in the uniformly bright regime and thus the perception of the area as brighter.

Figure 16.1 The Mach band illusion. Top of figure: the visual stimulus with various center-surround receptive fields superimposed. Bottom of figure: the actual brightness of the visual stimulus (black solid line) and the perceived brightness of the optical illusion (blue dotted line).
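
The balance argument can be illustrated in one dimension before building the full model. The following is a sketch under simplifying assumptions: the three-point kernel rf is a hypothetical stand-in for a center-surround receptive field (it is not the filter used later in the chapter), and the stimulus is a single row of brightness values.

```matlab
% 1D sketch of center-surround filtering of a brightness ramp.
% The three-point kernel is a hypothetical stand-in for a receptive field:
% an excitatory center (+2) flanked by an inhibitory surround (-0.5 each).
stim = [10*ones(1,5), 11:75, 75*ones(1,5)];   % dark plateau, ramp, bright plateau
rf   = [-0.5 2 -0.5];                         % weights sum to 1

resp = conv(stim, rf, 'valid');   % keep only positions with a full neighborhood

fprintf('plateau responses: %g and %g\n', resp(1), resp(end));
fprintf('minimum response:  %g (at the foot of the ramp)\n', min(resp));
fprintf('maximum response:  %g (at the top of the ramp)\n', max(resp));
```

On the plateaus and along the ramp itself, the response equals the input because excitation and inhibition balance. At the foot of the ramp the response dips to 9.5, below the dark plateau value of 10, and at the top it rises to 75.5, above the bright plateau value of 75.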

16.3 Exercises

The goal for this chapter is to reproduce the Mach band optical illusion. First, you will create the visual stimulus. Then you will create a center-surround Mexican hat receptive field. Finally, you will convolve the stimulus with the receptive field filter to produce an approximation of the perceived brightness.

You begin by creating the M-file named ramp.m that will generate the visual input (see Figure 16.2). The input will be a 64×128 matrix whose values represent the intensity or brightness of the image. You want the brightness to begin dark, at a value of 10, for the first 32 columns. In the next 65 columns, the value will increase at a rate of one per column, and the brightness will stay at the constant value of 75 for the rest of the matrix. Open a new blank file and save it under the name ramp.m. In that file enter the following commands:

%ramp.m

% This script generates the image that creates the Mach band visual illusion.

In=10*ones(64,128); %initiates the visual stimulus with a constant value of 10

for ii=1:65

In(:,32+ii)=10+ii;

%ramps up the value for the middle matrix elements (column 33 to column 97)

end

In(:,98:end)=75; %sets the last columns of the matrix to the final brightness value of 75

figure

imagesc(In); colormap(bone); set(gca,'fontsize',20) %view the visual stimulus

Figure 16.2 The brightness ramp stimulus used as visual input.

Notice how the function imagesc creates an image whose pixel colors correspond to the values of the input matrix In. You can play with the color representation of the input data by changing the colormap. Here, you use the colormap bone, since it is the most appropriate one for creating the optical illusion, but there are many more interesting options available that you can explore by reading the help file for the function colormap.

You’ve just created an M-file titled ramp that will generate the visual stimulus. Note, however, that you use a for loop in ramping up the brightness values. Although it doesn’t make much of a difference in this script, it is good practice to avoid using for loops when programming in MATLAB if possible, and to take advantage of its efficient matrix manipulation capabilities for faster run times (see Chapter 4.4.5.1, “Vectorizing Matrix Operations”). How might you eliminate the for loop in this case? One solution is to use the function cumsum. Let’s see what it can do:

>> z=ones(3,4)

z =

1 1 1 1
1 1 1 1
1 1 1 1

>> cumsum(z)

ans =

1 1 1 1
2 2 2 2
3 3 3 3

By default, cumsum accumulates along the first dimension, so each element is the running sum down its column. To accumulate along the second dimension instead, so that the running sum proceeds from left to right across each row, pass 2 as the second argument:

>> cumsum(z,2)

ans =

1 2 3 4
1 2 3 4
1 2 3 4

You will want this cumulative sum by columns for this ramp function. Now rewrite the code in proper style for MATLAB without the for loop:

%ramp.m

% This script generates the image that creates the Mach band visual illusion.

In=10*ones(64,128); %initiates the visual stimulus with a constant value of 10

% now ramp up the value for the middle matrix elements using cumsum

In(:,33:97)=10+cumsum(ones(64,65),2);

In(:,98:end)=75; %sets the last columns of the matrix to the end value of 75

figure; imagesc(In); colormap(bone); set(gca,'fontsize',20) %view the visual stimulus
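
When replacing a loop with a vectorized expression, it is worth confirming that both versions produce the same matrix. The short check below is a sketch; the variable names In_loop and In_vec are introduced here only for the comparison and do not appear in ramp.m.

```matlab
% Build the ramp with a for loop
In_loop = 10*ones(64,128);
for ii = 1:65
    In_loop(:,32+ii) = 10+ii;
end
In_loop(:,98:end) = 75;

% Build the same ramp with cumsum along the second dimension
In_vec = 10*ones(64,128);
In_vec(:,33:97) = 10 + cumsum(ones(64,65),2);
In_vec(:,98:end) = 75;

disp(isequal(In_loop, In_vec))   % displays 1 if the two matrices are identical
```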

You can look at how the values of the brightness increase from left to right by taking a slice of the matrix and plotting it, as shown in Figure 16.3. Look at the 32nd row in particular.

>> plot(In(32,:),'k','LineWidth',3); axis([0 128 0 85]); set(gca,'fontsize',20)

Figure 16.3 The brightness values in a slice through the ramp stimulus shown in Figure 16.2.

Next, you will create a script titled mexican_hat.m that will generate a matrix whose values are a difference of Gaussians. For this exercise, you will make this a 5×5 filter, as shown in Figure 16.4.

% mexican_hat.m

% this script produces an N by N matrix whose values are

% a 2 dimensional Mexican hat or difference of Gaussians

N = 5; %matrix size is NXN

IE=6; %ratio of inhibition to excitation

Se=2; %standard deviation (width) of the excitation Gaussian

Si=6; %standard deviation (width) of the inhibition Gaussian

S = 500;%overall strength of Mexican hat connectivity

[X,Y]=meshgrid((1:N)-round(N/2));

% -floor(N/2) to floor(N/2) in the row or column positions (for N odd)

% -N/2+1 to N/2 in the row or column positions (for N even)

[THETA,R] = cart2pol(X,Y);

% Switch from Cartesian to polar coordinates

% R is an N*N grid of lattice distances from the center pixel

% i.e. R=sqrt((X).^2 + (Y).^2)+eps;

EGauss = 1/(2*pi*Se^2)*exp(-R.^2/(2*Se^2)); % create the excitatory Gaussian

IGauss = 1/(2*pi*Si^2)*exp(-R.^2/(2*Si^2)); % create the inhibitory Gaussian

MH = S*(EGauss-IE*IGauss); %create the Mexican hat filter

figure; imagesc(MH) %visualize the filter

title('mexican hat "filter"','fontsize',22)

colormap(bone); colorbar

axis square; set(gca,'fontsize',20)

Figure 16.4 A 5×5 Mexican hat spatial filter.

Now take a second look at some of the components of this script. The function meshgrid is used to generate the X and Y matrices, whose values contain the x and y Cartesian coordinates for the Gaussians:

>> X

X =

−2 −1 0 1 2
−2 −1 0 1 2
−2 −1 0 1 2
−2 −1 0 1 2
−2 −1 0 1 2

>> Y

Y =

−2 −2 −2 −2 −2
−1 −1 −1 −1 −1
  0   0   0   0   0
  1   1   1   1   1
  2   2   2   2   2

The function cart2pol converts the Cartesian coordinates X and Y into the polar coordinates R and THETA. You use this function to create the 5×5 matrix R whose values are the radial distance from the center pixel:

>> R

R =

2.8284 2.2361 2.0000 2.2361 2.8284

2.2361 1.4142 1.0000 1.4142 2.2361

2.0000 1.0000 0 1.0000 2.0000

2.2361 1.4142 1.0000 1.4142 2.2361

2.8284 2.2361 2.0000 2.2361 2.8284

The THETA variable is never used in this script, but it holds the polar angle in radians:

>> THETA

THETA =

−2.3562 −2.0344 −1.5708 −1.1071 −0.7854

−2.6779 −2.3562 −1.5708 −0.7854 −0.4636

3.1416 3.1416 0 0 0

2.6779 2.3562 1.5708 0.7854 0.4636

2.3562 2.0344 1.5708 1.1071 0.7854
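
Before using the filter, it can help to confirm that it really has an excitatory center and an inhibitory surround. The sketch below rebuilds the 5×5 filter with the same parameter values as mexican_hat.m so that it runs on its own. The index variable c is introduced here for illustration, and R is computed with sqrt rather than cart2pol, which yields the same distances.

```matlab
% Rebuild the 5x5 Mexican hat with the same parameter values as mexican_hat.m
N = 5; IE = 6; Se = 2; Si = 6; S = 500;
[X,Y] = meshgrid((1:N)-round(N/2));
R = sqrt(X.^2 + Y.^2);                        % radial distance from the center pixel
EGauss = 1/(2*pi*Se^2)*exp(-R.^2/(2*Se^2));   % excitatory Gaussian
IGauss = 1/(2*pi*Si^2)*exp(-R.^2/(2*Si^2));   % inhibitory Gaussian
MH = S*(EGauss - IE*IGauss);

c = round(N/2);      % index of the center pixel
disp(MH(c,:))        % profile through the center: positive middle, negative ends
disp(sum(MH(:)))     % net weight of the filter
```

With these parameter values the weights sum to a value close to 1, so regions of uniform brightness pass through the filter nearly unchanged, and the filter mainly alters the image where brightness changes.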

Finally, you’re ready to write the main script, mach_illusion.m, to visualize how the Mexican hat (center-surround) receptive field of neurons in the early visual system could affect your perception. In this simple model, the two-dimensional convolution of the input image matrix (generated by ramp.m) with the receptive field filter (generated by mexican_hat.m) approximates how the brightness of the image is perceived after filtering by the early visual system. The result should show a dip in perceived brightness where the input brightness just begins to increase and a peak where the input brightness stops increasing and settles at a steady value, consistent with the perception of Mach bands (see Figure 16.5). For a first pass, use the two-dimensional convolution function conv2 that is built into MATLAB. As described in its help entry, by default this function outputs a matrix whose size in each dimension equals the sum of the corresponding dimensions of the input matrices minus one. The edges of this output matrix are usually not considered valid, because the convolution sums at those points include terms that involve zeros padded around the input matrix. One way to deal with such edge effects is to trim the invalid pixels off the border, which reduces the size of the output image. You accomplish this by including the option 'valid' when calling the conv2 function:

%mach_illusion.m

clear all; close all

mexican_hat %creates the Mexican hat matrix, MH, & plots

ramp %creates image with ramp from dark to light, In, & plots

A=conv2(In,MH,'valid'); %convolve image and Mexican hat

figure; imagesc(A); colormap(bone) %visualize the "perceived" brightness

%create plot showing the profile of both the input and the perceived brightness

figure; plot(In(32,:),'k','LineWidth',5); axis([0 128 -10 95])

hold on; plot(A(32,:),'b-.','LineWidth',2); set(gca,'fontsize',20)

lh=legend('input brightness','perceived brightness','Location','NorthWest'); set(lh,'fontsize',20) %legend in the upper left corner

Figure 16.5 The Mach band illusion generated using the Mexican hat filter on the ramp input.

Make sure that the mexican_hat.m and ramp.m M-files are in the same directory as the mach_illusion.m M-file. Note that the size of the output is indeed smaller than the input:

>> size(A)

ans =

60 124
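
The size arithmetic can be verified for all three shape options of conv2. This sketch uses placeholder matrices of ones, with the hypothetical names In_demo and MH_demo, that match the sizes of In and MH, so it runs without the other scripts.

```matlab
% Output sizes of conv2 for the three shape options.
% Placeholder matrices with the same sizes as In (64x128) and MH (5x5).
In_demo = ones(64,128);
MH_demo = ones(5,5);

size(conv2(In_demo, MH_demo, 'full'))    % 68 132  (64+5-1 by 128+5-1)
size(conv2(In_demo, MH_demo, 'same'))    % 64 128  (central part, size of the first input)
size(conv2(In_demo, MH_demo, 'valid'))   % 60 124  (64-5+1 by 128-5+1)
```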

For fun, you can learn more about how the convolution works by changing the 'valid' option in the conv2 function call to either 'full' or 'same' and seeing how the output matrix A changes. Instead of trimming, another way to minimize the edge effects of convolution is to pad the input matrix with values that mirror its edges before performing the two-dimensional convolution, and then to return only the valid part of the output, which will now be the size of the original input matrix. The function conv2_mirrored.m does just this. It has been written in a generic form to accept matrices of any size:

%conv2_mirrored.m

function sp = conv2_mirrored(s,c)

% 2D convolution with mirrored edges to reduce edge effects

% output of convolution is same size as leading input matrix

[N,M]=size(s);

[n,m]=size(c); %% both n & m should be odd

% enlarge matrix s in preparation for convolution with matrix c

%via mirroring edges to reduce edge effects.

padn = round(n/2) - 1;

padm = round(m/2) - 1;

sp=[zeros(padn,M+(2*padm)); zeros(N,padm) s zeros(N,padm); zeros(padn,M+(2*padm))];

sp(1:padn,:)=flipud(sp(padn+1:2*padn,:));

sp(padn+N+1:N+2*padn,:)=flipud(sp(N+1:N+padn,:));

sp(:,1:padm)=fliplr(sp(:,padm+1:2*padm));

sp(:,padm+M+1:M+2*padm)=fliplr(sp(:,M+1:M+padm));

% perform 2D convolution

sp = conv2(sp,c,'valid');
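
The mirroring step is easier to follow in one dimension. The sketch below applies the same idea to a short vector; the signal s, the averaging kernel c, and the variable pad are hypothetical example values, and the padding is built by concatenation rather than by the indexing used in conv2_mirrored.m.

```matlab
% 1D illustration of mirrored-edge padding (hypothetical example values)
s   = 1:6;                       % signal to be filtered
c   = ones(1,5)/5;               % 5-point averaging kernel (odd length)
pad = round(numel(c)/2) - 1;     % same formula as padn and padm; gives 2 here

% Reflect the first and last 'pad' samples outward
sp = [fliplr(s(1:pad)), s, fliplr(s(end-pad+1:end))];
disp(sp)                         % 2 1 1 2 3 4 5 6 6 5

out = conv(sp, c, 'valid');      % valid part of the padded convolution
disp(numel(out))                 % 6, the same length as s
```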

Exercise 16.1

Put the figures generated by the mach_illusion.m script into a document, explain each figure, and give a short summary of the Mach band illusion as you understand it.

Exercise 16.2

Rather than cumsum, you could have also used the function meshgrid to efficiently ramp up the brightness values from dark to light when creating the matrix In. Read the help file for meshgrid and rewrite the ramp.m script using meshgrid rather than cumsum.

Exercise 16.3

Create the function conv2_mirrored.m using the code provided previously and place it in the same directory as your other files. Learn how the mirroring of the edges of the input matrix is accomplished by reviewing the help files on the functions flipud and fliplr. What determines the size of the mirrored-edge padding necessary and why? Rewrite your main script mach_illusion.m to use this convolution function rather than the conv2 function. Check that your output matrix A is now the same size as the input matrix In.

Exercise 16.4

Change the slope of the ramp without changing the beginning or ending values of the input image. [Hint: The command linspace can be useful to find values of the ramp that will go from 10 to 75 in, say, 30 steps rather than 65: linspace(10,75,30).] How does increasing or decreasing the slope affect the strength of the illusion?

Exercise 16.5

Convert the M-file named mexican_hat.m into a function whose inputs are the size of the matrix, the ratio of inhibition to excitation (IE), the widths of the excitatory and inhibitory Gaussians (Se and Si), and the overall strength of the filter. Also make the appropriate changes to the main script that calls this function, mach_illusion.m.