Chapter 5, Convolutional Neural Networks, covered the theory behind CNNs, including convolution. Let's recap this concept from a mathematical and practical perspective before moving on to object recognition. In mathematics, convolution is an operation on two functions that produces a third function: the integral of the product of the two, after one of them has been flipped and shifted:
$$(f * g)(t) = \int_{-\infty}^{\infty} f(\tau)\, g(t - \tau)\, d\tau$$
Convolution is heavily used in 2D image processing and signal filtering.
To better understand what happens behind the scenes, here's a simple Python code example of 1D convolution with NumPy (http://www.numpy.org/):
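import numpy as np

# Input signal and the kernel to convolve it with
x = np.array([1, 2, 3, 4, 5])
y = np.array([1, -2, 2])
# The default mode is 'full': the output has len(x) + len(y) - 1 elements
result = np.convolve(x, y)
print(result)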
This produces the following result:
[ 1  0  1  2  3 -2 10]
Let's see how the convolution between the x and y arrays produces that result. The first thing the convolve function does is to horizontally flip the y array:
[1, -2, 2] becomes [2, -2, 1]
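Then, the flipped y array slides over the x array, one position at a time. At each position, the overlapping elements are multiplied pairwise and the products are summed; elements that fall outside the x array count as zero:
1 x 1 = 1
(1 x -2) + (2 x 1) = 0
(1 x 2) + (2 x -2) + (3 x 1) = 1
And so on, up to the last position, 5 x 2 = 10.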
That's how the result array [ 1 0 1 2 3 -2 10] is generated.
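The flip-and-slide procedure described above can be sketched by hand as follows. This is a minimal illustration rather than how NumPy implements convolve internally; the helper name convolve_1d_full is hypothetical and exists only for this example.

```python
import numpy as np

def convolve_1d_full(x, y):
    """Full 1D convolution, computed by sliding the flipped y over x.

    Hypothetical helper written for illustration only.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    flipped = y[::-1]  # [1, -2, 2] becomes [2, -2, 1]
    # Zero-pad x on both sides so the flipped array can overhang the ends
    pad = len(y) - 1
    padded = np.pad(x, pad, mode='constant')
    out = []
    for i in range(len(x) + len(y) - 1):
        window = padded[i:i + len(y)]
        # Multiply the overlapping elements pairwise and sum the products
        out.append(np.dot(window, flipped))
    return np.array(out)

x = np.array([1, 2, 3, 4, 5])
y = np.array([1, -2, 2])
manual = convolve_1d_full(x, y)
print(manual)                                   # [ 1  0  1  2  3 -2 10]
print(np.array_equal(manual, np.convolve(x, y)))  # True
```

The zero padding plays the role of the positions where the two arrays only partially overlap, which is why the output has len(x) + len(y) - 1 elements.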
2D convolution works in a similar way. Here's a simple Python code example with NumPy and SciPy:
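import numpy as np
from scipy import signal

# 3x3 input matrix (plain arrays are preferred over the legacy np.matrix class)
a = np.array([[1, 3, 1], [0, -1, 1], [2, 2, -1]])
print(a)
# 2x2 kernel
w = np.array([[1, 2], [0, -1]])
print(w)
# Full 2D convolution (the default mode), which gives a 4x4 result
f = signal.convolve2d(a, w)
print(f)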
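This time, the SciPy (https://www.scipy.org/) signal.convolve2d function is used to do the convolution. The result of the preceding code is as follows (the input a, the kernel w, and the full convolution f):
[[ 1  3  1]
 [ 0 -1  1]
 [ 2  2 -1]]
[[ 1  2]
 [ 0 -1]]
[[ 1  5  7  2]
 [ 0 -2 -4  1]
 [ 2  6  4 -3]
 [ 0 -2 -2  1]]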
When the flipped matrix lies entirely inside the input matrix, the results are called valid convolutions. To compute only those results, pass 'valid' as the mode argument:
f = signal.convolve2d(a, w, 'valid')
This gives the following output:
[[-2 -4]
 [ 6  4]]
Here's how those results are calculated. First, the w array is flipped both horizontally and vertically:
[[1, 2], [0, -1]] becomes [[-1, 0], [2, 1]]
Then, the same as for the 1D convolution, each window of the a matrix is multiplied, element by element, with the flipped w matrix, and the results are finally summed as follows:
(1 x -1) + (3 x 0) + (0 x 2) + (-1 x 1) = -2
(3 x -1) + (1 x 0) + (-1 x 2) + (1 x 1) = -4
(0 x -1) + (-1 x 0) + (2 x 2) + (2 x 1) = 6
(-1 x -1) + (1 x 0) + (2 x 2) + (-1 x 1) = 4
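The same window-by-window calculation can be written out in a few lines of NumPy. The following is a minimal sketch of a valid 2D convolution done by hand, not the way SciPy implements convolve2d; the helper name convolve2d_valid is hypothetical and used only for illustration.

```python
import numpy as np

def convolve2d_valid(a, w):
    """Valid 2D convolution, computed by sliding the flipped w over a.

    Hypothetical helper written for illustration only.
    """
    a = np.asarray(a)
    w = np.asarray(w)
    # Flip the kernel along both axes (rows and columns)
    flipped = w[::-1, ::-1]
    kh, kw = flipped.shape
    # Only positions where the kernel fits entirely inside a are kept
    out_h = a.shape[0] - kh + 1
    out_w = a.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.result_type(a, w))
    for i in range(out_h):
        for j in range(out_w):
            window = a[i:i + kh, j:j + kw]
            # Element-by-element product of window and flipped kernel, summed
            out[i, j] = np.sum(window * flipped)
    return out

a = np.array([[1, 3, 1], [0, -1, 1], [2, 2, -1]])
w = np.array([[1, 2], [0, -1]])
manual = convolve2d_valid(a, w)
print(manual)
# [[-2 -4]
#  [ 6  4]]
```

The first row of the output reproduces the two sums worked out in the text (-2 and -4). The explicit loops are slow for large inputs, so in practice a library routine such as signal.convolve2d is the better choice.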