As usual, we first add some magic to display images inline in the Jupyter notebook:
%matplotlib inline
We're using Pandas to handle our data:
import pandas
Please visit the Kaggle site and download the dataset: https://www.kaggle.com/c/challenges-in-representation-learning-facial-expression-recognition-challenge
Load the dataset into memory:
data = pandas.read_csv("fer2013/fer2013.csv")
The dataset consists of grayscale face photos encoded as strings of pixel intensities; at 48 × 48, that gives 2304 pixels per image. Every image is labeled according to the emotion on the face.
data.head()

   emotion                                             pixels     Usage
0        0  70 80 82 72 58 58 60 63 54 58 60 48 89 115 121...  Training
1        0  151 150 147 155 148 133 111 140 170 174 182 15...  Training
2        2  231 212 156 164 174 138 161 173 182 200 106 38...  Training
3        4  24 32 36 30 32 23 19 20 30 41 21 22 32 34 21 1...  Training
4        6  4 0 0 0 0 0 0 0 0 0 0 0 3 15 23 28 48 50 58 84...  Training

How many faces of each class do we have?

data.emotion.value_counts()

3    8989
6    6198
4    6077
2    5121
0    4953
5    4002
1     547
Name: emotion, dtype: int64
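As a quick sanity check, we can confirm that every row really encodes 48 × 48 = 2304 pixel values. This is a minimal sketch, not part of the original pipeline:

# Expect a single value, 2304, if all images are 48 x 48.
data.pixels.apply(lambda p: len(p.split())).unique()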
Here 0=Angry, 1=Disgust, 2=Fear, 3=Happy, 4=Sad, 5=Surprise, and 6=Neutral.
Let's remove Disgust, as we have too few samples for it:
data = data[data.emotion != 1]
data.loc[data.emotion > 1, "emotion"] -= 1
data.emotion.value_counts()

2    8989
5    6198
3    6077
1    5121
0    4953
4    4002
Name: emotion, dtype: int64

emotion_labels = ["Angry", "Fear", "Happy", "Sad", "Surprise", "Neutral"]
num_classes = 6
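Since inline plotting is already enabled, we can also visualize the class balance. This is an optional sketch using matplotlib's bar chart; the plot is not part of the original walkthrough:

import matplotlib.pyplot as plt

counts = data.emotion.value_counts().sort_index()
plt.bar(emotion_labels, counts.values)
plt.ylabel("Number of samples")
plt.title("Class balance after removing Disgust")
plt.show()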
This is how the samples are distributed between training and test. We'll use the Training portion to train the model, and everything else will go to the test set:
data.Usage.value_counts()

Training       28273
PrivateTest     3534
PublicTest      3533
Name: Usage, dtype: int64
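One way to perform that split is to keep the Training rows for training and merge both test folds into a single test set. A minimal sketch; the variable names train and test are my own:

train = data[data.Usage == "Training"]
test = data[data.Usage != "Training"]
len(train), len(test)  # (28273, 7067)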
The size of the images and the number of channels (depth):
from math import sqrt

depth = 1
height = int(sqrt(len(data.pixels[0].split())))
width = height
height

48
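For later training, the pixel strings will need to be unpacked into a numeric tensor. Here is a hedged sketch assuming a channels-last layout; to_tensor is a hypothetical helper, not from the original code:

import numpy as np

def to_tensor(frame):
    # Parse each space-separated pixel string and stack the results
    # into an (N, height, width, depth) tensor scaled to [0, 1].
    arrays = [np.array(p.split(), dtype=np.uint8) for p in frame.pixels]
    pixels = np.stack(arrays).reshape(-1, height, width, depth)
    return pixels.astype("float32") / 255.0

X_train = to_tensor(data[data.Usage == "Training"])
X_train.shape  # (28273, 48, 48, 1)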
Let's see some faces:
import numpy as np
from PIL import Image
from IPython.display import display

for i in range(5):
    array = np.array(data.pixels[i].split(), dtype=np.uint8).reshape(48, 48)
    display(Image.fromarray(array))
    print(emotion_labels[data.emotion[i]])

(The five face images are displayed in the notebook output.)
Many faces have ambiguous expressions, so our neural network will have a hard time classifying them. For example, the first face looks surprised or sad rather than angry, and the second face doesn't look angry at all. Nevertheless, this is the dataset we have. For a real application, I would recommend collecting more samples at a higher resolution and having every photo annotated several times by different independent annotators. Then, remove all photos that were annotated ambiguously.
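To make that last recommendation concrete, here is a minimal sketch of such filtering. The photo IDs and votes below are hypothetical, and the two-thirds agreement threshold is an arbitrary choice:

from collections import Counter

# Hypothetical multi-annotator labels: photo id -> annotators' votes.
annotations = {
    "photo_001": [3, 3, 3],  # unanimous Happy: keep
    "photo_002": [0, 4, 5],  # no agreement: drop
    "photo_003": [4, 4, 0],  # 2 of 3 say Sad: keep
}

def resolve(labels, min_agreement=2 / 3):
    # Return the majority label, or None if agreement is too low.
    label, votes = Counter(labels).most_common(1)[0]
    return label if votes / len(labels) >= min_agreement else None

clean = {photo: resolve(votes) for photo, votes in annotations.items()
         if resolve(votes) is not None}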