Creating the image data

The first step is to create the image files. The code for this section is in the Chapter11/gen_cifar10_data.R script. We will load the CIFAR10 data and save the images as individual files in the data directory. Before saving anything, we need to create the directory structure. There are 10 classes in the CIFAR10 dataset: we will save 8 classes for building a model and hold back 2 classes for a later section (Transfer learning). The code creates the following directories under data:

cifar_10_images/
    data1/
        train/
        valid/
    data2/
        train/
        valid/

This is the structure that Keras expects image data to be stored in. If you use this structure, then the images can be used to train a model in Keras. In the first part of the code, we create these directories:

library(keras)
library(imager)
# this script loads the CIFAR10 data from Keras
# and saves the data as individual images

# create directories:
# we will save 8 classes in the data1 folder for model building
# and use 2 classes (data2 folder) for transfer learning
data_dir <- "../data/cifar_10_images/"
# recursive=TRUE also creates ../data if it does not exist yet
if (!dir.exists(data_dir))
  dir.create(data_dir, recursive=TRUE)
if (!dir.exists(paste(data_dir,"data1/",sep="")))
  dir.create(paste(data_dir,"data1/",sep=""))
if (!dir.exists(paste(data_dir,"data2/",sep="")))
  dir.create(paste(data_dir,"data2/",sep=""))
train_dir1 <- paste(data_dir,"data1/train/",sep="")
valid_dir1 <- paste(data_dir,"data1/valid/",sep="")
train_dir2 <- paste(data_dir,"data2/train/",sep="")
valid_dir2 <- paste(data_dir,"data2/valid/",sep="")

if (!dir.exists(train_dir1))
  dir.create(train_dir1)
if (!dir.exists(valid_dir1))
  dir.create(valid_dir1)
if (!dir.exists(train_dir2))
  dir.create(train_dir2)
if (!dir.exists(valid_dir2))
  dir.create(valid_dir2)
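
The repeated dir.exists/dir.create pairs above can be collapsed into a small helper. The following is only a sketch of that idea, not part of the chapter's script: the helper name ensure_dir is hypothetical, and a throwaway directory under tempdir() stands in for ../data/cifar_10_images so that the example can be run anywhere.

```r
# Hypothetical helper: create a directory (and any missing parents) if absent
ensure_dir <- function(path) {
  if (!dir.exists(path)) {
    dir.create(path, recursive = TRUE)
  }
  invisible(path)
}

# Assumption: a temporary root stands in for ../data/cifar_10_images
root <- file.path(tempdir(), "cifar_10_images")

# Build every data1/data2 x train/valid combination;
# file.path() recycles the shorter vector, like paste()
dirs <- file.path(root,
                  rep(c("data1", "data2"), each = 2),
                  c("train", "valid"))
for (d in dirs) ensure_dir(d)

print(all(dir.exists(dirs)))
```

Because recursive = TRUE creates missing parent directories, the helper does not depend on the order in which the paths are created.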

Under each of the train and valid directories, a separate directory is used for each category, named after its numeric class label. The following code creates these category directories: 8 classes go under the data1 folder and the remaining 2 classes go under the data2 folder:

# load CIFAR10 dataset
c(c(x_train,y_train),c(x_test,y_test)) %<-% dataset_cifar10()
# get the unique categories;
# note that unique() returns the labels in order of first appearance,
# not in sorted order!
categories <- unique(y_train)
# create directories for 8 classes in data1 folder
for (i in categories[1:8])
{
  label_dir <- paste(train_dir1,i,sep="")
  if (!dir.exists(label_dir))
    dir.create(label_dir)
  label_dir <- paste(valid_dir1,i,sep="")
  if (!dir.exists(label_dir))
    dir.create(label_dir)
}
# create directories for 2 classes in data2 folder
for (i in categories[9:10])
{
  label_dir <- paste(train_dir2,i,sep="")
  if (!dir.exists(label_dir))
    dir.create(label_dir)
  label_dir <- paste(valid_dir2,i,sep="")
  if (!dir.exists(label_dir))
    dir.create(label_dir)
}
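
The comment about unique not being ordered matters because it determines which classes land in data1 and which in data2. The sketch below illustrates the behavior on a toy label matrix (the values are illustrative, not read from the dataset) and maps the 0-based CIFAR10 labels to their class names. The variable names y_toy, cats_seen, cats_sorted and class_names are assumptions introduced for this example.

```r
# Toy label matrix shaped like y_train (n x 1); values are illustrative only
y_toy <- matrix(c(6, 9, 9, 4, 1, 1, 2, 7), ncol = 1)

# unique() keeps first-appearance order; as.vector() drops the matrix shape
cats_seen   <- unique(as.vector(y_toy))
cats_sorted <- sort(cats_seen)

# CIFAR10 class names for labels 0..9 (R vectors are 1-based, hence the + 1)
class_names <- c("airplane", "automobile", "bird", "cat", "deer",
                 "dog", "frog", "horse", "ship", "truck")

print(cats_seen)                     # 6 9 4 1 2 7
print(cats_sorted)                   # 1 2 4 6 7 9
print(class_names[cats_sorted + 1])  # names of the sorted labels
```

If a reproducible, label-ordered split were wanted, sorting the categories before slicing would be one option; the chapter's script deliberately uses the first-appearance order instead.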

Once we have created the directories, the next step is to save the images in the correct directories, which we will do in the following code:

# loop through train images and save in the correct folder
for (i in 1:dim(x_train)[1])
{
  img <- x_train[i,,,]
  label <- y_train[i,1]
  if (label %in% categories[1:8])
    image_array_save(img,paste(train_dir1,label,"/",i,".png",sep=""))
  else
    image_array_save(img,paste(train_dir2,label,"/",i,".png",sep=""))
  if ((i %% 500)==0)
    print(i)
}

# loop through test images and save in the correct folder
for (i in 1:dim(x_test)[1])
{
  img <- x_test[i,,,]
  label <- y_test[i,1]
  if (label %in% categories[1:8])
    image_array_save(img,paste(valid_dir1,label,"/",i,".png",sep=""))
  else
    image_array_save(img,paste(valid_dir2,label,"/",i,".png",sep=""))
  if ((i %% 500)==0)
    print(i)
}
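
Once the loops have finished, a quick way to confirm that every image reached a class folder is to count the files in each one. The following sketch is not part of the chapter's script: the helper name count_per_class is hypothetical, and a small fake layout under tempdir() stands in for a real directory such as data1/train so that the example runs without the dataset.

```r
# Hypothetical helper: count files in each immediate subdirectory of split_dir
count_per_class <- function(split_dir) {
  class_dirs <- list.dirs(split_dir, full.names = TRUE, recursive = FALSE)
  counts <- vapply(class_dirs, function(d) length(list.files(d)), integer(1))
  names(counts) <- basename(class_dirs)
  counts
}

# Assumption: a tiny fake layout stands in for a directory like data1/train
demo_dir <- file.path(tempdir(), "count_demo")
for (cls in c("1", "6")) {
  dir.create(file.path(demo_dir, cls), recursive = TRUE, showWarnings = FALSE)
}
# Empty placeholder files play the role of the saved .png images
file.create(file.path(demo_dir, "1", c("a.png", "b.png")))
file.create(file.path(demo_dir, "6", "c.png"))

counts <- count_per_class(demo_dir)
print(counts)
```

On the real data, calling count_per_class(train_dir1) should report 5,000 files for each class and count_per_class(valid_dir1) should report 1,000, since CIFAR10 has 5,000 training and 1,000 test images per class.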

Finally, as we have done previously, we will run a sanity check to ensure that the images were saved correctly. Let's load 9 images from one category. We want to check that the images display correctly and that they all belong to the same class:

# plot some images to verify process
image_dir <- list.dirs(valid_dir1, full.names=FALSE, recursive=FALSE)[1]
image_dir <- paste(valid_dir1,image_dir,sep="")
img_paths <- paste(image_dir,list.files(image_dir),sep="/")

par(mfrow = c(3, 3))
par(mar=c(2,2,2,2))
for (i in 1:9)
{
  im <- load.image(img_paths[i])
  plot(im)
}

This produces the following plot:

Figure 11.1: Sample CIFAR10 images

This looks good! The images display correctly, and they all appear to be of the same class: cars. The images look blurry, but that is because they are low-resolution thumbnails of only 32 x 32 pixels.