The discriminator network

The discriminator network in pix2pix is inspired by the PatchGAN architecture. It contains eight convolutional blocks, followed by a flatten layer and a dense layer, as follows:

| Layer Name | Hyperparameters | Input Shape | Output Shape |
| --- | --- | --- | --- |
| 1st 2D Convolution Layer | filters=64, kernel_size=4, strides=2, padding='same' | (256, 256, 1) | (128, 128, 64) |
| Activation Layer | activation='leakyrelu', alpha=0.2 | (128, 128, 64) | (128, 128, 64) |
| 2nd 2D Convolution Layer | filters=128, kernel_size=4, strides=2, padding='same' | (128, 128, 64) | (64, 64, 128) |
| Batch Normalization Layer | | (64, 64, 128) | (64, 64, 128) |
| Activation Layer | activation='leakyrelu', alpha=0.2 | (64, 64, 128) | (64, 64, 128) |
| 3rd 2D Convolution Layer | filters=256, kernel_size=4, strides=2, padding='same' | (64, 64, 128) | (32, 32, 256) |
| Batch Normalization Layer | | (32, 32, 256) | (32, 32, 256) |
| Activation Layer | activation='leakyrelu', alpha=0.2 | (32, 32, 256) | (32, 32, 256) |
| 4th 2D Convolution Layer | filters=512, kernel_size=4, strides=2, padding='same' | (32, 32, 256) | (16, 16, 512) |
| Batch Normalization Layer | | (16, 16, 512) | (16, 16, 512) |
| Activation Layer | activation='leakyrelu', alpha=0.2 | (16, 16, 512) | (16, 16, 512) |
| 5th 2D Convolution Layer | filters=512, kernel_size=4, strides=2, padding='same' | (16, 16, 512) | (8, 8, 512) |
| Batch Normalization Layer | | (8, 8, 512) | (8, 8, 512) |
| Activation Layer | activation='leakyrelu', alpha=0.2 | (8, 8, 512) | (8, 8, 512) |
| 6th 2D Convolution Layer | filters=512, kernel_size=4, strides=2, padding='same' | (8, 8, 512) | (4, 4, 512) |
| Batch Normalization Layer | | (4, 4, 512) | (4, 4, 512) |
| Activation Layer | activation='leakyrelu', alpha=0.2 | (4, 4, 512) | (4, 4, 512) |
| 7th 2D Convolution Layer | filters=512, kernel_size=4, strides=2, padding='same' | (4, 4, 512) | (2, 2, 512) |
| Batch Normalization Layer | | (2, 2, 512) | (2, 2, 512) |
| Activation Layer | activation='leakyrelu', alpha=0.2 | (2, 2, 512) | (2, 2, 512) |
| 8th 2D Convolution Layer | filters=512, kernel_size=4, strides=2, padding='same' | (2, 2, 512) | (1, 1, 512) |
| Batch Normalization Layer | | (1, 1, 512) | (1, 1, 512) |
| Activation Layer | activation='leakyrelu', alpha=0.2 | (1, 1, 512) | (1, 1, 512) |
| Flatten Layer | | (1, 1, 512) | (512,) |
| Dense Layer | units=2, activation='softmax' | (512,) | (2,) |


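This table summarizes the architecture and configuration of the discriminator network. Each convolution uses strides=2 with padding='same', so it halves the height and width of its input. The flatten layer reshapes the final (1, 1, 512) tensor into a one-dimensional array of 512 values, which the dense layer maps to two softmax outputs.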
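As a quick sanity check on the shapes in the table, the following is a minimal sketch in plain Python that traces a tensor shape through the eight stride-2 convolutions, the flatten layer, and the dense layer. It assumes the usual rule for padding='same', where the output size is the input size divided by the stride, rounded up. The helper names conv_same_output and trace_discriminator_shapes are hypothetical and are used here for illustration only; batch normalization and activation layers are skipped because they do not change the shape.

```python
import math


def conv_same_output(size, stride):
    """Spatial size after a convolution with padding='same' (assumed rule)."""
    return math.ceil(size / stride)


def trace_discriminator_shapes(input_shape=(256, 256, 1)):
    """Return the tensor shape after each shape-changing layer in the table."""
    # Number of filters in each of the eight convolution layers.
    filters = [64, 128, 256, 512, 512, 512, 512, 512]
    height, width, channels = input_shape
    shapes = [input_shape]

    # Each convolution uses strides=2, so height and width are halved,
    # and the channel count becomes the number of filters.
    for n_filters in filters:
        height = conv_same_output(height, 2)
        width = conv_same_output(width, 2)
        channels = n_filters
        shapes.append((height, width, channels))

    # The flatten layer collapses all dimensions into one.
    shapes.append((height * width * channels,))

    # The dense layer has units=2, so it produces two outputs.
    shapes.append((2,))
    return shapes


shapes = trace_discriminator_shapes()
for before, after in zip(shapes, shapes[1:]):
    print(before, "->", after)
```

Running the sketch prints one line per shape-changing layer, which can be compared against the input and output shape columns of the table.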

The remaining layers in the discriminator network are covered in the Keras implementation of pix2pix section of this chapter.

We have now explored the architecture and configuration of both networks. Next, we will look at the objective function that is used to train pix2pix.