Understanding cGANs

A cGAN (conditional GAN) is a type of GAN that is conditioned on some extra information, y. This extra information is fed to the generator as an additional input layer. In a vanilla GAN, there is no control over the category of the generated images. By adding a condition y to the generator, we can generate images of a specific category; y can be any kind of data, such as a class label or integer data. In practice, a vanilla GAN is usually trained on a single category, because it is extremely difficult to architect one that handles multiple categories. A cGAN, however, can model multi-modal data, with a different condition for each category.
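As a minimal sketch of what conditioning the generator can look like, the snippet below builds a generator input by concatenating a noise vector with a one-hot encoding of the label y. Concatenation is only one common way to inject the condition, and the names `one_hot`, `sample_noise`, `generator_input`, and the choice of ten classes are assumptions made for illustration; they do not come from the text.

```python
import random

NOISE_DIM = 100   # dimension of the noise vector z used in this section
NUM_CLASSES = 10  # assumption: ten categories, e.g. the digits 0-9


def one_hot(label, num_classes=NUM_CLASSES):
    """Encode an integer class label y as a one-hot list."""
    if not 0 <= label < num_classes:
        raise ValueError(f"label {label} is outside [0, {num_classes})")
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def sample_noise(dim=NOISE_DIM, rng=random):
    """Draw a noise vector z from a standard normal distribution."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def generator_input(label, rng=random):
    """Concatenate the noise z with the encoded condition y."""
    return sample_noise(rng=rng) + one_hot(label)


rng = random.Random(0)            # seeded for reproducibility
x = generator_input(3, rng=rng)
print(len(x))                     # 110: 100 noise values plus 10 label values
print(x[NOISE_DIM:])              # the one-hot encoding of label 3
```

Changing the label while keeping the noise fixed is what lets a trained cGAN produce a sample of a chosen category.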

The architecture of a cGAN is shown in the following diagram:

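The training objective function for cGANs can be expressed as follows:

$$
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x|y)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z|y)))]
$$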

Here, G is the generator network and D is the discriminator network. The loss for the discriminator is $\log D(x|y)$ and the loss for the generator is $\log(1 - D(G(z|y)))$. We can say that G(z|y) models the distribution of our data given z and y. Here, z is a 100-dimensional noise vector drawn from a normal prior distribution.
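To make the two loss terms concrete, here is a small sketch that estimates them from batches of discriminator outputs. It assumes the standard cGAN formulation, in which the discriminator maximizes the mean of log D(x|y) plus the mean of log(1 - D(G(z|y))), and the generator minimizes the second term. The function names and the sample probabilities are hypothetical, chosen only for illustration.

```python
import math

EPS = 1e-12  # small constant that guards against log(0)


def generator_term(d_fake):
    """Mean of log(1 - D(G(z|y))); the generator tries to minimize this."""
    return sum(math.log(1.0 - p + EPS) for p in d_fake) / len(d_fake)


def discriminator_value(d_real, d_fake):
    """Batch estimate of V(D, G); the discriminator tries to maximize this.

    d_real: discriminator outputs D(x|y) on real samples.
    d_fake: discriminator outputs D(G(z|y)) on generated samples.
    """
    real_term = sum(math.log(p + EPS) for p in d_real) / len(d_real)
    return real_term + generator_term(d_fake)


# Hypothetical discriminator outputs (probabilities that a sample is real).
confident = discriminator_value(d_real=[0.9, 0.8], d_fake=[0.1, 0.2])
fooled = discriminator_value(d_real=[0.9, 0.8], d_fake=[0.7, 0.6])
print(confident > fooled)  # True: fooling D lowers the value D maximizes
```

In practice, frameworks usually express both terms through a binary cross-entropy loss over real and generated batches rather than computing the logarithms by hand.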