Paper Reading: Conditional Generative Adversarial Nets
Summary
The previous post covered the simplest GAN model; this post introduces CGAN, the Conditional GAN.
The basic GAN is a very simple model, yet it can already generate fairly realistic images. However, it offers no control over what it generates. The authors of CGAN propose that if we feed some conditioning information to the model during training, we can control the generator's output. The most common choice of conditioning information is the class label. In this paper, the authors use class labels to make the generator produce handwritten digits of a specified class.
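To make the conditioning idea concrete, here is a minimal NumPy sketch (an illustration, not the paper's code): the integer class label is turned into a one-hot vector and concatenated to the noise vector before it enters the generator. The function name `conditional_input` is hypothetical.

```python
import numpy as np

def conditional_input(z, labels, num_classes=10):
    """Concatenate noise vectors with one-hot class labels.

    z: [batch, z_dim] noise; labels: [batch] integer class ids.
    Toy sketch of the conditioning idea, not the paper's implementation.
    """
    one_hot = np.eye(num_classes)[labels]        # [batch, num_classes]
    return np.concatenate([z, one_hot], axis=1)  # [batch, z_dim + num_classes]

z = np.random.uniform(0.0, 1.0, size=(4, 100))  # noise prior in the unit hypercube
labels = np.array([0, 3, 3, 7])
gen_in = conditional_input(z, labels)
print(gen_in.shape)  # (4, 110)
```

In the paper's actual architecture (shown below), the noise and the label are instead passed through separate hidden layers before being combined, but the principle is the same: both networks see the label.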
Given the conditioning information $y$, the objective function becomes the following two-player minimax game:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x \mid y)\right] + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z \mid y))\right)\right]$$
The CGAN architecture is as follows:
Code
The code below follows the architecture described in the paper (TensorFlow 1.x; `fully_connect`, `weight_variable`, and `bias_variable` are simple helper functions assumed to be defined elsewhere):
generator:
> In the generator net, a noise prior z with dimensionality 100 was drawn from a uniform distribution within the unit hypercube. Both z and y are mapped to hidden layers with Rectified Linear Unit (ReLu) activation, with layer sizes 200 and 1000 respectively, before both being mapped to second, combined hidden ReLu layer of dimensionality 1200. We then have a final sigmoid unit layer as our output for generating the 784-dimensional MNIST samples.
discriminator:
> The discriminator maps x to a maxout layer with 240 units and 5 pieces, and y to a maxout layer with 50 units and 5 pieces. Both of the hidden layers mapped to a joint maxout layer with 240 units and 4 pieces before being fed to the sigmoid layer. (The precise architecture of the discriminator is not critical as long as it has sufficient power; we have found that maxout units are typically well suited to the task.)
- Generator

```python
def generator(z, y, training=True):
    # Noise z -> 200-unit hidden layer, label y -> 1000-unit hidden layer,
    # each followed by batch normalization and ReLU.
    h1_z = fully_connect(z, 200, name='g_h1_z_fc')
    h1_z = tf.nn.relu(tf.layers.batch_normalization(h1_z, training=training, name='g_h1_z_bn'))
    h1_y = fully_connect(y, 1000, name='g_h1_y_fc')
    h1_y = tf.nn.relu(tf.layers.batch_normalization(h1_y, training=training, name='g_h1_y_bn'))
    # Concatenate into a 1200-dimensional representation.
    h1 = tf.concat([h1_z, h1_y], axis=1)
    h1 = tf.nn.dropout(h1, keep_prob=0.5)
    # Combined 1200-unit hidden ReLU layer, as described in the paper.
    h2 = tf.nn.relu(fully_connect(h1, 1200, name='g_h2_fc'))
    # Final sigmoid layer producing the 784-dimensional MNIST sample.
    return tf.nn.sigmoid(fully_connect(h2, 784, name='g_h3_fc'))
```
- Discriminator

```python
def discriminator(images, y, reuse=False, training=True):
    # Reuse variables when the discriminator is called a second time
    # (e.g. on generated samples); assumes the caller wraps this function
    # in a variable scope.
    if reuse:
        tf.get_variable_scope().reuse_variables()
    # x -> maxout with 240 units / 5 pieces, y -> maxout with 50 units / 5 pieces.
    h1_img = maxout(images, 240, name='d_h1_img_maxout')
    h1_y = maxout(y, 50, name='d_h1_y_maxout')
    h1 = tf.concat([h1_img, h1_y], axis=1)
    h1 = tf.nn.dropout(h1, keep_prob=0.5)
    # Joint maxout layer with 240 units and 4 pieces.
    h2 = maxout(h1, 240, name='d_h2_maxout', pieces=4)
    h2 = tf.nn.dropout(h2, keep_prob=0.5)
    # Final sigmoid unit; return both the probability and the raw logit.
    h3 = fully_connect(h2, 1, name='d_h3_fc')
    return tf.nn.sigmoid(h3), h3
```
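The discriminator returns both the sigmoid output and the raw logit because GAN losses are usually computed from the logits for numerical stability. The paper does not spell out the training code, so the following is a NumPy sketch of the standard formulation: the discriminator pushes D(x|y) toward 1 on real data and D(G(z|y)) toward 0 on samples, while the generator minimizes the non-saturating -log D(G(z|y)).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gan_losses(real_logits, fake_logits):
    """Standard (conditional) GAN losses computed from discriminator logits.

    NumPy sketch of the usual objective, not code from the paper.
    """
    eps = 1e-12  # avoid log(0)
    d_real = sigmoid(real_logits)
    d_fake = sigmoid(fake_logits)
    # Discriminator: classify real as 1 and fake as 0.
    d_loss = -np.mean(np.log(d_real + eps)) - np.mean(np.log(1.0 - d_fake + eps))
    # Generator: non-saturating objective, maximize log D(G(z|y)).
    g_loss = -np.mean(np.log(d_fake + eps))
    return d_loss, g_loss

# A discriminator that is confident and correct gives a small d_loss
# and a large g_loss.
d_loss, g_loss = gan_losses(np.array([2.0, 3.0]), np.array([-2.0, -1.0]))
```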
- Maxout layer

```python
def maxout(x, channels_out, name, pieces=5):
    # Maxout: compute `pieces` independent linear transforms of the input
    # and take the element-wise maximum over them.
    channels_in = x.get_shape().as_list()[-1]
    with tf.variable_scope(name):
        w = weight_variable([channels_in, channels_out, pieces], name='weights')
        b = bias_variable([channels_out, pieces], name='bias')
        # tensordot over the channel axis: [batch, in] x [in, out, pieces]
        # -> [batch, out, pieces], then max over the pieces axis.
        return tf.reduce_max(tf.tensordot(x, w, axes=1) + b, axis=-1)
```
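To make the maxout computation concrete, here is a NumPy equivalent of the tensordot-plus-reduce_max pattern above (a shape-checking sketch, not TensorFlow code):

```python
import numpy as np

def maxout_np(x, w, b):
    """NumPy equivalent of the maxout layer above.

    x: [batch, channels_in], w: [channels_in, channels_out, pieces],
    b: [channels_out, pieces]. Each output unit is the maximum over
    `pieces` independent linear transforms of the input.
    """
    # tensordot over the input channel axis: [batch, channels_out, pieces]
    z = np.tensordot(x, w, axes=1) + b
    return z.max(axis=-1)  # [batch, channels_out]

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 784))       # a batch of flattened MNIST images
w = rng.standard_normal((784, 240, 5))  # 240 units, 5 pieces each
b = rng.standard_normal((240, 5))
out = maxout_np(x, w, b)
print(out.shape)  # (8, 240)
```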