【Posted】: 2020-03-19 18:03:52
【Question】:
I'm trying to port the Python DCGAN MNIST Colab example (https://www.tensorflow.org/tutorials/generative/dcgan) to Tensorflow.js. The generator model should be able to create images of handwritten digits resembling the MNIST sample data.
My code runs without errors, but I'm facing two main problems.
- The training process is much slower than the Python example, e.g. JS in the browser versus running the Python example in Google Colab.
- My generator model never actually produces handwritten digits.
It learns to the point of producing grid-like images, but never seems to learn much beyond that.
I believe the models are 1:1 ports. Here are my models.
// discriminator model
let dModel = tf.sequential();
const IMAGE_WIDTH = 28;
const IMAGE_HEIGHT = 28;
const IMAGE_CHANNELS = 1;
dModel.add(
  tf.layers.conv2d({
    inputShape: [IMAGE_WIDTH, IMAGE_HEIGHT, IMAGE_CHANNELS],
    kernelSize: [5, 5],
    filters: 64,
    strides: [2, 2],
    activation: "relu",
    kernelInitializer: "varianceScaling"
  })
);
dModel.add(tf.layers.leakyReLU());
dModel.add(tf.layers.dropout(0.3));
dModel.add(
  tf.layers.conv2d({
    kernelSize: [5, 5],
    filters: 128,
    strides: [2, 2],
    activation: "relu",
    kernelInitializer: "varianceScaling"
  })
);
dModel.add(tf.layers.leakyReLU());
dModel.add(tf.layers.dropout(0.3));
dModel.add(tf.layers.flatten());
const NUM_OUTPUT_CLASSES = 1;
dModel.add(tf.layers.dense({ units: NUM_OUTPUT_CLASSES }));
// generator model
let gModel = tf.sequential();
gModel.add(tf.layers.dense({ units: 7 * 7 * 256, inputShape: [100], useBias: false }));
gModel.add(tf.layers.batchNormalization());
gModel.add(tf.layers.leakyReLU());
gModel.add(tf.layers.reshape({ targetShape: [7, 7, 256] }));
gModel.add(tf.layers.conv2dTranspose({ filters: 128, kernelSize: [5, 5], strides: [1, 1], useBias: false, padding: "same" }));
gModel.add(tf.layers.batchNormalization());
gModel.add(tf.layers.leakyReLU());
gModel.add(tf.layers.conv2dTranspose({ filters: 64, kernelSize: [5, 5], strides: [2, 2], useBias: false, padding: "same" }));
gModel.add(tf.layers.batchNormalization());
gModel.add(tf.layers.leakyReLU());
gModel.add(tf.layers.conv2dTranspose({ filters: 1, kernelSize: [5, 5], strides: [2, 2], useBias: false, padding: "same", activation: "tanh" }));
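As a quick sanity check (my own sketch, not part of the codepen), the ported generator can be probed with a single noise vector to confirm that its output shape matches the 28x28x1 MNIST format:
// Sanity-check sketch: one latent vector in, one 28x28x1 image out.
const testNoise = tf.randomNormal([1, 100]); // single 100-dimensional noise vector
const testImage = gModel.predict(testNoise); // forward pass through the generator
console.log(testImage.shape);                // expected: [1, 28, 28, 1]
testNoise.dispose();
testImage.dispose();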
The loss functions are where I couldn't find a JS equivalent of Gradient Tape, so I designed them somewhat differently.
The Python example uses:
cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)

def discriminator_loss(real_output, fake_output):
    real_loss = cross_entropy(tf.ones_like(real_output), real_output)
    fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)
    total_loss = real_loss + fake_loss
    return total_loss

def generator_loss(fake_output):
    return cross_entropy(tf.ones_like(fake_output), fake_output)

def train_step(images):
    noise = tf.random.normal([BATCH_SIZE, noise_dim])

    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        generated_images = generator(noise, training=True)

        real_output = discriminator(images, training=True)
        fake_output = discriminator(generated_images, training=True)

        gen_loss = generator_loss(fake_output)
        disc_loss = discriminator_loss(real_output, fake_output)

    gradients_of_generator = gen_tape.gradient(gen_loss, generator.trainable_variables)
    gradients_of_discriminator = disc_tape.gradient(disc_loss, discriminator.trainable_variables)

    generator_optimizer.apply_gradients(zip(gradients_of_generator, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, discriminator.trainable_variables))
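For reference, a rough tfjs translation of those two loss functions might look like this (a sketch only, assuming that BinaryCrossentropy(from_logits=True) corresponds to tf.losses.sigmoidCrossEntropy applied to the raw discriminator logits):
// Sketch, not from the tutorial: tfjs counterparts of the Python losses above.
function discriminatorLoss(realOutput, fakeOutput) {
  const realLoss = tf.losses.sigmoidCrossEntropy(tf.onesLike(realOutput), realOutput);
  const fakeLoss = tf.losses.sigmoidCrossEntropy(tf.zerosLike(fakeOutput), fakeOutput);
  return realLoss.add(fakeLoss);
}
function generatorLoss(fakeOutput) {
  return tf.losses.sigmoidCrossEntropy(tf.onesLike(fakeOutput), fakeOutput);
}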
In my version I used optimizer.minimize instead. I'm not sure whether that could be over-training the discriminator and causing problems.
I also ended up repeating the model.predict calls inside each loss function, because otherwise I got the error: Please make sure the operations that use variables are inside the function f passed to minimize()
function trainStep() {
  const noise = tf.randomNormal([BATCH_SIZE, 100]);
  const fakeLabels = tf.ones([BATCH_SIZE], 'int32');
  const realLabels = tf.zeros([BATCH_SIZE], 'int32');

  const dLossCalc = () => {
    const fakeImages = gModel.predict(noise).add(1).div(2);
    let realImages = data.nextTrainBatch(BATCH_SIZE).xs;
    realImages = realImages.reshape([BATCH_SIZE, IMAGE_WIDTH, IMAGE_HEIGHT, 1]);
    realImages = realImages.sub(127.5).div(127.5); // normalize to 1, -1
    const fakeLogits = dModel.predict(fakeImages).reshape([BATCH_SIZE]);
    const realLogits = dModel.predict(realImages).reshape([BATCH_SIZE]);
    const fakeLoss = tf.losses.sigmoidCrossEntropy(fakeLabels.mul(0.98), fakeLogits);
    const realLoss = tf.losses.sigmoidCrossEntropy(realLabels, realLogits);
    const totalLoss = fakeLoss.add(realLoss);
    console.log('Disc Loss ' + totalLoss.dataSync());
    return totalLoss;
  };

  const gLossCalc = () => {
    const fakeImages = gModel.predict(noise).add(1).div(2);
    const logits = dModel.predict(fakeImages).reshape([BATCH_SIZE]);
    const loss = tf.losses.sigmoidCrossEntropy(fakeLabels, logits);
    console.log('Gen Loss ' + loss.dataSync());
    return loss;
  };

  dOptimizer.minimize(dLossCalc);
  gOptimizer.minimize(gLossCalc);
}
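A minimal driver loop for trainStep might look something like this (a sketch only; EPOCHS and BATCHES_PER_EPOCH are placeholder constants, not from my actual code, and the tf.tidy wrapper is just my attempt to keep intermediate tensors from piling up between steps):
// Sketch of a training loop; constants below are placeholders.
async function train() {
  const EPOCHS = 50;
  const BATCHES_PER_EPOCH = 100;
  for (let epoch = 0; epoch < EPOCHS; epoch++) {
    for (let i = 0; i < BATCHES_PER_EPOCH; i++) {
      tf.tidy(() => trainStep()); // dispose intermediate tensors created during the step
    }
    await tf.nextFrame(); // yield to the browser so the page stays responsive
  }
}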
At this point I've spent several hours on this, so any help would be appreciated.
The two main things I couldn't find equivalents for are Gradient Tape / apply_gradients and the tf.keras.losses.BinaryCrossentropy loss function. I'm using sigmoidCrossEntropy instead.
Here is a complete codepen example if anyone is willing to take a look: https://codepen.io/freeman-g/pen/KKpRyyX?editors=0010
As a side note, I noticed that applyGradients is not documented in the Tensorflow.js API docs and opened a GitHub issue about it: https://github.com/tensorflow/tfjs/issues/2897
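For what it's worth, the closest tfjs analogue to GradientTape + apply_gradients that I can see is tf.variableGrads combined with optimizer.applyGradients. Below is a rough, untested sketch; it assumes that trainableWeights[i].val exposes the underlying tf.Variable of a layers model, and it follows the Python example's labelling (real -> 1, fake -> 0):
// Rough sketch of a GradientTape-style step (untested, my own guess at the API usage).
function trainStepWithGrads(realImages) {
  tf.tidy(() => {
    const noise = tf.randomNormal([BATCH_SIZE, 100]);
    const ones = tf.ones([BATCH_SIZE]);
    const zeros = tf.zeros([BATCH_SIZE]);

    // Assumption: trainableWeights[i].val is the underlying tf.Variable.
    const dVars = dModel.trainableWeights.map(w => w.val);
    const gVars = gModel.trainableWeights.map(w => w.val);

    // Discriminator step: push real logits toward 1 and fake logits toward 0,
    // taking gradients only with respect to the discriminator's variables.
    const dGrads = tf.variableGrads(() => {
      const fakeImages = gModel.predict(noise);
      const realLogits = dModel.predict(realImages).reshape([BATCH_SIZE]);
      const fakeLogits = dModel.predict(fakeImages).reshape([BATCH_SIZE]);
      return tf.losses.sigmoidCrossEntropy(ones, realLogits)
        .add(tf.losses.sigmoidCrossEntropy(zeros, fakeLogits));
    }, dVars);
    dOptimizer.applyGradients(dGrads.grads);

    // Generator step: try to make the discriminator label fakes as real,
    // taking gradients only with respect to the generator's variables.
    const gGrads = tf.variableGrads(() => {
      const fakeLogits = dModel.predict(gModel.predict(noise)).reshape([BATCH_SIZE]);
      return tf.losses.sigmoidCrossEntropy(ones, fakeLogits);
    }, gVars);
    gOptimizer.applyGradients(gGrads.grads);
  });
}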
【Discussion】:
Tags: tensorflow.js