Keras autoencoder simple example has a strange output

【Question title】: Keras autoencoder simple example has a strange output
【Posted】: 2017-11-19 22:43:00
【Question】:

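I am trying to run a simple autoencoder in which all the training inputs are identical. The training data has 3 features, and the hidden layer has 3 nodes. I train the autoencoder on that input and then try to predict it again (encode/decode), so if the autoencoder simply passes everything through unchanged, it should work.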

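That is not what happens, however, and I am struggling to understand why. I am not sure whether something is wrong with my code or with my understanding of how the autoencoder is implemented. Here is the code for reference.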

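P.S. I have played with the number of epochs, the number of examples in the training set, and the batch size, scaled the training data to values between 0 and 1, and tracked the loss value, but none of that helped.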

from keras.layers import Input, Dense
from keras.models import Model
import numpy as np 
# this is the size of our encoded representations
encoding_dim = 3

x_train=np.array([[1,2,3],[1,2,3],[1,2,3],[1,2,3],[1,2,3],[1,2,3],[1,2,3],[1,2,3],[1,2,3]])
in= Input(shape=(3,))
encoded = Dense(encoding_dim, activation='relu')(in)
decoded = Dense(3, activation='sigmoid')(encoded)

# this model maps an input to its reconstruction
autoencoder = Model(in, decoded)
autoencoder.compile(optimizer='adadelta', loss='mse')

autoencoder.fit(x_train, x_train,
                epochs=100,
                batch_size=4)
autoencoder.predict(x_train)

The output should be the same as the input (or at least close to it), but this is what I get instead:

`Out[180]: 
array([[ 0.80265796,  0.89038897,  0.9100889 ],
       [ 0.80265796,  0.89038897,  0.9100889 ],
       [ 0.80265796,  0.89038897,  0.9100889 ],
       ..., 
       [ 0.80265796,  0.89038897,  0.9100889 ],
       [ 0.80265796,  0.89038897,  0.9100889 ],
       [ 0.80265796,  0.89038897,  0.9100889 ]], dtype=float32)`

Any help would be appreciated. Most likely I have misunderstood something, so hopefully this question is not that hard to answer.

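【Comments】:

Tags: python deep-learning keras autoencoder

【Answer 1】:

You can certainly build an autoencoder in Keras with the Sequential model, so I am not sure that the example you are referring to is really the "simplest possible autoencoder" you can create, as the article's author claims. Here is how I would do it: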

from keras.models import Sequential
from keras.layers import Dense

import numpy as np 

# this is the size of our encoded representations
encoding_dim = 3

np.random.seed(1)  # to ensure the same results

x_train=np.array([[1,2,3],[1,2,3],[1,2,3],[1,2,3],[1,2,3],[1,2,3],[1,2,3],[1,2,3],[1,2,3]])

autoencoder = Sequential([ 
              Dense(encoding_dim,input_shape=(3,)), 
              Dense(encoding_dim)
])

autoencoder.compile(optimizer='adadelta', loss='mse')

autoencoder.fit(x_train, x_train,
            epochs=127,
            batch_size=4, 
            verbose=2)

out=autoencoder.predict(x_train)
print(out)

When you run this example, you get

 ....
 Epoch 127/127
 - 0s - loss: 1.8948e-14
[[ 1.  2.  3.]
 [ 1.  2.  3.]
 [ 1.  2.  3.]
 [ 1.  2.  3.]
 [ 1.  2.  3.]
 [ 1.  2.  3.]
 [ 1.  2.  3.]
 [ 1.  2.  3.]
 [ 1.  2.  3.]]

which looks good...
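One way to see why this model can reach a loss of essentially zero: a Dense layer with no activation is just an affine map, so two of them stacked can represent the identity exactly. Here is a minimal sketch in plain Python, no Keras needed. The `dense` helper and the hand-picked weights are assumptions for illustration only, not what training actually learns.

```python
def dense(x, weights, bias):
    """Affine layer with no activation: y[j] = sum_i x[i] * weights[i][j] + bias[j]."""
    return [
        sum(x[i] * weights[i][j] for i in range(len(x))) + bias[j]
        for j in range(len(bias))
    ]

# Hypothetical weights chosen by hand for illustration, not learned values.
w_enc = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]  # encoder doubles each feature
w_dec = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.5]]  # decoder halves it again
no_bias = [0.0, 0.0, 0.0]

x = [1.0, 2.0, 3.0]
code = dense(x, w_enc, no_bias)
reconstruction = dense(code, w_dec, no_bias)
print(code)            # [2.0, 4.0, 6.0]
print(reconstruction)  # [1.0, 2.0, 3.0]
```

Any pair of weight matrices whose product is the identity would do the same, which is presumably why the optimizer finds a solution so easily here.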

【Comments】:

【Answer 2】:

Your input data is not normalized. After normalizing it as shown below, you can get the correct output.

import keras  # needed for keras.utils.normalize

x_train=np.array([[1,2,3],[1,2,3],[1,2,3],[1,2,3],[1,2,3],[1,2,3],[1,2,3],[1,2,3],[1,2,3]])
x_train=keras.utils.normalize(x_train)  # newly added line
 ....
 ....
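To make it concrete what the normalization does to this data, here is a rough sketch in plain Python. It assumes `keras.utils.normalize` is used with its defaults (L2 norm along the last axis); the `l2_normalize` helper is a hypothetical stand-in written for illustration, not the Keras implementation.

```python
import math

def l2_normalize(row):
    """Divide each value by the Euclidean norm of the row (assumed default behaviour)."""
    norm = math.sqrt(sum(v * v for v in row))
    return [v / norm for v in row]

normalized = l2_normalize([1.0, 2.0, 3.0])
print(normalized)  # every value now lies strictly between 0 and 1

# A sigmoid output can only produce values in (0, 1), so these targets are reachable.
reachable = all(0.0 < v < 1.0 for v in normalized)
print(reachable)
```

Note that the network then reconstructs the normalized values, so a prediction has to be multiplied by the original row norm before it can be compared with [1, 2, 3].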

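【Comments】:

That is not the issue.

【Answer 3】:

After applying @danche's suggestion, here are the updated code and results. I got these results after increasing the number of epochs to 10000.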

from keras.layers import Input, Dense
from keras.models import Model
import numpy as np
# this is the size of our encoded representations
encoding_dim = 3

x_train=np.array([[1,2,3],[1,2,3],[1,2,3],[1,2,3],[1,2,3],[1,2,3],[1,2,3],[1,2,3]])
input = Input(shape=(3,))
encoded = Dense(encoding_dim, activation='relu')(input)
decoded = Dense(3, activation='linear')(encoded)

# this model maps an input to its reconstruction
autoencoder = Model(input, decoded)
autoencoder.compile(optimizer='adadelta', loss='mse')

autoencoder.fit(x_train, x_train,epochs=10000,batch_size=4)
print(autoencoder.predict(x_train))



Epoch 10000/10000
8/8 [==============================] - 0s - loss: 2.4463e-04     
[[ 0.99124289  1.98534203  2.97887278]
 [ 0.99124289  1.98534203  2.97887278]
 [ 0.99124289  1.98534203  2.97887278]
 [ 0.99124289  1.98534203  2.97887278]
 [ 0.99124289  1.98534203  2.97887278]
 [ 0.99124289  1.98534203  2.97887278]
 [ 0.99124289  1.98534203  2.97887278]
 [ 0.99124289  1.98534203  2.97887278]]
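As a sanity check on the numbers above, the mean squared error can be recomputed by hand from the printed prediction. This is a minimal sketch in plain Python; the `mse` helper is illustrative, and it assumes the 'mse' loss averages the squared error over the three features.

```python
def mse(pred, target):
    """Mean of the squared differences between prediction and target."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

target = [1.0, 2.0, 3.0]
predicted = [0.99124289, 1.98534203, 2.97887278]  # one row of the output printed above

error = mse(predicted, target)
print(error)  # on the order of 2.5e-04
```

The result lands close to the logged loss of 2.4463e-04. A small gap is plausible, since the logged value is computed during the epoch while the weights are still being updated.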

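【Comments】:

【Answer 4】:

The mistake is here: decoded = Dense(3, activation='sigmoid')(encoded).

You should not use a sigmoid activation, because it restricts the output to the range (0, 1). Replace sigmoid with linear, or simply remove the activation, and you can also train for more epochs, e.g. 1000. With that setup, I get what you need: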

[[ 0.98220336  1.98066235  2.98398876]
 [ 0.98220336  1.98066235  2.98398876]
 [ 0.98220336  1.98066235  2.98398876]
 [ 0.98220336  1.98066235  2.98398876]
 [ 0.98220336  1.98066235  2.98398876]
 [ 0.98220336  1.98066235  2.98398876]
 [ 0.98220336  1.98066235  2.98398876]
 [ 0.98220336  1.98066235  2.98398876]
 [ 0.98220336  1.98066235  2.98398876]]
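To illustrate why the sigmoid output layer can never reproduce [1, 2, 3], here is a small sketch in plain Python. The `sigmoid` and `mse` helpers are written out by hand for illustration; the "reported" row is the output quoted in the question.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mse(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

target = [1.0, 2.0, 3.0]

# Even for large pre-activations the sigmoid stays at or below 1.
saturated = [sigmoid(z) for z in (10.0, 20.0, 30.0)]

# With every output capped at 1, the error cannot drop below (0 + 1 + 4) / 3.
lower_bound = mse([1.0, 1.0, 1.0], target)

# Error of the output row reported in the question.
reported = mse([0.80265796, 0.89038897, 0.9100889], target)

print(saturated)
print(lower_bound, reported)
```

So the strange output in the question is the network pushing all three values toward 1, which is the closest a sigmoid can get to the targets 2 and 3.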

Also, you should rename the input variable in to something else, because in is a keyword in Python :-).
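The keyword point is easy to check with the standard library's `keyword` module. In this short sketch, `inputs` is only a suggested replacement name, not something from the original code.

```python
import keyword

in_is_keyword = keyword.iskeyword("in")          # reserved word: cannot be assigned to
input_is_keyword = keyword.iskeyword("input")    # a builtin function, but not a keyword
inputs_is_keyword = keyword.iskeyword("inputs")  # suggested replacement name

print(in_is_keyword)      # True
print(input_is_keyword)   # False
print(inputs_is_keyword)  # False
```

Using `input` as a variable name is legal but shadows the builtin function, so a name like `inputs` avoids both problems.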

【Comments】: