[Title]: Get Cell, Input Gate, Output Gate and Forget Gate activation values for an LSTM network using Keras
[Posted]: 2019-06-11 16:42:33
[Question]:

I want to get the activation values of a trained LSTM network for a given input, specifically the values of the cell, the input gate, the output gate and the forget gate. According to this Keras issue and this Stackoverflow question I can get some activation values with the following code:

(Basically I am trying to classify one-dimensional time series with one label per time series, but that is not important for this general question.)

import random
from pprint import pprint

import keras.backend as K
import numpy as np
from keras.layers import Dense
from keras.layers.recurrent import LSTM
from keras.models import Sequential
from keras.utils import to_categorical

def getOutputLayer(layerNumber, model, X):
    return K.function([model.layers[0].input],
                      [model.layers[layerNumber].output])([X])

model = Sequential()
model.add(LSTM(10, batch_input_shape=(1, 1, 1), stateful=True))
model.add(Dense(2, activation='softmax'))
model.compile(
    loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')

# generate some test data
for i in range(10):
    # generate a random timeseries of 10 numbers
    X = np.random.rand(10)
    X = X.reshape(10, 1, 1)

    # generate a random label for the whole timeseries between 0 and 1
    y = to_categorical([random.randint(0, 1)] * 10, num_classes=2)

    # train the lstm for this one timeseries
    model.fit(X, y, epochs=1, batch_size=1, verbose=0)
    model.reset_states()

# to keep the output simple use only 5 steps for the input of the timeseries
X_test = np.random.rand(5)
X_test = X_test.reshape(5, 1, 1)

# get the activations for the output lstm layer
pprint(getOutputLayer(0, model, X_test))

Using this I get the following activation values for the LSTM layer:

[array([[-0.04106992, -0.00327154, -0.01524276,  0.0055838 ,  0.00969929,
        -0.01438944,  0.00211149, -0.04286387, -0.01102304,  0.0113989 ],
       [-0.05771339, -0.00425535, -0.02032563,  0.00751972,  0.01377549,
        -0.02027745,  0.00268653, -0.06011265, -0.01602218,  0.01571197],
       [-0.03069103, -0.00267129, -0.01183739,  0.00434298,  0.00710012,
        -0.01082268,  0.00175544, -0.0318702 , -0.00820942,  0.00871707],
       [-0.02062054, -0.00209525, -0.00834482,  0.00310852,  0.0045242 ,
        -0.00741894,  0.00141046, -0.02104726, -0.0056723 ,  0.00611038],
       [-0.05246543, -0.0039417 , -0.01877101,  0.00691551,  0.01250046,
        -0.01839472,  0.00250443, -0.05472757, -0.01437504,  0.01434854]],
      dtype=float32)]

So I get 10 values for each input step, since the LSTM in my Keras model has 10 units. But which of these is the cell, which is the input gate, which is the output gate, and which is the forget gate?

[Discussion]:

    Tags: python tensorflow keras deep-learning keras-layer


    [Solution 1]:

    Well, those are the output (hidden state) values. To get and inspect the value of each gate, have a look at this issue.

    I'll paste the important part here:

    for i in range(epochs):
        print('Epoch', i, '/', epochs)
        model.fit(cos,
                  expected_output,
                  batch_size=batch_size,
                  verbose=1,
                  nb_epoch=1,
                  shuffle=False)
    
        for layer in model.layers:
            if 'LSTM' in str(layer):
                # note: the per-gate attributes below (b_i, W_i, U_i, ...)
                # only exist in Keras 1.x; in Keras 2 they are fused into
                # layer.kernel, layer.recurrent_kernel and layer.bias
                print('states[0] = {}'.format(K.get_value(layer.states[0])))
                print('states[1] = {}'.format(K.get_value(layer.states[1])))
    
                print('Input')
                print('b_i = {}'.format(K.get_value(layer.b_i)))
                print('W_i = {}'.format(K.get_value(layer.W_i)))
                print('U_i = {}'.format(K.get_value(layer.U_i)))
    
                print('Forget')
                print('b_f = {}'.format(K.get_value(layer.b_f)))
                print('W_f = {}'.format(K.get_value(layer.W_f)))
                print('U_f = {}'.format(K.get_value(layer.U_f)))
    
                print('Cell')
                print('b_c = {}'.format(K.get_value(layer.b_c)))
                print('W_c = {}'.format(K.get_value(layer.W_c)))
                print('U_c = {}'.format(K.get_value(layer.U_c)))
    
                print('Output')
                print('b_o = {}'.format(K.get_value(layer.b_o)))
                print('W_o = {}'.format(K.get_value(layer.W_o)))
                print('U_o = {}'.format(K.get_value(layer.U_o)))
    
        # output for the first element of the batch after this fit();
        # get_LSTM_output is a backend function defined earlier in the
        # linked issue (analogous to getOutputLayer above)
        first_batch_element = np.expand_dims(cos[0], axis=1)  # (1, 1) to (1, 1, 1)
        print('output = {}'.format(get_LSTM_output([first_batch_element])[0].flatten()))
    
        model.reset_states()
    
    print('Predicting')
    predicted_output = model.predict(cos, batch_size=batch_size)
    
    print('Plotting Results')
    plt.subplot(2, 1, 1)
    plt.plot(expected_output)
    plt.title('Expected')
    plt.subplot(2, 1, 2)
    plt.plot(predicted_output)
    plt.title('Predicted')
    plt.show()
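    In Keras 2 the per-gate weights are no longer separate attributes: `kernel`, `recurrent_kernel` and `bias` each hold the four gates concatenated along the last axis, in the order input, forget, cell, output. As a sketch (assuming that standard layout; `lstm_gate_activations` is just an illustrative helper name), you can recompute every gate value for one timestep from `layer.get_weights()` with plain NumPy:

    ```python
    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def lstm_gate_activations(x_t, h_prev, c_prev, kernel, recurrent_kernel, bias):
        """Recompute one LSTM step and return every intermediate gate value.

        kernel:           shape (input_dim, 4 * units)
        recurrent_kernel: shape (units, 4 * units)
        bias:             shape (4 * units,)
        Gates are concatenated in the order i, f, c, o (Keras 2 convention).
        """
        z = x_t @ kernel + h_prev @ recurrent_kernel + bias
        z_i, z_f, z_c, z_o = np.split(z, 4)
        i = sigmoid(z_i)              # input gate
        f = sigmoid(z_f)              # forget gate
        c_bar = np.tanh(z_c)          # candidate cell value
        o = sigmoid(z_o)              # output gate
        c = f * c_prev + i * c_bar    # new cell state
        h = o * np.tanh(c)            # new hidden state = the layer's output
        return {'i': i, 'f': f, 'c_bar': c_bar, 'o': o, 'c': c, 'h': h}
    ```

    With the model from the question you would unpack `kernel, recurrent_kernel, bias = model.layers[0].get_weights()`, start `h` and `c` as zero vectors of length `units`, and feed each timestep through this function, carrying `h` and `c` forward. The `h` values should match the activations printed by `getOutputLayer` above.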
    

    [Discussion]:
