【Question title】: TF2: Compute gradients in keras callback in non-eager mode
【Posted】: 2020-05-03 01:51:46
【Question】:

TF version: 2.2.0-rc3 (in Colab)

I am using the following code (from "tf.keras get computed gradient during training") in a callback to compute the gradients of all parameters in the model.

    def on_train_begin(self, logs=None):
        # Functions return weights of each layer
        self.layerweights = []
        for lndx, l in enumerate(self.model.layers):
            if hasattr(l, 'kernel'):
                self.layerweights.append(l.kernel)

        input_tensors = [self.model.inputs[0],
                        self.model.sample_weights[0],
                        self.model.targets[0],
                        K.learning_phase()]

        # Get gradients of all the relevant layers at once
        grads = self.model.optimizer.get_gradients(self.model.total_loss, self.layerweights)
        self.get_gradients = K.function(inputs=input_tensors,outputs=grads)

However, when I run it, I get the following error.

AttributeError: 'Model' object has no attribute 'sample_weights'

The same error occurs for model.targets.

How can I get the gradients in a callback?

In eager mode, the solution from "Get Gradients with Keras Tensorflow 2.0" works. However, I want to do this in non-eager mode.

【Discussion】:

    Tags: tensorflow google-colaboratory tensorflow2.0


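    【Solution 1】:

    Here is end-to-end code that captures the gradients using the Keras backend. The gradient capture function is called from a callback passed to model.fit, so the gradients are captured at the end of every epoch. The code is compatible with both TensorFlow 1.x and TensorFlow 2.x, and I have run it in Colab. To run it under TensorFlow 1.x, replace the first statement of the program with %tensorflow_version 1.x and restart the runtime.

    Capturing the gradients of the model: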

    # Import dependencies (%tensorflow_version is a Colab-only magic command)
    %tensorflow_version 2.x
    from tensorflow import keras
    from tensorflow.keras import backend as K
    from tensorflow.keras import datasets
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense, Activation, Dropout, Flatten, Conv2D, MaxPooling2D
    from tensorflow.keras.layers import BatchNormalization
    import numpy as np
    import tensorflow as tf
    
    tf.keras.backend.clear_session()  # For easy reset of notebook state.
    tf.compat.v1.disable_eager_execution()
    
    # Import Data
    (train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()
    
    # Build Model
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(MaxPooling2D((2, 2)))
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(Flatten())
    model.add(Dense(64, activation='relu'))
    model.add(Dense(10))
    
    # Model Summary
    model.summary()
    
    # Model Compile 
    model.compile(optimizer='adam',
                  loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                  metrics=['accuracy'])
    
    # List that collects the gradients captured at the end of each epoch
    epoch_gradient = []
    
    # Define the Gradient Function
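    def get_gradient_func(model):
        # Symbolic gradients of the total loss w.r.t. every trainable weight
        grads = K.gradients(model.total_loss, model.trainable_weights)
        # Placeholders for inputs, targets and sample weights (private Keras attributes)
        inputs = model._feed_inputs + model._feed_targets + model._feed_sample_weights
        # Backend function that evaluates the gradients for the values fed in
        func = K.function(inputs, grads)
        return func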
    
    # Define the Required Callback Function
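    class GradientCalcCallback(keras.callbacks.Callback):
        def on_epoch_end(self, epoch, logs=None):
            # self.model is the model being trained; Keras sets it on the callback
            get_gradient = get_gradient_func(self.model)
            # Feed the full training set with a sample weight of 1 for every example
            grads = get_gradient([train_images, train_labels, np.ones(len(train_labels))])
            epoch_gradient.append(grads)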
    
    epoch = 4
    
    model.fit(train_images, train_labels, epochs=epoch, validation_data=(test_images, test_labels), callbacks=[GradientCalcCallback()])
    
    
    # Convert to a 2-dimensional array of shape (epochs, gradient tensors).
    # The tensors have different shapes, so they are stored as objects.
    gradient = np.asarray(epoch_gradient, dtype=object)
    print("Total number of epochs run:", epoch)
    print("Gradient Array has the shape:",gradient.shape)
    

    Output:

    Model: "sequential"
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    =================================================================
    conv2d (Conv2D)              (None, 30, 30, 32)        896       
    _________________________________________________________________
    max_pooling2d (MaxPooling2D) (None, 15, 15, 32)        0         
    _________________________________________________________________
    conv2d_1 (Conv2D)            (None, 13, 13, 64)        18496     
    _________________________________________________________________
    max_pooling2d_1 (MaxPooling2 (None, 6, 6, 64)          0         
    _________________________________________________________________
    conv2d_2 (Conv2D)            (None, 4, 4, 64)          36928     
    _________________________________________________________________
    flatten (Flatten)            (None, 1024)              0         
    _________________________________________________________________
    dense (Dense)                (None, 64)                65600     
    _________________________________________________________________
    dense_1 (Dense)              (None, 10)                650       
    =================================================================
    Total params: 122,570
    Trainable params: 122,570
    Non-trainable params: 0
    _________________________________________________________________
    Train on 50000 samples, validate on 10000 samples
    Epoch 1/4
    50000/50000 [==============================] - 73s 1ms/sample - loss: 1.8199 - accuracy: 0.3834 - val_loss: 1.4791 - val_accuracy: 0.4548
    Epoch 2/4
    50000/50000 [==============================] - 357s 7ms/sample - loss: 1.3590 - accuracy: 0.5124 - val_loss: 1.2661 - val_accuracy: 0.5520
    Epoch 3/4
    50000/50000 [==============================] - 377s 8ms/sample - loss: 1.1981 - accuracy: 0.5787 - val_loss: 1.2625 - val_accuracy: 0.5674
    Epoch 4/4
    50000/50000 [==============================] - 345s 7ms/sample - loss: 1.0838 - accuracy: 0.6183 - val_loss: 1.1302 - val_accuracy: 0.6083
    Total number of epochs run: 4
    Gradient Array has the shape: (4, 10)
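
The shape (4, 10) above corresponds to 4 epochs and 10 trainable weight tensors: a kernel and a bias for each of the three Conv2D layers and the two Dense layers. As a rough sketch of how the captured gradients might be inspected, the helper below reduces every gradient tensor to its L2 norm. The name gradient_norms and the stand-in data are assumptions made for illustration and are not part of the answer; in practice the real epoch_gradient list would be passed in.

```python
import numpy as np


def gradient_norms(epoch_gradient):
    """Return an (epochs, tensors) array with the L2 norm of each gradient tensor."""
    return np.array([[np.linalg.norm(g) for g in grads] for grads in epoch_gradient])


# Stand-in data laid out like epoch_gradient: one list per epoch, holding one
# array per trainable weight tensor. Shapes and values are made up.
rng = np.random.default_rng(0)
shapes = [(3, 3, 3, 32), (32,), (64, 10), (10,)]
fake_epoch_gradient = [[rng.normal(size=s) for s in shapes] for _ in range(2)]

norms = gradient_norms(fake_epoch_gradient)
print(norms.shape)  # (2, 4)
```

A norm that shrinks towards zero or grows sharply from one epoch to the next for a given tensor is a common first signal of vanishing or exploding gradients.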
    

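    Hope this answers your question. Happy learning.

    【Comments】:

    • Hi, thank you very much for the solution. For Keras in TF2 it needs some modifications. I have made a notebook with a complete working example for future reference: colab.research.google.com/drive/…
    • @v-i-s-h Thanks for sharing the reference. I have made the necessary changes to the answer so that keras is imported from tensorflow. Happy learning.
    • Won't calling get_gradient_func at the end of every batch create new nodes (on every call)?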
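
On the last comment: in graph (non-eager) mode, every call to K.gradients adds new gradient ops to the graph, so rebuilding the function inside the callback does grow the graph on each call (once per epoch in the answer's code). One possible way around this, sketched below, is to build the function once and reuse it. The LazyFunction wrapper and the stand-in builder are hypothetical and only illustrate the pattern; with the answer's code the builder would be something like `lambda: get_gradient_func(model)`.

```python
class LazyFunction:
    """Build an expensive callable on first use, then reuse it on later calls."""

    def __init__(self, build_fn):
        self._build_fn = build_fn  # zero-argument function that returns a callable
        self._func = None
        self.build_count = 0       # number of times build_fn actually ran

    def __call__(self, *args):
        if self._func is None:
            self._func = self._build_fn()
            self.build_count += 1
        return self._func(*args)


# Stand-in builder (hypothetical). With the answer's code this would wrap
# get_gradient_func(model), so that K.gradients runs only once.
def build_doubler():
    return lambda xs: [2 * x for x in xs]


get_gradient = LazyFunction(build_doubler)
for epoch in range(4):
    result = get_gradient([1, 2, 3])

print(get_gradient.build_count)  # 1
print(result)                    # [2, 4, 6]
```

The same effect could be had by building the function in on_train_begin and storing it on the callback instance.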