Warning when training a custom model using the TensorFlow Object Detection API

【Question title】: Warning when training a custom model using the TensorFlow Object Detection API
【Posted】: 2021-09-14 07:16:14
【Question】:

I have been following the TensorFlow object detection tutorial below to build a custom object detector.

https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/index.html

I followed the instructions given there and ran training with GPU support in Google Colab, and then on an AWS EC2 instance. In both cases I get warnings, and model training stops there.

I used the EfficientDet D6 model from the TensorFlow 2 Detection Model Zoo.

Below are the warnings at which model training stops.

WARNING:tensorflow:Unresolved object in checkpoint: (root).model._feature_extractor._bifpn_stage.node_input_blocks.7.0.1.1.axis
W0910 14:45:44.534728 140520822372160 util.py:203] Unresolved object in checkpoint: (root).model._feature_extractor._bifpn_stage.node_input_blocks.7.0.1.1.axis
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._feature_extractor._bifpn_stage.node_input_blocks.7.0.1.1.gamma
W0910 14:45:44.534780 140520822372160 util.py:203] Unresolved object in checkpoint: (root).model._feature_extractor._bifpn_stage.node_input_blocks.7.0.1.1.gamma
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._feature_extractor._bifpn_stage.node_input_blocks.7.0.1.1.beta
W0910 14:45:44.534832 140520822372160 util.py:203] Unresolved object in checkpoint: (root).model._feature_extractor._bifpn_stage.node_input_blocks.7.0.1.1.beta
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._feature_extractor._bifpn_stage.node_input_blocks.7.0.1.1.moving_mean
W0910 14:45:44.534884 140520822372160 util.py:203] Unresolved object in checkpoint: (root).model._feature_extractor._bifpn_stage.node_input_blocks.7.0.1.1.moving_mean
WARNING:tensorflow:Unresolved object in checkpoint: (root).model._feature_extractor._bifpn_stage.node_input_blocks.7.0.1.1.moving_variance
W0910 14:45:44.534937 140520822372160 util.py:203] Unresolved object in checkpoint: (root).model._feature_extractor._bifpn_stage.node_input_blocks.7.0.1.1.moving_variance
WARNING:tensorflow:A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights) but not all checkpointed values were used. See above for specific issues. Use expect_partial() on the load status object, e.g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings, or use assert_consumed() to make the check explicit. See https://www.tensorflow.org/guide/checkpoint#loading_mechanics for details.
W0910 14:45:44.534990 140520822372160 util.py:211] A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights

Any help or pointers would be appreciated.

【Question comments】:

Tags: tensorflow2.0 object-detection-api transfer-learning

【Solution 1】:

Use expect_partial() on the load status object,
e.g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings,
or use assert_consumed() to make the check explicit.
Official Document
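
To illustrate the difference between the two calls named in this answer, here is a minimal sketch using a toy checkpoint rather than the Object Detection API itself. The variable names `a` and `b` and the temporary directory are assumptions made for the example. It only shows what the load status object does. It does not explain why the training run in the question stops.

```python
import os
import tempfile

import tensorflow as tf

# Save a toy checkpoint that tracks two variables (hypothetical names).
saved = tf.train.Checkpoint(a=tf.Variable(1.0), b=tf.Variable(2.0))
path = saved.save(os.path.join(tempfile.mkdtemp(), "ckpt"))

# Restore into an object that only knows about "a"; "b" is never matched.
a = tf.Variable(0.0)
partial = tf.train.Checkpoint(a=a)

# expect_partial() only silences the "Unresolved object" warnings.
partial.restore(path).expect_partial()

# assert_consumed() makes the check explicit: it raises because "b"
# in the checkpoint was not matched to any Python object.
try:
    partial.restore(path).assert_consumed()
    consumed = True
except AssertionError:
    consumed = False

print("a =", float(a.numpy()), "| fully consumed:", consumed)
```

In both cases the matching variable `a` is restored. The calls differ only in how the unmatched values are reported.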

【Comments】:

Thanks for the reply, @Deep. However, expect_partial seems to be meant for loading a saved model:
    model = tf.keras.Model(...)
    tf.saved_model.save(model, path)  # or model.save(path, save_format='tf')
    checkpoint = tf.train.Checkpoint(model)
    checkpoint.restore(path).expect_partial()
These lines are from the official guide.
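I tried running SSD MobileNet V2 FPNLite 320x320 and got the following error.
ValueError: Cannot assign to variable WeightSharedConvolutionalBoxPredictor/WeightSharedConvolutionalClassHead/ClassPredictor/bias:0 due to variable shape (66,) and value shape (18,) are incompatible
WARNING:tensorflow:Unresolved object in checkpoint: (root).save_counter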
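After this error, the program prints the following warnings and unfortunately stops.
W0916 07:32:58.886545 140440812377920 util.py:203] Unresolved object in checkpoint: (root).save_counter
WARNING:tensorflow:A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights) but not all checkpointed values were used. See above for specific issues. Use expect_partial() on the load status object, e.g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings, or use assert_consumed() to make the check explicit.
W0916 07:32:58.886786 140440812377920 util.py:211] A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights) but not all checkpointed values were used. See above for specific issues. Use expect_partial() on the load status object, e.g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings, or use assert_consumed() to make the check explicit. See tensorflow.org/guide/checkpoint#loading_mechanics for details.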
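
The two shapes in the ValueError above can be read as arithmetic. This is a rough sketch, not a diagnosis. It assumes the class predictor bias has one entry per anchor per class slot (num_classes + 1 for background), and that the model uses 6 anchors per location, as in the stock SSD MobileNet V2 FPNLite 320x320 pipeline config (3 aspect ratios, 2 scales per octave). The helper name `implied_num_classes` is hypothetical.

```python
# Assumption: 6 anchors per location, as in the stock pipeline config.
ANCHORS_PER_LOCATION = 6


def implied_num_classes(bias_size, anchors_per_location=ANCHORS_PER_LOCATION):
    """Return the num_classes implied by a ClassPredictor bias length.

    Returns None when the length does not fit the assumed layout.
    """
    slots, remainder = divmod(bias_size, anchors_per_location)
    if remainder != 0 or slots < 2:
        return None
    return slots - 1  # one slot is reserved for the background class


model_classes = implied_num_classes(66)       # shape of the model variable
checkpoint_classes = implied_num_classes(18)  # shape of the checkpoint value
print("model:", model_classes, "| checkpoint:", checkpoint_classes)
```

Under these assumptions the model was built for 10 classes while the restored value comes from a 2-class head. This would suggest that the checkpoint being loaded was written with a different num_classes than the current configuration, for example one left over in the model directory from an earlier run. The post does not confirm this.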