【Question Title】: The extracted CNN features using TensorFlow are mostly zeros
【Posted】: 2018-05-17 02:22:25
【Question Description】:

I trained a CNN model with TensorFlow. Afterwards I extracted and saved the features of the fc1 layer, but I found that most of the feature values are zero.

My model is below; I use the h_fc1 layer as the feature vector. Training and testing seem fine, but I do not understand why the extracted features are mostly zeros. Is this normal, or have I made a mistake somewhere? I wonder how an input image can be represented by such a sparse feature vector. Any suggestions or hints would be appreciated, thanks.

def get_model(x):
    # First convolutional layer -- maps one grayscale image to 32 feature maps.
    W_conv1 = weight_variable([3, 3, 1, 32])
    b_conv1 = bias_variable([32])
    h_conv1 = tf.nn.relu(conv2d(x, W_conv1) + b_conv1)
    h_pool1 = max_pool_2x2(h_conv1)

    # Second convolutional layer -- maps 32 feature maps to 64.
    W_conv2 = weight_variable([3, 3, 32, 64])
    b_conv2 = bias_variable([64])
    h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
    h_pool2 = max_pool_2x2(h_conv2)

    # Third convolutional layer -- maps 64 feature maps to 128.
    W_conv3 = weight_variable([3, 3, 64, 128])
    b_conv3 = bias_variable([128])
    h_conv3 = tf.nn.relu(conv2d(h_pool2, W_conv3) + b_conv3)
    h_pool3 = max_pool_2x2(h_conv3)

    # Fourth convolutional layer -- maps 128 feature maps to 256.
    W_conv4 = weight_variable([3, 3, 128, 256])
    b_conv4 = bias_variable([256])
    h_conv4 = tf.nn.relu(conv2d(h_pool3, W_conv4) + b_conv4)
    h_pool4 = max_pool_2x2(h_conv4)

    # Fully connected layer 1
    # After four 2x2 poolings the maps are down to 4x4x256 -- map this to 1024 features.
    W_fc1 = weight_variable([4 * 4 * 256, 1024])
    b_fc1 = bias_variable([1024])

    h_pool4_flat = tf.reshape(h_pool4, [-1, 4 * 4 * 256])
    h_fc1 = tf.nn.relu(tf.matmul(h_pool4_flat, W_fc1) + b_fc1)

    # Dropout - controls the complexity of the model, prevents co-adaptation of features.
    keep_prob = tf.placeholder(tf.float32)
    h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

    # Fully connected layer 2 -- maps the 1024 features to 512.
    W_fc2 = weight_variable([1024, 512])
    b_fc2 = bias_variable([512])
    h_fc2 = tf.nn.relu(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
    h_fc2_drop = tf.nn.dropout(h_fc2, keep_prob)

    # Output layer -- maps the 512 features to FLAGS.nClasses logits.
    W_fc3 = weight_variable([512, FLAGS.nClasses])
    b_fc3 = bias_variable([FLAGS.nClasses])

    y_conv = tf.matmul(h_fc2_drop, W_fc3) + b_fc3
    # h_fc1 is returned as the extracted feature vector.
    return y_conv, keep_prob, h_fc1
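The helper functions weight_variable, bias_variable, conv2d and max_pool_2x2 are not shown in the question; a minimal sketch of what they presumably look like, in the standard TensorFlow 1.x MNIST-tutorial style (the actual definitions used in the question may differ):

import tensorflow as tf

def weight_variable(shape):
    # small random noise so units do not all start at the same value
    return tf.Variable(tf.truncated_normal(shape, stddev=0.1))

def bias_variable(shape):
    # small positive bias keeps ReLU units active at initialization
    return tf.Variable(tf.constant(0.1, shape=shape))

def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')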

【Question Comments】:

  • I found the cause: the ReLU activation function sets any negative value to zero.
  • The activations of a trained network depend on many factors: the training data, the function the network implements, the actual image or set of images used at inference time, and so on. It is not unreasonable to expect many zeros in some layers, especially with ReLU activations; a quick sparsity check is sketched below.
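For reference, a minimal sketch of such a check, assuming the fc1 features were saved to a NumPy array (the file name fc1_features.npy is hypothetical):

import numpy as np

# Hypothetical file produced by e.g. np.save(...) after sess.run(h_fc1, ...)
feats = np.load('fc1_features.npy')       # shape [num_images, 1024]
zero_fraction = np.mean(feats == 0.0)     # fraction of exactly-zero activations
print('fraction of zero activations: %.3f' % zero_fraction)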

Tags: tensorflow deep-learning feature-extraction


【Solution 1】:

Yes, you are right, this is most likely because of the ReLU. However, I would still suggest using the features taken after the ReLU.

That said, I think it would be a good experiment to try both (the features before and after the ReLU) and see which one works better. Good luck ;)
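If you want to run that comparison, a minimal sketch against the model above (h_fc1_pre is a name introduced here, not in the original code) is to also return the pre-activation fc1 tensor and fetch both at extraction time with dropout disabled:

# Inside get_model: keep the pre-activation value as well (sketch).
h_fc1_pre = tf.matmul(h_pool4_flat, W_fc1) + b_fc1   # features before ReLU
h_fc1 = tf.nn.relu(h_fc1_pre)                        # features after ReLU (many zeros expected)
# ... and return both: return y_conv, keep_prob, h_fc1, h_fc1_pre

# At extraction time, set keep_prob to 1.0 so dropout does not perturb the features:
# pre_feats, post_feats = sess.run([h_fc1_pre, h_fc1],
#                                  feed_dict={x: images, keep_prob: 1.0})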

【Discussion】:
