Why is a CNN in Python performing much worse than in Matlab? (Answers)

【Question title】: Why CNN in Python is performing much worse than in Matlab?
【Posted】: 2020-03-04 01:22:27
【Question】:

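I have trained a CNN in Matlab 2019b that performs binary classification. When this CNN is evaluated on the test dataset, it reaches about 95% accuracy. I used the exportONNXNetwork function so that I could run my CNN in Tensorflow, Keras. This is the code I am using to load the ONNX file in keras: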

import onnx
from onnx_tf.backend import prepare
import numpy as np
from numpy import array
from IPython.display import display
from PIL import Image

onnx_model = onnx.load("model.onnx")
tf_rep = prepare(onnx_model)
img = Image.open("image.jpg").resize((224,224))
img = array(img).reshape(1,3,224,224)
img = img.astype(np.uint8)

classification = tf_rep.run(img)
print(classification)

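When this Python code is tested on the same test dataset, it classifies almost everything as class 0, with only a few cases as class 1. I am not sure why this is happening.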

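【Comments】:

Did you apply any preprocessing steps to the test data in Matlab that are not accounted for here? Are you sure the image data from PIL.Image is in exactly the same format (e.g., 8-bit unsigned integers vs. floating point) as the data from imread, or whichever Matlab function you use to load the images?
I did not apply any preprocessing to the test data in Matlab, and both use uint8 as the format.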

Tags: python matlab tensorflow keras conv-neural-network

【Solution 1】:

At first glance, I think you need to permute the image axes rather than reshape:

img = Image.open("image.jpg").resize((224,224))
img = array(img).transpose(2, 0, 1)
img = np.expand_dims(img, 0)

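The image you get from PIL is in channels-last format, i.e. a tensor of shape (height, width, channels), in this case (224, 224, 3). Your model expects input in channels-first format, i.e. a tensor of shape (channels, height, width), in this case (3, 224, 224).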
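The layout conversion can be sketched without any image file. This is a minimal illustration, not the original poster's code: the helper name `hwc_to_nchw` is hypothetical, and the synthetic pure-red array stands in for what `np.array(Image.open(...).resize((224, 224)))` would return for an RGB image.

```python
import numpy as np


def hwc_to_nchw(hwc):
    """Turn a (height, width, channels) array into (1, channels, height, width)."""
    chw = np.transpose(hwc, (2, 0, 1))  # move the channel axis to the front
    return np.expand_dims(chw, 0)       # add a leading batch axis


# Assumed stand-in for an RGB image loaded through PIL: a pure-red 224x224 image.
hwc = np.zeros((224, 224, 3), dtype=np.uint8)
hwc[..., 0] = 255

batch = hwc_to_nchw(hwc)
print(hwc.shape)    # (224, 224, 3)
print(batch.shape)  # (1, 3, 224, 224)
# The red plane is all 255; the green and blue planes are all 0.
print(batch[0, 0].min(), batch[0, 1].max(), batch[0, 2].max())  # 255 0 0
```

Because each channel plane stays intact after the transpose, the model sees a uniformly red image, which is what the source pixels describe.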

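You need to move the last axis to the front. reshape does not move any data: NumPy reads the elements in C order (last axis index changing fastest) and simply regroups them into the new shape, which means your image ends up scrambled. An example makes this easier to see: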

>>> img = np.arange(48).reshape(4, 4, 3)
>>> img[0, 0, :]
array([0, 1, 2])

The RGB values of the pixel at (0, 0) are (0, 1, 2). If you use np.transpose(), this is preserved:

>>> img.transpose(2, 0, 1)[:, 0, 0]
array([0, 1, 2])

If you use reshape, your image gets scrambled:

>>> img.reshape(3, 4, 4)[:, 0, 0]
array([ 0, 16, 32])
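The same comparison can be run at the model's input size. The following is a sketch under stated assumptions: the random array is a stand-in for a real 224x224 RGB image, and `np.moveaxis` is shown only as an equivalent alternative to `transpose`, not as something the original answer used.

```python
import numpy as np

# Assumed stand-in for a real image: random uint8 pixels in channels-last layout.
rng = np.random.default_rng(0)
hwc = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)

transposed = hwc.transpose(2, 0, 1)  # permutes the axes
reshaped = hwc.reshape(3, 224, 224)  # only regroups the flat buffer

# Both results have the shape the model expects, but different contents.
print(transposed.shape == reshaped.shape)    # True
print(np.array_equal(transposed, reshaped))  # False

# transpose keeps the three channel values of each pixel together.
print(np.array_equal(transposed[:, 5, 7], hwc[5, 7, :]))  # True

# reshape's first "row" is just the first 224 values of the flat buffer,
# i.e. interleaved R, G, B values from the first ~75 pixels.
print(np.array_equal(reshaped[0, 0, :], hwc.ravel()[:224]))  # True

# np.moveaxis gives the same result as the transpose.
print(np.array_equal(np.moveaxis(hwc, -1, 0), transposed))  # True
```

Since the shapes match, reshape raises no error, which is why this kind of mistake tends to show up only as poor predictions.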

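【Comments】:

Could you explain why this is necessary?
I edited the answer to include an example; I hope it helps. You can find more details on channels-first vs. channels-last here.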