【Question title】: CNN doesn't learn simple geometric patterns
【Posted】: 2019-05-11 13:15:26
【Question】:

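This is probably a silly question, but I lack the background and have no more time to search for an answer, so I am asking here. I programmatically generated a training dataset of images of simple geometric shapes (triangles, squares, diamonds, and so on) and built a CNN with two convolutional layers, one pooling layer, and a final fully connected layer to classify the shapes. The network simply does not learn: the loss does not decrease. What could be the reason?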

In Caffe, the network configuration file "very_simple_one.prototxt" looks like this:

name: "very_simple_one"
layer {
  ##name: "input"
  name: "data"
  ##type: "Input"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mean_file: "images/train_valid_lmdb_mean.binaryproto"
  }
  data_param {
    source: "images/train_valid_lmdb"
    batch_size: 1000
    backend: LMDB
  }
  input_param {
    shape {
      dim: 1
      dim: 3
      dim: 200
      dim: 200
    }
  }
}
layer {
  ##name: "input"
  name: "data"
  ##type: "Input"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mean_file: "images/train_valid_lmdb_mean.binaryproto"
  }
  data_param {
    source: "images/test_lmdb"
    batch_size: 100
    backend: LMDB
  }
  input_param {
    shape {
      dim: 1
      dim: 3
      dim: 200
      dim: 200
    }
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 5
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 5
    stride: 5
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  convolution_param {
    num_output: 3
    kernel_size: 8
    stride: 8
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "fc3"
  type: "InnerProduct"
  bottom: "conv2"
  top: "fc3"
  inner_product_param {
    num_output: 3
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc3"
  bottom: "label"
}
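
To see how quickly this network shrinks the 200x200 input, the spatial sizes can be traced by hand. The sketch below assumes no padding (none is set in the config) and uses Caffe's rounding rules: Convolution rounds the output size down, Pooling rounds it up. The helper names are made up for illustration.

```python
import math

def conv_out(size, kernel, stride, pad=0):
    # Caffe's Convolution layer rounds the output size down.
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel, stride, pad=0):
    # Caffe's Pooling layer rounds the output size up.
    return math.ceil((size + 2 * pad - kernel) / stride) + 1

def trace_first_net(input_size=200):
    """Return (layer, channels, height, width) for each layer's output."""
    shapes = []
    s = conv_out(input_size, kernel=5, stride=5)
    shapes.append(("conv1", 50, s, s))
    s = pool_out(s, kernel=5, stride=5)
    shapes.append(("pool1", 50, s, s))
    s = conv_out(s, kernel=8, stride=8)
    shapes.append(("conv2", 3, s, s))
    return shapes

for name, c, h, w in trace_first_net():
    print(f"{name}: {c} x {h} x {w}")
```

Under these assumptions conv2 emits a single 3 x 1 x 1 vector per image, so fc3 sees only three numbers.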

The "solver.prototxt" looks like this:

net: "very_simple_one.prototxt"
type: "SGD"
test_iter: 15
test_interval: 100
base_lr: 0.05
lr_policy: "step"
gamma: 0.9999
stepsize: 100
display: 20
max_iter: 50000
snapshot: 2000
momentum: 0.9
weight_decay: 0.00000000000
solver_mode: GPU
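
As a quick check of what this schedule does: Caffe's "step" policy sets the learning rate to base_lr * gamma ^ floor(iter / stepsize). The sketch below plugs in the values from the solver file. The function name is made up for illustration.

```python
def step_lr(iteration, base_lr=0.05, gamma=0.9999, stepsize=100):
    # Caffe "step" policy: base_lr * gamma ^ floor(iteration / stepsize)
    return base_lr * gamma ** (iteration // stepsize)

for it in (0, 100, 10000, 50000):
    print(it, step_lr(it))
```

With gamma this close to 1, the rate drops by only about 5% over the full 50000 iterations, so training runs at roughly 0.05 throughout.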

I also tried AdaGrad, by commenting out "momentum" and changing "type" to AdaGrad. I train the network with this command:

....../caffe/build/tools/caffe train -solver solver.prototxt

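Every training run failed: the loss wanders within a very small range but never actually decreases.

Is the dataset simply untrainable, or is something wrong with the configuration files above?

Following Ibrahim Yousuf's suggestion, I also modified the network, replacing the pooling layer with a convolutional layer: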

name: "very_simple_one"
layer {
  ##name: "input"
  name: "data"
  ##type: "Input"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mean_file: "images/train_valid_lmdb_mean.binaryproto"
  }
  data_param {
    source: "images/train_valid_lmdb"
    batch_size: 1000
    backend: LMDB
  }
  input_param {
    shape {
      dim: 1
      dim: 3
      dim: 200
      dim: 200
    }
  }
}
layer {
  ##name: "input"
  name: "data"
  ##type: "Input"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mean_file: "images/train_valid_lmdb_mean.binaryproto"
  }
  data_param {
    source: "images/test_lmdb"
    batch_size: 100
    backend: LMDB
  }
  input_param {
    shape {
      dim: 1
      dim: 3
      dim: 200
      dim: 200
    }
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 50
    kernel_size: 5
    ##stride: 5
    stride: 2
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "conv1.5"
  type: "Convolution"
  bottom: "conv1"
  top: "conv1.5"
  convolution_param {
    num_output: 10
    kernel_size: 5
    stride: 2
  }
}
layer {
  name: "relu1.5"
  type: "ReLU"
  bottom: "conv1.5"
  top: "conv1.5"
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "conv1.5"
  top: "conv2"
  convolution_param {
    num_output: 3
    kernel_size: 8
    stride: 4
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "fc3"
  type: "InnerProduct"
  bottom: "conv2"
  top: "fc3"
  inner_product_param {
    num_output: 3
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc3"
  bottom: "label"
}
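
For reference, here is a rough tally of output sizes and learnable parameters in this revised network. It is a sketch that assumes no padding, Caffe's round-down rule for convolution output sizes, and a bias term on every layer (Caffe's default). The helper names are made up for illustration.

```python
def conv_out(size, kernel, stride):
    # Caffe's Convolution layer rounds the output size down (no padding here).
    return (size - kernel) // stride + 1

# (name, output channels, kernel size, stride), taken from the revised prototxt
CONV_LAYERS = [("conv1", 50, 5, 2), ("conv1.5", 10, 5, 2), ("conv2", 3, 8, 4)]

def summarize(input_channels=3, input_size=200, num_classes=3):
    """Return (layer, channels, spatial size, parameter count) per layer."""
    rows = []
    channels, size = input_channels, input_size
    for name, out_channels, kernel, stride in CONV_LAYERS:
        params = out_channels * channels * kernel * kernel + out_channels
        size = conv_out(size, kernel, stride)
        channels = out_channels
        rows.append((name, channels, size, params))
    fc_inputs = channels * size * size  # conv2 output flattened for fc3
    rows.append(("fc3", num_classes, 1, num_classes * fc_inputs + num_classes))
    return rows

for row in summarize():
    print(row)
```

Under these assumptions fc3 receives 3 x 10 x 10 = 300 inputs.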

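But the loss still does not decrease. Should I conclude that my dataset is the cause? The dataset is really small; if anyone is willing to help, I can upload it to a file-sharing service so you can download and test it.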

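【Comments】:

  • Remove the pooling layer, and first try to overfit a small portion of the data to gauge your network's capacity.
  • @Ibrahim Yousuf Thank you very much! But could you explain why the pooling layer should be removed?
  • Since you say the network is not learning, I would remove the pooling layer and try to overfit a small portion of the dataset. If it overfits, the network's formulation is fine.
  • @Ibrahim Yousuf I updated my post and replaced the pooling layer with a convolutional layer, but it did not help. Could you take a look at my post?
  • No, but the data has a single example per class and the variation is in the spatial dimensions, and CNNs are invariant to spatial variation. So you need more data with variation in the geometric objects' size and rotation, as well as in the background.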

Tags: caffe


【Solution 1】:

Solved. Class labels should start from zero, not from one: for a three-class problem, use 0, 1, 2 rather than 1, 2, 3.
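
The fix above can be sketched as a small relabeling step. SoftmaxWithLoss uses each label as an index into the fc3 output, so with num_output: 3 the valid labels are 0, 1 and 2. The sketch assumes the LMDB was built from a listing file with one "path label" pair per line, which is the format Caffe's convert_imageset tool reads. The post does not say how the dataset was actually built, and the file names below are hypothetical.

```python
def remap_labels(lines):
    """Rewrite 'path label' lines so labels become 0..K-1, keeping their order."""
    parsed = [line.rsplit(None, 1) for line in lines if line.strip()]
    old_labels = sorted({int(label) for _, label in parsed})
    mapping = {old: new for new, old in enumerate(old_labels)}
    remapped = [f"{path} {mapping[int(label)]}" for path, label in parsed]
    return remapped, mapping

def labels_are_valid(labels, num_classes):
    # Every label must be usable as an index into a num_classes-wide output.
    return all(0 <= label < num_classes for label in labels)

# Hypothetical listing with labels that start at one.
listing = ["triangle_001.png 1", "square_001.png 2", "diamond_001.png 3"]
fixed, mapping = remap_labels(listing)
print(fixed)    # labels now start at zero
print(mapping)  # old label -> new label
```

The relabeled listing would then be used to rebuild the LMDB before training again.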
