【Question title】: Remove errored arrays from training arrays
【Posted】: 2021-12-06 19:40:21
【Question】:

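I am trying to train a Sequential model for image classification. After inspecting the generated X_train and y_train arrays, I found that some entries in X_train are empty, which raises ValueError: setting an array element with a sequence when I try to run fit_generator.

X_train has shape (2122,) and y_train has shape (2122, 28).

Given that I know the indices, how can I safely remove the empty objects from both X_train and y_train?

X_train is generated by the following function: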

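def normalize_image(image):
    try:
        # Resize to img_size x img_size and scale pixel values to [0, 1].
        return np.array(cv2.resize(image, (img_size, img_size))).astype("float32") / 255.0
    except Exception as e:
        # The error is swallowed here, so the function returns None for this image.
        pass

X = np.array([normalize_image(img_data.get_pixels()) for img_data in imgs_data])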

It looks like this:

array([array([[[0.80784315, 0.84313726, 0.8784314 ],
    [0.80784315, 0.84313726, 0.8745098 ],
    [0.8039216 , 0.8509804 , 0.8666667 ],
    ...,
    [0.77254903, 0.78431374, 0.8039216 ],
    [0.7764706 , 0.7882353 , 0.80784315],
    [0.78039217, 0.7921569 , 0.8117647 ]],

   [[0.80784315, 0.84313726, 0.8784314 ],
    [0.80784315, 0.84313726, 0.8745098 ],
    [0.8039216 , 0.8509804 , 0.8666667 ],
    ...,
    [0.77254903, 0.78431374, 0.8039216 ],
    [0.76862746, 0.78039217, 0.8       ],
    [0.77254903, 0.78431374, 0.8039216 ]],

   [[0.80784315, 0.84313726, 0.8784314 ],
    [0.80784315, 0.84313726, 0.8745098 ],
    [0.8039216 , 0.8509804 , 0.8666667 ],
    ...,
    [0.77254903, 0.78431374, 0.8039216 ],
    [0.7764706 , 0.7882353 , 0.80784315],
    [0.77254903, 0.7882353 , 0.8039216 ]],

   ...,

   [[0.7921569 , 0.80784315, 0.8509804 ],
    [0.79607844, 0.8117647 , 0.85490197],
    [0.79607844, 0.8117647 , 0.85490197],
    ...,
    [0.23529412, 0.39607844, 0.5254902 ],
    [0.24313726, 0.39215687, 0.5294118 ],
    [0.23921569, 0.3882353 , 0.5254902 ]],

   [[0.7921569 , 0.80784315, 0.8509804 ],
    [0.79607844, 0.8117647 , 0.85490197],
    [0.79607844, 0.8117647 , 0.85490197],
    ...,
    [0.23529412, 0.39607844, 0.5254902 ],
    [0.24313726, 0.39215687, 0.5294118 ],
    [0.23137255, 0.38039216, 0.5176471 ]],

   [[0.7921569 , 0.80784315, 0.8509804 ],
    [0.79607844, 0.8117647 , 0.85490197],
    [0.79607844, 0.8117647 , 0.85490197],
    ...,
    [0.22352941, 0.38431373, 0.5137255 ],
    [0.24313726, 0.39215687, 0.5294118 ],
    [0.24313726, 0.39215687, 0.5294118 ]]], dtype=float32),
    ...
    [[0., 0., 0.],
    [0., 0., 0.],
    [0., 0., 0.],
    ...,
    [0., 0., 0.],
    [0., 0., 0.],
    [0., 0., 0.]],

   [[0., 0., 0.],
    [0., 0., 0.],
    [0., 0., 0.],
    ...,
    [0., 0., 0.],
    [0., 0., 0.],
    [0., 0., 0.]],

   [[0., 0., 0.],
    [0., 0., 0.],
    [0., 0., 0.],
    ...,
    [0., 0., 0.],
    [0., 0., 0.],
    [0., 0., 0.]]], dtype=float32)], dtype=object)

【Comments】:

    Tags: python numpy


    【Solution 1】:

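    Just do the following:

    X = np.array([v for v in [normalize_image(img_data.get_pixels()) for img_data in imgs_data] if v is not None and np.sum(v) > 0])
    

    This drops every image for which normalize_image returned None, along with images that contain only zeros.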

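    Edit:

    I assume you still need to apply the same filter to y, so you can handle both at once:

    pairs = [pair for pair in [(normalize_image(x.get_pixels()), y) for x, y in zip(imgs_data, Y)] if pair[0] is not None and np.sum(pair[0]) > 0]
    

    After that, you can create X and Y as follows: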

    X = np.array([x[0] for x in pairs])
    Y = np.array([x[1] for x in pairs])
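
If X_train and y_train already exist and you only want to drop the bad positions, a boolean mask applied to both arrays keeps them aligned. The following is a minimal sketch, not a tested drop-in: the helper name drop_failed_samples and the small synthetic arrays are made up for illustration, and it assumes a failed image shows up as None in X_train (which is what normalize_image returns when resizing fails).

```python
import numpy as np

def drop_failed_samples(X_obj, y):
    """Drop entries of X_obj that are None, plus the matching rows of y.

    X_obj: 1-D object array (or list) of images, None where loading failed.
    y:     label array whose first axis lines up with X_obj.
    """
    # True where an image is present, False where it is missing.
    keep = np.array([img is not None for img in X_obj], dtype=bool)
    bad_idx = np.flatnonzero(~keep)
    # np.stack joins the remaining equally shaped images into one regular
    # (n, height, width, channels) array instead of an object array.
    # It raises ValueError if no image survives the filter.
    X_clean = np.stack([img for img, k in zip(X_obj, keep) if k])
    y_clean = np.asarray(y)[keep]
    return X_clean, y_clean, bad_idx

# Synthetic stand-in data: four "images", two of which failed to load.
images = [
    np.ones((4, 4, 3), dtype="float32"),
    None,
    np.zeros((4, 4, 3), dtype="float32"),
    None,
]
X_train = np.empty(len(images), dtype=object)
for i, img in enumerate(images):
    X_train[i] = img
y_train = np.arange(4 * 28).reshape(4, 28)

X_clean, y_clean, bad_idx = drop_failed_samples(X_train, y_train)
print(X_clean.shape, y_clean.shape, bad_idx)
```

If you already have the list of bad indices, np.delete(y_train, bad_idx, axis=0) removes the same rows from the labels. Note that this sketch only removes missing entries; all-zero images are kept unless you add a check such as np.sum(img) > 0 to the mask.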
    
