Training a Keras classifier on MPII Human Pose dataset: answers

【Question title】: Training a Keras classifier on MPII Human Pose dataset
【Posted】: 2018-08-08 13:34:29
【Question】:

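I am trying to train a neural network in Keras using the MPII Human Pose dataset (found here). By default the dataset is in MATLAB format, but I loaded it into NumPy arrays with scipy.io.loadmat. However, I can't make sense of the resulting object: it appears to contain a single key named 'RELEASE', with the dataset's annotations as its value. My problem is that I don't know how to access the dataset and split it into individual annotations.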

Any help with this problem would be much appreciated.

【Comments】:

Tags: python machine-learning keras computer-vision

【Solution 1】:

The MPII annotations .mat file comes from MATLAB, where the data is stored as a struct. To process it with scipy.io.loadmat, you should add an argument like this:

import scipy.io as sio

matph = './mpii_human_pose_v1_u12_1.mat'
mat = sio.loadmat(matph, struct_as_record=False)  # add this argument

Let's print mat:

{'RELEASE': array([[<scipy.io.matlab.mio5_params.mat_struct object at 0x7f7b0ba51790>]],
  dtype=object), '__version__': '1.0', '__header__': 'MATLAB 5.0 MAT-file, Platform: GLNXA64, Created on: Tue Sep 23 22:09:02 2014', '__globals__': []}
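To see what the extra argument changes, here is a minimal sketch on a toy file. The file name `toy_release.mat` and its contents are made up for illustration and only loosely imitate the MPII layout. With the default `struct_as_record=True`, a MATLAB struct comes back as a NumPy structured array indexed by field name; with `struct_as_record=False`, it comes back as a `mat_struct` object that carries `_fieldnames`.

```python
import os
import tempfile

import scipy.io as sio

# Toy data (hypothetical, not the real MPII contents): savemat writes
# nested dicts as nested MATLAB structs.
toy = {"RELEASE": {"version": "12", "objpos": {"x": 100, "y": 210}}}
toy_path = os.path.join(tempfile.mkdtemp(), "toy_release.mat")
sio.savemat(toy_path, toy)

# Default: structs become NumPy structured arrays, indexed by field name.
rec = sio.loadmat(toy_path)["RELEASE"]
rec_fields = rec.dtype.names
rec_y = int(rec[0, 0]["objpos"][0, 0]["y"][0, 0])

# struct_as_record=False: structs become mat_struct objects.
obj = sio.loadmat(toy_path, struct_as_record=False)["RELEASE"][0, 0]
obj_fields = obj._fieldnames
obj_y = int(obj.objpos[0, 0].y[0, 0])

print(rec_fields, rec_y)
print(obj_fields, obj_y)
```

Either form holds the same data; the object form is simply easier to explore interactively, which is why the rest of this answer uses it.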

It is a dictionary, so we take its value:

release = mat['RELEASE']

Let's print some properties of release:

print(release, type(release), release.shape)
(array([[<scipy.io.matlab.mio5_params.mat_struct object at 0x7fd407de1790>]],
      dtype=object), <type 'numpy.ndarray'>, (1, 1))

release is an array whose single element is a scipy.io.matlab.mio5_params.mat_struct object. We can use two attributes of that object, __dict__ and _fieldnames, like this:

object1 = release[0,0]
print(object1._fieldnames)
['annolist', 'img_train', 'version', 'single_person', 'act', 'video_list']
annolist = object1.__dict__['annolist']
print(annolist, type(annolist), annolist.shape)
(array([[<scipy.io.matlab.mio5_params.mat_struct object at 0x7fd407de1810>,
        <scipy.io.matlab.mio5_params.mat_struct object at 0x7fd407de1850>,
        <scipy.io.matlab.mio5_params.mat_struct object at 0x7fd407de1910>,
        ...,
        <scipy.io.matlab.mio5_params.mat_struct object at 0x7fd3db7f1710>,
        <scipy.io.matlab.mio5_params.mat_struct object at 0x7fd3db7f1c50>,
        <scipy.io.matlab.mio5_params.mat_struct object at 0x7fd3db794490>]],
      dtype=object), <type 'numpy.ndarray'>, (1, 24987))

We get an array of 24987 elements, each of which is also a scipy.io.matlab.mio5_params.mat_struct object. So we can keep digging:

anno1 = annolist[0,0]
print(anno1._fieldnames)
['image', 'annorect', 'frame_sec', 'vididx']
annorect = anno1.__dict__['annorect']
print(annorect, type(annorect), annorect.shape)
(array([[<scipy.io.matlab.mio5_params.mat_struct object at 0x7fd407de19d0>,
        <scipy.io.matlab.mio5_params.mat_struct object at 0x7fd407de1710>]],
      dtype=object), <type 'numpy.ndarray'>, (1, 2))
anno2 = annorect[0,0]
print(anno2._fieldnames)
['scale', 'objpos']
objpos = anno2.__dict__['objpos']
print(objpos, type(objpos), objpos.shape)
(array([[<scipy.io.matlab.mio5_params.mat_struct object at 0x7fd398204b90>]],
      dtype=object), <type 'numpy.ndarray'>, (1, 1))
objpos1 = objpos[0,0]
print(objpos1._fieldnames)
['x', 'y']
y = objpos1.__dict__['y']
print(y, type(y), y.shape)
(array([[210]], dtype=uint8), <type 'numpy.ndarray'>, (1, 1))
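The repeated `[0,0]` indexing above can be avoided. As a hedged sketch, `loadmat` also accepts `squeeze_me=True`, which drops unit dimensions so that a 1x1 struct array comes back as the `mat_struct` itself, and fields can then be read as plain attributes instead of through `__dict__`. The toy file below is hypothetical and only mirrors the field names printed above. On the real file, a field such as `annorect` can hold one struct or several, so the sketch wraps each level in `np.atleast_1d` before looping.

```python
import os
import tempfile

import numpy as np
import scipy.io as sio

# Toy stand-in for the annotation file (hypothetical values).
toy = {"RELEASE": {"annolist": {"annorect": {"scale": 3.0,
                                             "objpos": {"x": 100, "y": 210}}}}}
toy_path = os.path.join(tempfile.mkdtemp(), "toy_release.mat")
sio.savemat(toy_path, toy)

mat = sio.loadmat(toy_path, struct_as_record=False, squeeze_me=True)
release = mat["RELEASE"]  # already a mat_struct, no [0, 0] needed

# np.atleast_1d gives uniform handling whether a field holds one struct or many.
for anno in np.atleast_1d(release.annolist):
    for rect in np.atleast_1d(anno.annorect):
        y = int(rect.objpos.y)
        print(rect.objpos._fieldnames, y)
```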

【Comments】:

【Solution 2】:

Inspired by bobxxxl's answer, I wrote a simple function that converts the object to dict format, plus a print_dataset_obj function to visualize it. Hope it helps.


import numpy as np
import scipy.io

# mat_path must point to the MPII annotations .mat file
decoded1 = scipy.io.loadmat(mat_path, struct_as_record=False)["RELEASE"]

must_be_list_fields = ["annolist", "annorect", "point", "img_train", "single_person", "act", "video_list"]

def generate_dataset_obj(obj):
    if type(obj) == np.ndarray:
        dim = obj.shape[0]
        if dim == 1:
            ret = generate_dataset_obj(obj[0])
        else:
            ret = []
            for i in range(dim):
                ret.append(generate_dataset_obj(obj[i]))

    elif type(obj) == scipy.io.matlab.mio5_params.mat_struct:
        # Newer SciPy releases expose this class as scipy.io.matlab.mat_struct.
        ret = {}
        for field_name in obj._fieldnames:
            field = generate_dataset_obj(obj.__dict__[field_name])
            if field_name in must_be_list_fields and type(field) != list:
                field = [field]
            ret[field_name] = field

    else:
        ret = obj

    return ret

def print_dataset_obj(obj, depth = 0, maxIterInArray = 20):
    prefix = "  "*depth
    if type(obj) == dict:
        for key in obj.keys():
            print("{}{}".format(prefix, key))
            print_dataset_obj(obj[key], depth + 1, maxIterInArray)
    elif type(obj) == list:
        for i, value in enumerate(obj):
            if i >= maxIterInArray:
                break
            print("{}{}".format(prefix, i))
            print_dataset_obj(value, depth + 1, maxIterInArray)
    else:
        print("{}{}".format(prefix, obj))

# Convert to dict
dataset_obj = generate_dataset_obj(decoded1)

# Print it out
print_dataset_obj(dataset_obj)
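Since the end goal is training a Keras model, the converted dictionary still has to become numeric arrays. What follows is only a sketch under assumptions: it expects one person entry shaped like the output of `generate_dataset_obj`, with joints stored under `annopoints` -> `point` and each point carrying `id`, `x`, and `y`. Those field names come from the MPII annotation description rather than from the printouts in this thread, so verify them against your own `print_dataset_obj` output. The helper names `joints_to_array` and `training_indices`, the 16-joint count, and the toy values are illustrative.

```python
import numpy as np

NUM_JOINTS = 16  # assumption: MPII annotates 16 body joints, ids 0..15


def joints_to_array(person, num_joints=NUM_JOINTS):
    """Return a (num_joints, 2) float array of (x, y); missing joints are NaN."""
    coords = np.full((num_joints, 2), np.nan, dtype=np.float32)
    annopoints = person.get("annopoints")
    if not isinstance(annopoints, dict):
        return coords  # entries without joint annotations stay all-NaN
    points = annopoints.get("point", [])
    if isinstance(points, dict):
        points = [points]  # a single joint may not have been wrapped in a list
    for point in points:
        joint_id = int(point["id"])
        coords[joint_id] = (float(point["x"]), float(point["y"]))
    return coords


def training_indices(dataset):
    """Indices of images whose img_train flag equals 1."""
    return [i for i, flag in enumerate(dataset["img_train"]) if int(flag) == 1]


# Toy input (hypothetical values) in the shape generate_dataset_obj returns.
toy_person = {"annopoints": {"point": [{"id": 0, "x": 620, "y": 394},
                                       {"id": 9, "x": 573, "y": 185}]}}
toy_dataset = {"img_train": [1, 0, 1]}

joints = joints_to_array(toy_person)
print(joints.shape, joints[9], training_indices(toy_dataset))
```

Keeping missing joints as NaN rather than zero makes it possible to mask them out of the loss later; whether that suits your model is a design choice, not something the dataset dictates.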

【Comments】: