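【Question title】: Adding multiple classes in Mask R-CNN
【Posted】: 2020-05-05 18:16:18
【Question】:

I am using Matterport Mask RCNN as my model, and I am trying to build my dataset for training. After thinking the problem below through, I believe what I am actually asking is: how do I add multiple classes (+ BG)?

I get the following AssertionError: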

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-21-c20768952b65> in <module>()
     15 
     16   # display image with masks and bounding boxes
---> 17   display_instances(image, bbox, masks, class_ids/4, train_set.class_names)

/usr/local/lib/python3.6/dist-packages/mask_rcnn-2.1-py3.6.egg/mrcnn/visualize.py in display_instances(image, boxes, masks, class_ids, class_names, scores, title, figsize, ax, show_mask, show_bbox, colors, captions)
    103         print("\n*** No instances to display *** \n")
    104     else:
--> 105         assert boxes.shape[0] == masks.shape[-1] == class_ids.shape[0]
    106 
    107     # If no axis is passed, create one and automatically call show()

AssertionError: 

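The problem seems to be that masks.shape[-1] == class_ids.shape[0] evaluates to False, which should not be the case.

I have now traced it to masks.shape[-1] being 4 times the value of class_ids.shape[0], and I think this may be related to having 4 classes in the data. Unfortunately, I have not yet worked out how to fix it.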

# load the masks for an image
def load_mask(self, image_id):
  # get details of image
  info = self.image_info[image_id]
  # define box file location
  path = info['annotation']
  # load XML
  boxes, w, h = self.extract_boxes(path)
  # create one array for all masks, each on a different channel
  masks = zeros([h, w, len(boxes)], dtype='uint8')
  # create masks
  class_ids = list()
  for i in range(len(boxes)):
    box = boxes[i]
    row_s, row_e = box[1], box[3]
    col_s, col_e = box[0], box[2]
    masks[row_s:row_e, col_s:col_e, i] = 1
    class_ids.append(self.class_names.index('Resistor'))
    class_ids.append(self.class_names.index('LED'))
    class_ids.append(self.class_names.index('Capacitor'))
    class_ids.append(self.class_names.index('Diode'))
    return masks, asarray(class_ids, dtype='int32')

# load the masks and the class ids
mask, class_ids = train_set.load_mask(image_id)
print(mask, "and", class_ids)

# display image with masks and bounding boxes
display_instances(image, bbox, mask, class_ids, train_set.class_names)

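【Comments】:

  • Have you verified that masks.shape[-1] == class_ids.shape[0] holds for your input?
  • Please reduce your problem to a minimal reproducible example and provide it as an update. Debugging a small example is easier than debugging the full code.
  • @IonicSolutions Thanks for the reply. For your first comment, I get False. Sorry for the lengthy code, I will reduce it (honestly, I am not 100% sure which part causes it).
  • No need to apologize! Now you know why the assertion fails. You should check what format display_instances expects for masks and class_ids.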
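
The check suggested in the comments can be made explicit before calling display_instances. This is a minimal sketch, assuming NumPy arrays shaped the way the assertion in the traceback implies (boxes as [N, 4], masks as [H, W, N], class_ids as [N]). The helper name `describe_instance_counts` is hypothetical and not part of mrcnn, and the arrays are toy data that imitate appending four class ids per box:

```python
import numpy as np


def describe_instance_counts(boxes, masks, class_ids):
    """Return the three instance counts that display_instances asserts are equal."""
    return {
        "boxes": boxes.shape[0],
        "masks": masks.shape[-1],
        "class_ids": class_ids.shape[0],
    }


# Toy data: 2 boxes and 2 mask channels, but 4 class ids appended per box.
boxes = np.zeros((2, 4), dtype="int32")
masks = np.zeros((8, 8, 2), dtype="uint8")
class_ids = np.asarray([1, 2, 3, 4] * 2, dtype="int32")

counts = describe_instance_counts(boxes, masks, class_ids)
print(counts)
print("consistent:", len(set(counts.values())) == 1)
```

Printing the three counts side by side shows which array disagrees, which is usually enough to locate the loop that appends too many or too few entries.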

Tags: python-3.x tensorflow tensorflow-datasets transfer-learning faster-rcnn


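【Solution 1】:

If you want to train on multiple classes, you can use the code below.

  1. In load_dataset, register each class with self.add_class(...), then change the last line so that class_ids covers the number of classes you have.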

     # define classes
     self.add_class("dataset", 1, "class1name")
     self.add_class("dataset", 2, "class2name")
     # define data locations
     images_dir = dataset_dir + '/images/'
     annotations_dir = dataset_dir + '/annots/'
     # find all images
     for filename in listdir(images_dir):
         # extract image id
         image_id = filename[:-4]
         # skip bad images
         if image_id in ['00090']:
             continue
         # skip all images after 150 if we are building the train set
         if is_train and int(image_id) >= 150:
             continue
         # skip all images before 150 if we are building the test/val set
         if not is_train and int(image_id) < 150:
             continue
         img_path = images_dir + filename
         ann_path = annotations_dir + image_id + '.xml'
         # add to dataset
         self.add_image('dataset', image_id=image_id, path=img_path, annotation=ann_path,class_ids=[0,1,2])
    
  2. You do not need to change anything in the following function.

     def extract_boxes(self, filename):
         # load and parse the file
         tree = ElementTree.parse(filename)
         # get the root of the document
         root = tree.getroot()
         # extract each bounding box
         boxes = list()
         for box in root.findall('.//bndbox'):
             xmin = int(box.find('xmin').text)
             ymin = int(box.find('ymin').text)
             xmax = int(box.find('xmax').text)
             ymax = int(box.find('ymax').text)
             coors = [xmin, ymin, xmax, ymax]
             boxes.append(coors)
         # extract image dimensions
         width = int(root.find('.//size/width').text)
         height = int(root.find('.//size/height').text)
         return boxes, width, height
    

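  3. In the function below, "if i == 0" refers to the first bounding box. For multiple bounding boxes (i.e. multiple classes), use i == 1, i == 2, and so on. Because the class is chosen by box position, this only works if every annotation file lists its boxes in the same class order.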

# load the masks for an image
def load_mask(self, image_id):
    # get details of image
    info = self.image_info[image_id]
    # define box file location
    path = info['annotation']
    # load XML
    boxes, w, h = self.extract_boxes(path)
    # create one array for all masks, each on a different channel
    masks = zeros([h, w, len(boxes)], dtype='uint8')
    # create masks
    class_ids = list()
    for i in range(len(boxes)):
        box = boxes[i]
        row_s, row_e = box[1], box[3]
        col_s, col_e = box[0], box[2]
        # print()
        if i == 0:
            masks[row_s:row_e, col_s:col_e, i] = 1
            class_ids.append(self.class_names.index('class1name'))
        else:
            # masks are binary per instance; the class is carried by class_ids
            masks[row_s:row_e, col_s:col_e, i] = 1
            class_ids.append(self.class_names.index('class2name'))
    # return boxes[0],masks, asarray(class_ids, dtype='int32') to check the points
    return masks, asarray(class_ids, dtype='int32')
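
To see why `self.class_names.index('class1name')` returns the id that was passed to add_class, here is a simplified stand-in for the registration logic. It is only a sketch: `TinyDataset` is hypothetical and imitates nothing beyond class registration, and it assumes that the background class occupies index 0 and that ids are registered consecutively from 1, as in the steps above. The four class names are the ones from the question.

```python
class TinyDataset:
    """Hypothetical stand-in that mimics class registration only (not mrcnn's Dataset)."""

    def __init__(self):
        # Assumption: index 0 is reserved for the background class.
        self.class_info = [{"source": "", "id": 0, "name": "BG"}]
        self.class_names = []

    def add_class(self, source, class_id, class_name):
        self.class_info.append({"source": source, "id": class_id, "name": class_name})

    def prepare(self):
        # Build the name list in registration order so list index == class id.
        self.class_names = [c["name"] for c in self.class_info]


dataset = TinyDataset()
for class_id, name in enumerate(["Resistor", "LED", "Capacitor", "Diode"], start=1):
    dataset.add_class("dataset", class_id, name)
dataset.prepare()

print(dataset.class_names)
print(dataset.class_names.index("Capacitor"))
```

Under these assumptions each box needs exactly one `class_ids.append(...)` call, so the length of class_ids matches the number of mask channels.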

【Comments】:

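    【Solution 2】:

    Adding multiple classes requires a few modifications:

    1) In load_dataset, register each class with self.add_class(...), then change the last line so that class_ids covers the number of classes you have.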

    # load the dataset definitions
    def load_dataset(self, dataset_dir, is_train=True):
        # define two classes
        self.add_class("dataset", 1, "car")
        self.add_class("dataset", 2, "rider")
        # define data locations
        images_dir = dataset_dir + '/images_mod/'
        annotations_dir = dataset_dir + '/annots_mod/'
        # find all images
        for filename in listdir(images_dir):
            # extract image id
            image_id = filename[:-4]
            # skip all images from 3000 onwards if we are building the train set
            if is_train and int(image_id) >= 3000:
                continue
            # skip all images before 3000 if we are building the test/val set
            if not is_train and int(image_id) < 3000:
                continue
            img_path = images_dir + filename
            ann_path = annotations_dir + image_id + '.xml'
            # add to dataset
            self.add_image('dataset', image_id=image_id, path=img_path, annotation=ann_path, class_ids=[0,1,2])
    

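    2) Next, in extract_boxes, iterate over each object element and read both its name and its bounding-box coordinates. If you have 2 classes and your XML files contain exactly those classes, you do not need an if statement when appending coordinates to boxes. If you want to train on fewer classes than the XML files contain, you need the if statement; otherwise every box will be treated as a mask.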

    # extract bounding boxes from an annotation file
    def extract_boxes(self, filename):
        # load and parse the file
        tree = ElementTree.parse(filename)
        # get the root of the document
        root = tree.getroot()
        # extract each bounding box
        boxes = list()
    
        for box in root.findall('.//object'):
            name = box.find('name').text
            xmin = int(box.find('./bndbox/xmin').text)
            ymin = int(box.find('./bndbox/ymin').text)
            xmax = int(box.find('./bndbox/xmax').text)
            ymax = int(box.find('./bndbox/ymax').text)
            coors = [xmin, ymin, xmax, ymax, name]
            if name=='car' or name=='rider':
                boxes.append(coors)
    
        # extract image dimensions
        width = int(root.find('.//size/width').text)
        height = int(root.find('.//size/height').text)
        return boxes, width, height 
    

    3) Finally, in load_mask, add an if-else statement so that each box gets the class id that matches its name.

    # load the masks for an image
    def load_mask(self, image_id):
        # get details of image
        info = self.image_info[image_id]
        # define box file location
        path = info['annotation']
        # load XML
        boxes, w, h = self.extract_boxes(path)
        # create one array for all masks, each on a different channel
        masks = zeros([h, w, len(boxes)], dtype='uint8')
        # create masks
        class_ids = list()
        for i in range(len(boxes)):
            box = boxes[i]
            row_s, row_e = box[1], box[3]
            col_s, col_e = box[0], box[2]
            if (box[4] == 'car'):
                masks[row_s:row_e, col_s:col_e, i] = 1
                class_ids.append(self.class_names.index('car'))
            else:
                # masks are binary per instance; the class is carried by class_ids
                masks[row_s:row_e, col_s:col_e, i] = 1
                class_ids.append(self.class_names.index('rider'))   
        return masks, asarray(class_ids, dtype='int32')
    

    In my case I needed 2 classes, while the XML files contained many more. With the code above, I got the following image (not reproduced here).
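
The three steps can also be tried end to end without the library. The sketch below uses a hypothetical inline annotation in the layout that the XPath expressions in extract_boxes expect. It replaces the if-else with a lookup by name, so the same code handles any number of classes. `extract_named_boxes`, `build_masks`, and `SAMPLE_XML` are illustrative names, and the `class_names` list with 'BG' at index 0 is an assumption about how the dataset's class list is laid out.

```python
from xml.etree import ElementTree

import numpy as np

# Hypothetical annotation, trimmed to the fields that are read below.
SAMPLE_XML = """
<annotation>
  <size><width>100</width><height>80</height></size>
  <object><name>car</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>50</xmax><ymax>60</ymax></bndbox>
  </object>
  <object><name>tree</name>
    <bndbox><xmin>0</xmin><ymin>0</ymin><xmax>5</xmax><ymax>5</ymax></bndbox>
  </object>
  <object><name>rider</name>
    <bndbox><xmin>30</xmin><ymin>10</ymin><xmax>40</xmax><ymax>40</ymax></bndbox>
  </object>
</annotation>
"""


def extract_named_boxes(xml_text, wanted):
    """Return [xmin, ymin, xmax, ymax, name] boxes for wanted classes, plus width and height."""
    root = ElementTree.fromstring(xml_text)
    boxes = []
    for obj in root.findall(".//object"):
        name = obj.find("name").text
        if name not in wanted:
            continue  # ignore classes we are not training on
        coords = [int(obj.find("./bndbox/" + tag).text)
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append(coords + [name])
    width = int(root.find(".//size/width").text)
    height = int(root.find(".//size/height").text)
    return boxes, width, height


def build_masks(boxes, width, height, class_names):
    """Build one binary mask channel and one class id per named box."""
    masks = np.zeros([height, width, len(boxes)], dtype="uint8")
    class_ids = []
    for i, (xmin, ymin, xmax, ymax, name) in enumerate(boxes):
        masks[ymin:ymax, xmin:xmax, i] = 1
        class_ids.append(class_names.index(name))  # exactly one id per box
    return masks, np.asarray(class_ids, dtype="int32")


class_names = ["BG", "car", "rider"]  # assumption: background sits at index 0
boxes, width, height = extract_named_boxes(SAMPLE_XML, wanted=class_names[1:])
masks, class_ids = build_masks(boxes, width, height, class_names)
print(masks.shape, class_ids)
```

Looking the id up by name keeps masks.shape[-1] and class_ids.shape[0] equal by construction, which is the condition the assertion in display_instances checks.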

    【Comments】:
