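【Posted】: 2017-07-18 20:26:18
【Question】:
I have code that gets the width and height of an image, along with the class, xmin, xmax, ymin, and ymax of each bounding box. But it is not clear to me how to populate the variables to generate tfrecords. Based on the code below,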
height = None # Image height
width = None # Image width
filename = None # Filename of the image. Empty if image is not from file
encoded_image_data = None # Encoded image bytes
image_format = None # b'jpeg' or b'png'
xmins = [] # List of normalized left x coordinates in bounding box (1 per box)
xmaxs = [] # List of normalized right x coordinates in bounding box (1 per box)
ymins = [] # List of normalized top y coordinates in bounding box (1 per box)
ymaxs = [] # List of normalized bottom y coordinates in bounding box (1 per box)
classes_text = [] # List of string class name of bounding box (1 per box)
classes = [] # List of integer class id of bounding box (1 per box)
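For multiple bounding boxes per image, how should xmin, xmax, ymin, ymax, and the classes be populated? Should they be row vectors or column vectors? Also, for the class text, should it list all the class names in the order of the bounding boxes? And what is the encoded image data?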
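To make the question concrete, here is a minimal sketch of one way the variables might be filled for an image with two boxes. It assumes the boxes are available in pixel coordinates and that a class-name-to-id mapping exists; `build_box_fields`, `read_encoded_image`, `label_map`, and the sample boxes are hypothetical names and values introduced only for illustration. Under these assumptions each variable is a flat Python list with one entry per box, all lists share the same box order (so row versus column orientation does not come up), and the encoded image data is the raw bytes of the JPEG or PNG file rather than a decoded pixel array.

```python
def build_box_fields(width, height, boxes, label_map):
    """Turn per-box pixel annotations into parallel, normalized lists.

    boxes: list of (class_name, xmin, ymin, xmax, ymax) in pixels (assumed layout).
    label_map: dict from class name to integer id (assumed to exist).
    """
    xmins, xmaxs, ymins, ymaxs = [], [], [], []
    classes_text, classes = [], []
    for name, xmin, ymin, xmax, ymax in boxes:
        # Normalize to [0, 1] by dividing by the image size.
        xmins.append(xmin / width)
        xmaxs.append(xmax / width)
        ymins.append(ymin / height)
        ymaxs.append(ymax / height)
        # Index i in every list describes the same box.
        classes_text.append(name.encode("utf8"))
        classes.append(label_map[name])
    return {
        "xmins": xmins, "xmaxs": xmaxs,
        "ymins": ymins, "ymaxs": ymaxs,
        "classes_text": classes_text, "classes": classes,
    }


def read_encoded_image(path):
    """Return the file's raw (still compressed) bytes, e.g. JPEG or PNG."""
    with open(path, "rb") as f:
        return f.read()


# Hypothetical example: a 640x480 image with two annotated boxes.
label_map = {"cat": 1, "dog": 2}
boxes = [("cat", 64, 48, 320, 240), ("dog", 320, 120, 640, 480)]
fields = build_box_fields(640, 480, boxes, label_map)
print(fields["xmins"], fields["classes_text"], fields["classes"])
```

The resulting lists would then be wrapped in the corresponding float, bytes, and int64 list features of a `tf.train.Example`; that step is left out here so the sketch runs without TensorFlow installed.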
【Discussion】:
Tags: tensorflow object-detection bounding-box