Video frames as inputs to the TensorFlow graph

【Question title】: Video frames as inputs to the Tensorflow graph
【Posted】: 2017-08-16 03:58:56
【Question】:

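More specifically, how do I create a custom reader that reads frames from a video and feeds them into the TensorFlow model graph?

Secondly, how can I use OpenCV to decode the frames for that custom reader, if that is possible?

Is there any code (in Python) that demonstrates this?

I am mainly working on emotion recognition through facial expressions, and the inputs in my database are videos.

Finally, I have tried using a Queue and a QueueRunner with a Coordinator, hoping to solve the problem. According to the documentation at https://www.tensorflow.org/programmers_guide/threading_and_queues, the QueueRunner runs an enqueue operation, which in turn takes an operation that creates one example. Can we use OpenCV inside that operation, so that it returns frames as the examples to enqueue?

Please note that my intent is for the enqueue and dequeue operations to happen simultaneously on different threads.

Here is my code so far: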

import cv2
import numpy as np
import tensorflow as tf

def deform_images(images):
    with tf.name_scope('current_image'):
        frames_resized = tf.image.resize_images(images, [90, 160])
        frame_gray = tf.image.rgb_to_grayscale(frames_resized, name='rgb_to_gray')
        frame_normalized = tf.divide(frame_gray, tf.constant(255.0), name='image_normalization')

        tf.summary.image('image_summary', frame_gray, 1)
        return frame_normalized

def queue_input(video_path, coord):
    global frame_index
    with tf.device("/cpu:0"):
        # keep looping infinitely

        # source: http://stackoverflow.com/questions/33650974/opencv-python-read-specific-frame-using-videocapture
        cap = cv2.VideoCapture(video_path)
        cap.set(1, frame_index)

        # read the next frame from the file, Note that frame is returned as a Mat.
        # So we need to convert that into a tensor.
        (grabbed, frame) = cap.read()

        # if the `grabbed` boolean is `False`, then we have
        # reached the end of the video file
        if not grabbed:
            coord.request_stop()
            return

        img = np.asarray(frame)
        frame_index += 1
        to_return = deform_images(img)
        print(to_return.get_shape())
        return to_return

frame_index = 1  # global frame counter read and advanced by queue_input

with tf.Session() as sess:
    merged = tf.summary.merge_all()
    train_writer = tf.summary.FileWriter('C:\\Users\\temp_user\\Documents\\tensorboard_logs', sess.graph)
    tf.global_variables_initializer()

    coord = tf.train.Coordinator()
    queue = tf.FIFOQueue(capacity=128, dtypes=tf.float32, shapes=[90, 160, 1])
    enqueue_op = queue.enqueue(queue_input("RECOLA-Video-recordings\\P16.mp4", coord))

    # Create a queue runner that will run one thread to enqueue
    # examples. In general, the queue runner class is used to create a number of threads cooperating to enqueue
    # tensors in the same queue.
    qr = tf.train.QueueRunner(queue, [enqueue_op] * 1)

    # Create a coordinator, launch the queue runner threads.
    # Note that the coordinator class helps multiple threads stop together and report exceptions to programs that wait
    # for them to stop.
    enqueue_threads = qr.create_threads(sess, coord=coord, start=True)

    # Run the training loop, controlling termination with the coordinator.
    for step in range(8000):
        print(step)
        if coord.should_stop():
            break

        frames_tensor = queue.dequeue(name='dequeue')
        step += 1

    coord.join(enqueue_threads)

train_writer.close()
cv2.destroyAllWindows()

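Thanks!!

【Question comments】:

How does this code perform? I need to do something similar, and feedback from someone who has already done it would be very helpful.

Tags: python opencv tensorflow video-streaming video-processing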

【Solution 1】:

tf.train.QueueRunner is not the most suitable mechanism for your purposes. In the code you have, the following line

enqueue_op = queue.enqueue(queue_input("RECOLA-Video-recordings\\P16.mp4", coord))

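creates an enqueue_op that enqueues a constant tensor every time it is run, namely the first frame returned by the queue_input function. Even though the QueueRunner calls it repeatedly, it always enqueues the same tensor: the one it was given when the operation was created. Instead, you can simply make the enqueue operation take a tf.placeholder as its argument and run it repeatedly in a loop, feeding it each frame you grab via OpenCV. Here is some code to guide you.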

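import threading

# Placeholder that receives one preprocessed frame per sess.run call.
frame_ph = tf.placeholder(tf.float32)
enqueue_op = queue.enqueue(frame_ph)

def enqueue():
  # Keep feeding frames until the coordinator asks the threads to stop.
  while not coord.should_stop():
    # Note: a fed value must be a NumPy array (or similar), not a tf.Tensor,
    # so queue_input has to return the frame as an array here.
    frame = queue_input(video_path, coord)
    if frame is None:  # queue_input returns None at the end of the video
      break
    sess.run(enqueue_op, feed_dict={frame_ph: frame})

threads = [threading.Thread(target=enqueue)]

for t in threads:
  t.start()

# Your dequeue and training code goes here
coord.join(threads)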
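
The threading pattern used above (one thread producing frames, another consuming them) can be tried out without TensorFlow or OpenCV. The following is only a minimal sketch using the standard library: `queue.Queue` stands in for `tf.FIFOQueue`, and `read_frames`, `producer` and `consume_all` are hypothetical names introduced for illustration, not part of the answer's code. Each fake "frame" is a one-element list rather than a real image.

```python
import queue
import threading


def read_frames(num_frames):
    """Hypothetical stand-in for repeated cv2.VideoCapture.read() calls."""
    for index in range(num_frames):
        yield [index]  # a fake one-pixel "frame"


def producer(frame_queue, num_frames):
    # Runs on its own thread and fetches a new frame on every iteration,
    # instead of re-enqueueing a value that was fixed once at setup time.
    for frame in read_frames(num_frames):
        frame_queue.put(frame)  # blocks while the queue is full
    frame_queue.put(None)  # sentinel: end of video


def consume_all(num_frames, capacity=4):
    # The bounded queue plays the role of the FIFO queue in the answer.
    frame_queue = queue.Queue(maxsize=capacity)
    worker = threading.Thread(target=producer, args=(frame_queue, num_frames))
    worker.start()

    received = []
    while True:
        frame = frame_queue.get()  # blocks while the queue is empty
        if frame is None:
            break
        received.append(frame)  # the training step would go here

    worker.join()
    return received
```

Calling `consume_all(10)` runs the producer and the consumer concurrently and returns the frames in the order they were read. The `None` sentinel takes over the job that `coord.request_stop()` does in the TensorFlow version.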

【Comments】:

【Solution 2】:

pip install video2tfrecord

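Description

In the course of a research project I had to generate tfrecords from raw video material in Python. Having come across many requests very similar to this thread, I made part of my code available at https://github.com/ferreirafabio/video2tfrecords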

【Comments】: