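Make prediction faster after training [increasing predicted video FPS]: Answers

【Question title】: Make prediction faster after training [Increasing predicted video fps]
【Posted】: 2021-11-08 03:39:15
【Question】:

I trained a segmentation model with MobileNetV3Large, but its processing time at prediction is poor: about 3.95 FPS.

I would like to reach at least 20 FPS. Sample code is attached. Thanks!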

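from imutils.video import VideoStream
from imutils.video import FPS
from tensorflow.keras.models import load_model
import tensorflow as tf
import numpy as np
import imutils
import time
import cv2

# `loss` and `dice_coefficient` are the custom functions used during training (definitions not shown)
model = load_model('model.h5', custom_objects={'loss': loss, "dice_coefficient": dice_coefficient}, compile=False)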

cap = VideoStream(src=0).start()
# warm up the camera for a couple of seconds
time.sleep(2.0)

# Start the FPS timer
fps = FPS().start()

while True:

    frame = cap.read()

    # Resize each frame
    resized_image = cv2.resize(frame, (256, 256))

    resized_image = tf.image.convert_image_dtype((resized_image/255.0), dtype=tf.float32).numpy()
    mask = model.predict(np.expand_dims(resized_image[:,:,:3], axis=0))[0]

    # show the output frame
    cv2.imshow("Frame", mask)

    key = cv2.waitKey(1) & 0xFF
    # Press 'q' key to break the loop
    if key == ord("q"):
        break

    # update the FPS counter
    fps.update()

# stop the timer
fps.stop()

# Display FPS Information: Total Elapsed time and an approximate FPS over the entire video stream
print("[INFO] Elapsed Time: {:.2f}".format(fps.elapsed()))
print("[INFO] Approximate FPS: {:.2f}".format(fps.fps()))

# Destroy windows and cleanup
cv2.destroyAllWindows()
# Stop the video stream
cap.stop()

EDIT-1

After float16 quantization, I loaded the model as tflite_model and fed the input (image) into it. But the result is even slower!! Is this the right approach?

interpreter = tf.lite.Interpreter('tflite_model.tflite')
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

....................... process .............

while True:
    
    .............  process ............
    
    interpreter.set_tensor(input_details[0]['index'], np.expand_dims(resized_image[:,:,:3], axis=0))
    interpreter.invoke()
    mask = interpreter.get_tensor(output_details[0]['index'])[0]
#     mask = model.predict(np.expand_dims(resized_image[:,:,:3], axis=0))[0]

    ............ display part ........

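【Comments】:

Use a different inference engine. Use different hardware. Use quantization (float16 or int8). Use pruning. Use a smaller network architecture.
I quantized with float16, but instead of going up, the FPS dropped even further! I am using the "EDIT-1" procedure for model prediction.
Is this the right approach for large-scale frame-by-frame prediction?

Tags: python image tensorflow opencv video-processing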

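【Solution 1】:

There are several ways to speed this up:

Model quantization:

TensorFlow Lite supports converting weights to 16-bit floating point; saving the model with tf.float16 weights is probably the simplest option. Alternatively, retraining with float16 or 8-bit (int8) precision may be faster still. See https://www.tensorflow.org/lite/performance/post_training_float16_quant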
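As a rough illustration of the float16 conversion mentioned above, here is a minimal sketch using the TensorFlow Lite converter. The helper name `convert_to_float16_tflite` is made up for this example, and the file names in the trailing comment are assumptions taken from the question.

```python
import tensorflow as tf


def convert_to_float16_tflite(keras_model):
    """Convert a Keras model to a TFLite flatbuffer with float16 weights.

    Hypothetical helper for illustration; returns the serialized model as bytes.
    """
    converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
    # Enable the default optimizations and restrict weight storage to float16.
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.target_spec.supported_types = [tf.float16]
    return converter.convert()


# Possible usage (assumes the 'model.h5' file from the question exists):
#   model = tf.keras.models.load_model("model.h5", compile=False)
#   with open("tflite_model.tflite", "wb") as f:
#       f.write(convert_to_float16_tflite(model))
```

Note that float16 quantization mainly halves the model size. When such a model runs on a CPU, the weights are by default dequantized back to float32, so a speedup is not guaranteed and depends on the hardware or delegate in use.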
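Model distillation: you can train a smaller model that is trained with the original model's loss function and also learns from the large model's outputs, so it can retain much of what the large model learned.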
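To make the distillation idea more concrete, here is a small NumPy sketch of one common formulation (soft targets with a temperature). This particular formulation, and the names `distillation_loss`, `temperature` and `alpha`, are assumptions for illustration and are not something the answer specifies. For segmentation, the class dimension is assumed to be the last axis, so the loss is averaged over all pixels.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend a hard-label loss with a soft-target (teacher) loss.

    student_logits, teacher_logits: arrays of shape (..., num_classes)
    labels: integer class ids of shape (...)
    """
    eps = 1e-12
    # Hard part: cross-entropy between student predictions and true labels.
    student_probs = softmax(student_logits)
    picked = np.take_along_axis(student_probs, labels[..., None], axis=-1)[..., 0]
    hard_loss = -np.log(picked + eps).mean()
    # Soft part: KL(teacher || student) on temperature-softened distributions.
    t = softmax(teacher_logits / temperature)
    s = softmax(student_logits / temperature)
    soft_loss = (t * (np.log(t + eps) - np.log(s + eps))).sum(axis=-1).mean()
    # The temperature**2 factor keeps the soft term on a comparable scale.
    return alpha * hard_loss + (1.0 - alpha) * temperature ** 2 * soft_loss
```

In practice the same expression would be written with TensorFlow ops so that gradients flow into the student model; the NumPy version only shows how the two terms are combined.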
Model pruning: you can compress the model by pruning it, which can make it faster. The TensorFlow documentation also covers pruning.
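The core idea behind magnitude pruning can be sketched in a few lines of NumPy. This is only an illustration of the principle: the function name `magnitude_prune` is hypothetical, and real pruning is normally applied gradually during fine-tuning (for example with the TensorFlow Model Optimization Toolkit) rather than in one shot.

```python
import numpy as np


def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude entries zeroed.

    sparsity: target fraction of entries to zero, in [0, 1).
    Ties at the threshold may zero slightly more entries than requested.
    """
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    magnitudes = np.abs(weights).ravel()
    k = int(sparsity * magnitudes.size)  # number of entries to remove
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude; everything at or below it is dropped.
    threshold = np.partition(magnitudes, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask
```

Zeroed weights make the model compress better, but whether inference actually gets faster depends on the runtime being able to exploit the sparsity, so it is worth measuring before and after.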

【Comments】:

Please see EDIT-1!
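Since EDIT-1 reports a slowdown, it may help to time the inference call on its own, separately from camera capture and display, before comparing variants. Below is a minimal sketch using only the standard library; `measure_fps`, `infer` and `sample` are hypothetical names, where `infer` stands for whatever callable runs a single prediction.

```python
import time


def measure_fps(infer, sample, warmup=5, runs=50):
    """Estimate calls per second of `infer(sample)`, excluding warm-up calls."""
    # Warm-up iterations absorb one-off costs such as graph tracing or caching.
    for _ in range(warmup):
        infer(sample)
    start = time.perf_counter()
    for _ in range(runs):
        infer(sample)
    elapsed = time.perf_counter() - start
    return runs / elapsed
```

For the Keras model in the question, `infer` could be something like `lambda x: model(x, training=False)`, which for single frames is often lighter than `model.predict`. For the TFLite path it could wrap the `set_tensor`, `invoke` and `get_tensor` calls from EDIT-1. `tf.lite.Interpreter` also accepts a `num_threads` argument that may be worth trying.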