【Posted】: 2021-05-11 17:45:45
【Problem description】:
I have the following code, in which the user can press `p` to pause the video, draw a bounding box around the object to be tracked, and then press Enter to track that object through the video feed:
import cv2
import sys

major_ver, minor_ver, subminor_ver = cv2.__version__.split('.')

if __name__ == '__main__':
    # Set up tracker.
    tracker_types = ['BOOSTING', 'MIL', 'KCF', 'TLD', 'MEDIANFLOW', 'GOTURN', 'MOSSE', 'CSRT']
    tracker_type = tracker_types[1]

    if int(minor_ver) < 3:
        tracker = cv2.Tracker_create(tracker_type)
    else:
        if tracker_type == 'BOOSTING':
            tracker = cv2.TrackerBoosting_create()
        if tracker_type == 'MIL':
            tracker = cv2.TrackerMIL_create()
        if tracker_type == 'KCF':
            tracker = cv2.TrackerKCF_create()
        if tracker_type == 'TLD':
            tracker = cv2.TrackerTLD_create()
        if tracker_type == 'MEDIANFLOW':
            tracker = cv2.TrackerMedianFlow_create()
        if tracker_type == 'GOTURN':
            tracker = cv2.TrackerGOTURN_create()
        if tracker_type == 'MOSSE':
            tracker = cv2.TrackerMOSSE_create()
        if tracker_type == "CSRT":
            tracker = cv2.TrackerCSRT_create()

    # Read video
    video = cv2.VideoCapture(0)  # 0 means webcam. To use a video file instead, replace 0 with "video_file.MOV"

    # Exit if video not opened.
    if not video.isOpened():
        print("Could not open video")
        sys.exit()

    while True:
        # Read a frame.
        ok, frame = video.read()
        if not ok:
            print('Cannot read video file')
            sys.exit()
        # Display the frame; press key `p` to pause the video and start tracking.
        if (0xFF & cv2.waitKey(10)) == ord('p'):
            break
        cv2.namedWindow("Image", cv2.WINDOW_NORMAL)
        cv2.imshow("Image", frame)
    cv2.destroyWindow("Image")

    # Select the bounding box
    bbox = (287, 23, 86, 320)
    # Comment out the line below to use the fixed bounding box above instead
    bbox = cv2.selectROI(frame, False)

    # Initialize tracker with first frame and bounding box
    ok = tracker.init(frame, bbox)

    while True:
        # Read a new frame
        ok, frame = video.read()
        if not ok:
            break

        # Start timer
        timer = cv2.getTickCount()

        # Update tracker
        ok, bbox = tracker.update(frame)

        # Calculate Frames per second (FPS)
        fps = cv2.getTickFrequency() / (cv2.getTickCount() - timer)

        # Draw bounding box
        if ok:
            # Tracking success
            p1 = (int(bbox[0]), int(bbox[1]))
            p2 = (int(bbox[0] + bbox[2]), int(bbox[1] + bbox[3]))
            cv2.rectangle(frame, p1, p2, (255, 0, 0), 2, 1)
        else:
            # Tracking failure
            cv2.putText(frame, "Tracking failure detected", (100, 80), cv2.FONT_HERSHEY_SIMPLEX, 0.75, (0, 0, 255), 2)

        # Display tracker type on frame
        cv2.putText(frame, tracker_type + " Tracker", (100, 20), cv2.FONT_HERSHEY_SIMPLEX, 0.75, (50, 170, 50), 2)

        # Display FPS on frame
        cv2.putText(frame, "FPS : " + str(int(fps)), (100, 50), cv2.FONT_HERSHEY_SIMPLEX, 0.75, (50, 170, 50), 2)

        # Display result
        cv2.imshow("Tracking", frame)

        # Exit if ESC pressed
        k = cv2.waitKey(1) & 0xff
        if k == 27:
            break
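For reference, the drawing step in the loop above converts the tracker's `(x, y, w, h)` box into the two corner points that `cv2.rectangle` expects. That conversion can be factored into a small helper (the function name is illustrative, not part of the original code):

```python
def bbox_to_corners(bbox):
    """Convert an (x, y, w, h) box, as returned by tracker.update(),
    into the top-left and bottom-right corner points cv2.rectangle needs."""
    x, y, w, h = bbox
    return (int(x), int(y)), (int(x + w), int(y + h))

p1, p2 = bbox_to_corners((287, 23, 86, 320))
# p1 == (287, 23), p2 == (373, 343)
```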
Now, instead of having the user pause the video and draw a bounding box around the object, how can I make the program automatically detect when a particular object I am interested in (in my case, a toothbrush) is introduced into the video feed, and then track it?
I found this article, which discusses how to detect objects in a video using ImageAI and YOLO:
from imageai.Detection import VideoObjectDetection
import os
import cv2

execution_path = os.getcwd()

camera = cv2.VideoCapture(0)

detector = VideoObjectDetection()
detector.setModelTypeAsYOLOv3()
detector.setModelPath(os.path.join(execution_path, "yolo.h5"))
detector.loadModel()

video_path = detector.detectObjectsFromVideo(
    camera_input=camera,
    output_file_path=os.path.join(execution_path, "camera_detected_1"),
    frames_per_second=29,
    log_progress=True)
print(video_path)
Now, YOLO can indeed detect a toothbrush; it is one of the 80-odd objects it detects by default. However, two things about this article make me think it is not the ideal solution:
- This method first analyzes every video frame (each frame takes roughly 1-2 seconds, so analyzing a 2-3 second webcam stream takes about a minute) and saves the detections to a separate video file. However, I want to detect the toothbrush in the webcam feed in real time. Is there a solution for this?
- The YOLO v3 model being used can detect all 80 objects, but I only want to detect 2 or 3 of them: the toothbrush, the person holding the toothbrush, and possibly the background if needed. So, is there a way to reduce the model's weight by restricting detection to just these 2 or 3 objects?
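On the real-time point: a common pattern (an assumption on my part, not something the article describes) is to run the expensive detector only every N frames and let a cheap tracker fill in the frames between, so the feed stays responsive. The scheduling logic can be sketched in plain Python with placeholder `detect` and `track` callables:

```python
def process_stream(frames, detect, track, detect_every=10):
    """Sketch of a detect-then-track schedule for a live feed.

    `detect` (expensive, e.g. a full YOLO pass) runs on every
    `detect_every`-th frame; `track` (cheap, e.g. tracker.update)
    handles all frames in between. Both are placeholders here,
    not real model calls.
    """
    results = []
    bbox = None
    for i, frame in enumerate(frames):
        if bbox is None or i % detect_every == 0:
            bbox = detect(frame)       # heavy: re-detect the object
        else:
            bbox = track(frame, bbox)  # light: follow the last box
        results.append(bbox)
    return results
```

With `detect_every=10`, a 30-frame stream triggers only 3 detector passes; the remaining 27 frames are handled by the tracker, which is what keeps the loop near real time.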
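On restricting the classes: filtering the detector's output to a whitelist of class names is straightforward, though note it only discards unwanted detections after the fact; it does not shrink the model itself (that would require retraining or pruning). A minimal sketch, assuming per-frame detections shaped like ImageAI's list of dicts with `name` and `percentage_probability` keys (the helper itself is hypothetical, not part of the ImageAI API):

```python
# Classes we care about; everything else is dropped.
WANTED = {"toothbrush", "person"}

def filter_detections(detections, wanted=WANTED, min_confidence=50.0):
    """Keep only detections whose class is in `wanted` and whose
    confidence (in percent) meets the threshold."""
    return [d for d in detections
            if d["name"] in wanted
            and d["percentage_probability"] >= min_confidence]

frame_detections = [
    {"name": "toothbrush", "percentage_probability": 91.2},
    {"name": "chair", "percentage_probability": 88.0},
    {"name": "person", "percentage_probability": 45.0},
]
print(filter_detections(frame_detections))
# keeps only the toothbrush: "chair" is not wanted, "person" is below 50%
```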
【Comments】:
- Why aren't you using the darknet framework?
- I don't know anything about it. I don't have much experience in computer vision; I'm just trying to get into the field. So if you think darknet could help solve this, I would appreciate it if you could write an answer explaining how.
Tags: python-3.x opencv deep-learning object-detection video-tracking