3D points projection using multiple cameras

【Question title】: 3D points projection using multiple cameras
【Posted】: 2019-09-13 12:42:43
【Question description】:

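I am using Python and OpenCV 3.4.

I have a system composed of 2 cameras, and I want to use it to track an object and get its trajectory, and then its speed.

I am currently able to calibrate the intrinsics and extrinsics of each of my cameras. I can track my object through the video and get its 2D coordinates in the image plane.

My problem now is that I want to project my points from the two 2D image planes to a 3D point. I have tried the triangulatePoints function, but it does not seem to work correctly. Here is my actual function for getting the 3D coordinates. It returns coordinates that seem somewhat off compared to the real coordinates.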

import cv2

# trajectory_utils, correspondingPoints and computeProjMat are the asker's
# own helpers; their code is not shown in the question.


def get_3d_coord(left_two_d_coords, right_two_d_coords):

    pt1 = left_two_d_coords.reshape((len(left_two_d_coords), 1, 2))
    pt2 = right_two_d_coords.reshape((len(right_two_d_coords), 1, 2))

    # Note: the distortion coefficients are loaded here but never used below.
    extrinsic_left_camera_matrix, left_distortion_coeffs, extrinsic_left_rotation_vector, \
        extrinsic_left_translation_vector = trajectory_utils.get_extrinsic_parameters(
            1)

    extrinsic_right_camera_matrix, right_distortion_coeffs, extrinsic_right_rotation_vector, \
        extrinsic_right_translation_vector = trajectory_utils.get_extrinsic_parameters(
            2)

    # returns arrays of the same size
    (pt1, pt2) = correspondingPoints(pt1, pt2)

    projection1 = computeProjMat(extrinsic_left_camera_matrix,
                                 extrinsic_left_rotation_vector, extrinsic_left_translation_vector)
    projection2 = computeProjMat(extrinsic_right_camera_matrix,
                                 extrinsic_right_rotation_vector, extrinsic_right_translation_vector)

    # out is a 4 x N array of homogeneous coordinates (X, Y, Z, W)
    out = cv2.triangulatePoints(projection1, projection2, pt1, pt2)

    # Divide by W to convert each homogeneous point to a 3D point
    point3D = []
    for idx in range(out.shape[1]):
        W = out[3][idx]
        point3D.append([out[0][idx] / W, out[1][idx] / W, out[2][idx] / W])

    return point3D

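Here are screenshots of the 2D trajectories I obtained for the two cameras:

Here are some screenshots of the 3D trajectories we obtained for the same cameras.

As you can see, the 2D trajectory does not look like the 3D trajectory, and I cannot get an accurate distance between two points. I just want to get the real coordinates, which means knowing the (almost) exact real distance a person has walked, even on a curved path.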
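
Once the 3D points can be trusted, the distance along a curved path and the speed follow directly from them. The sketch below sums the straight-line segments between consecutive points; the function names and the `fps` parameter (one 3D point per video frame at a known frame rate) are assumptions, since the question does not state the frame rate.

```python
import numpy as np


def path_length(points_3d):
    """Sum of straight-line distances between consecutive 3D points."""
    pts = np.asarray(points_3d, dtype=np.float64)
    return float(np.linalg.norm(np.diff(pts, axis=0), axis=1).sum())


def mean_speed(points_3d, fps):
    """Average speed in world units per second, assuming one point per frame."""
    pts = np.asarray(points_3d, dtype=np.float64)
    duration = (len(pts) - 1) / fps
    return path_length(pts) / duration


if __name__ == "__main__":
    # Toy values for illustration only: segments of length 5 and 12.
    toy = [[0, 0, 0], [3, 4, 0], [3, 4, 12]]
    print(path_length(toy))  # 17.0
```

The polyline length slightly underestimates a truly curved path, and tracking noise inflates it, so some smoothing of the trajectory may be worth considering before summing.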

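Edit to add reference data and examples

Here are some examples and input data to reproduce the problem. First, some data. 2D points for camera 1

Corresponding 2D points for camera 2

3D points we get from triangulatePoints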

"[0.15245444, 0.30141047, 0.5444277]"
"[0.33479974, 0.6477136, 0.25396818]"
"[0.6559921, 1.0416716, -0.2717265]"
"[1.1381898, 1.5703914, -0.87318224]"
"[1.7568599, 1.9649554, -1.5008119]"
"[2.406788, 2.302272, -2.0778883]"
"[3.078426, 2.6655817, -2.6113863]"

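In the following images, we can see the 2D trajectory (top line) and the 3D points reprojected into 2D (bottom line). The colors alternate to show which 3D point corresponds to which 2D point.

Finally, here is some data to reproduce the problem.

Camera 1: camera matrix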

5.462001610064596662e+02 0.000000000000000000e+00 6.382260289544193483e+02
0.000000000000000000e+00 5.195528638702176067e+02 3.722480290221320161e+02
0.000000000000000000e+00 0.000000000000000000e+00 1.000000000000000000e+00

Camera 2: camera matrix

4.302353276501239066e+02 0.000000000000000000e+00 6.442674231451971991e+02
0.000000000000000000e+00 4.064124751062329324e+02 3.730721752718034736e+02
0.000000000000000000e+00 0.000000000000000000e+00 1.000000000000000000e+00

Camera 1: distortion vector

-1.039009381799949928e-02 -6.875769941694849507e-02 5.573643708806085006e-02 -7.298826373638074051e-04 2.195279856716004369e-02

Camera 2: distortion vector

-8.089289768586239993e-02 6.376634681503455396e-04 2.803641672679824115e-02 7.852965318823987989e-03 1.390248981867302919e-03

Camera 1: rotation vector

1.643658457134109296e+00
-9.626823326237364531e-02
1.019865700311696488e-01

Camera 2: rotation vector

1.698451227150894471e+00
-4.734769748661146055e-02
5.868343803315514279e-02

Camera 1: translation vector

-5.004031689969588026e-01
9.358682517577661120e-01
2.317689087311113116e+00

Camera 2: translation vector

-4.225788801112133619e+00
9.519952012307866251e-01
2.419197507326224184e+00

Camera 1: object points

Camera 2: object points

Camera 1: image points

5.180000000000000000e+02 5.920000000000000000e+02
5.480000000000000000e+02 4.410000000000000000e+02
6.360000000000000000e+02 5.910000000000000000e+02
6.020000000000000000e+02 4.420000000000000000e+02
7.520000000000000000e+02 5.860000000000000000e+02
6.500000000000000000e+02 4.430000000000000000e+02
8.620000000000000000e+02 5.770000000000000000e+02
7.000000000000000000e+02 4.430000000000000000e+02
9.600000000000000000e+02 5.670000000000000000e+02
7.460000000000000000e+02 4.430000000000000000e+02

Camera 2: image points

6.080000000000000000e+02 5.210000000000000000e+02
6.080000000000000000e+02 4.130000000000000000e+02
7.020000000000000000e+02 5.250000000000000000e+02
6.560000000000000000e+02 4.140000000000000000e+02
7.650000000000000000e+02 5.210000000000000000e+02
6.840000000000000000e+02 4.150000000000000000e+02
8.400000000000000000e+02 5.190000000000000000e+02
7.260000000000000000e+02 4.160000000000000000e+02
9.120000000000000000e+02 5.140000000000000000e+02
7.600000000000000000e+02 4.170000000000000000e+02

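【Comments】:

If possible, please provide your test samples (images) and the expected/returned results.
@Oliort I added some images to illustrate my problem; let me know if you need more.
Please describe how the 3D trajectory is drawn on the 2D images you added. Do you project the resulting 3D points onto the 2D images using the same projection parameters you used to compute them? Are the point sets for the two images the same size? It seems that the trajectory is longer in the second image. So does some part of the trajectory in the second image consist of points that have no corresponding position in the first image?
It seems to me that the left part of the projected 3D trajectory (if that is what is being done) is pretty good for both 3D results. Isn't it?
@Oliort The first two images are just the 2D points we collected, drawn on the picture. The last two images are the projection of the 3D points. To reproject our 3D points, we multiply them by the camera's projection matrix and then divide the resulting x and y by z. You are right, our first set of points looks correct, but that is not always the case. I have doubts about the reliability of our solution. What do you think? How would you solve this problem?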

Tags: python opencv 3d-reconstruction

【Solution 1】:

Assuming both of your resolutions are 1280x720, I computed the rotation and translation of the left camera.

import cv2
import numpy as np

left_obj = np.array([[
        [0, 0, 0],   
        [0, 3, 0],   
        [0.5, 0, 0], 
        [0.5, 3, 0], 
        [1, 0, 0],  
        [1 ,3, 0], 
        [1.5, 0, 0], 
        [1.5, 3, 0], 
        [2, 0, 0],   
        [2, 3, 0] 
    ]], dtype=np.float32)

left_img = np.array([[
        [5.180000000000000000e+02, 5.920000000000000000e+02],
        [5.480000000000000000e+02, 4.410000000000000000e+02],
        [6.360000000000000000e+02, 5.910000000000000000e+02],
        [6.020000000000000000e+02, 4.420000000000000000e+02],
        [7.520000000000000000e+02, 5.860000000000000000e+02],
        [6.500000000000000000e+02, 4.430000000000000000e+02],
        [8.620000000000000000e+02, 5.770000000000000000e+02],
        [7.000000000000000000e+02, 4.430000000000000000e+02],
        [9.600000000000000000e+02, 5.670000000000000000e+02],
        [7.460000000000000000e+02, 4.430000000000000000e+02]
    ]], dtype=np.float32)
    
left_camera_matrix = np.array([
        [4.777926320579549042e+02, 0.000000000000000000e+00, 5.609694925007885331e+02],
        [0.000000000000000000e+00, 2.687583555325996372e+02, 5.712247987054799978e+02],
        [0.000000000000000000e+00, 0.000000000000000000e+00, 1.000000000000000000e+00]
    ])

    
left_distortion_coeffs = np.array([
        -8.332059138465927606e-02,
        -1.402986394998156472e+00,
        2.843132503678651168e-02, 
        7.633417606366312003e-02, 
        1.191317644548635979e+00
    ])

ret, left_camera_matrix, left_distortion_coeffs, rot, trans = cv2.calibrateCamera(left_obj, left_img, (1280, 720),
            left_camera_matrix, left_distortion_coeffs, None, None, cv2.CALIB_USE_INTRINSIC_GUESS)
print(rot[0])
print(trans[0])

I got different results:

[[ 2.7262137 ] [-0.19060341] [-0.30345874]]

[[-0.48068581] [ 0.75257108] [ 1.80413094]]

The same goes for the right camera:

[[ 2.1952522 ] [ 0.20281459] [-0.46649734]]

[[-2.96484428] [-0.0906817] [3.84203022]]

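You can roughly check the rotation as follows: compute the relative rotation between the computed results and compare it with the relative rotation between the real camera positions. For the translation: compute the normalized relative translation vector between the computed results and compare it with the normalized relative translation between the real camera positions. The coordinate system used by OpenCV is described here.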

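【Discussion】:

We are currently using that process: every point we take for the 3D projection has a corresponding point in both cameras. We basically synchronize the two videos and then pair up the 2D points that have the same frame number. After that, we apply triangulatePoints to our two 2D trajectories.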
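
Pairing by frame number, as described in the comment above, could look like the sketch below. The data layout (one dict per camera mapping frame number to an (x, y) pixel position) and the name `pair_by_frame` are assumptions for illustration; the asker's own `correspondingPoints` helper is not shown.

```python
import numpy as np


def pair_by_frame(track1, track2):
    """Keep only the frames in which both cameras detected the object.

    track1, track2: dicts mapping frame number -> (x, y) pixel coordinates.
    Returns (frames, pts1, pts2), where pts1[i] and pts2[i] come from frames[i].
    """
    frames = sorted(set(track1) & set(track2))
    pts1 = np.array([track1[f] for f in frames], dtype=np.float64).reshape(-1, 2)
    pts2 = np.array([track2[f] for f in frames], dtype=np.float64).reshape(-1, 2)
    return frames, pts1, pts2
```

Pairing by frame number only gives true correspondences if the two videos are synchronized to the frame; an offset of even one frame makes every pair describe two different object positions.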
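@Q.Eude To quote my other comment: "If that is not the case, could you attach the wrong result you get for a single pair of points (rather than the whole trajectory)". Please make a sparser trajectory (lower point density, fewer points) and draw the points in different colors (same moment in time: same color on both images; different moments: different colors). I cannot reproduce your results, so I need more detailed data to be able to help.
I made some changes; you can now find some input data and more explanation.
I did compute my intrinsic matrices using a chessboard for each camera. My image points and object points are used only for the extrinsic calibration.
How did you compute these? I tried using calibrateCamera and solvePnPRansac. Is there a way to check whether these matrices are correct?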