Numpy 的 'linalg.solve' 和 'linalg.lstsq' 给出的答案与 Matlab 的 '\' 或 mldivide 不同答案

【问题标题】：Numpy's 'linalg.solve' and 'linalg.lstsq' not giving same answer as Matlab's '\' or mldivideNumpy 的 'linalg.solve' 和 'linalg.lstsq' 给出的答案与 Matlab 的 '\' 或 mldivide 不同
【发布时间】：2014-09-20 00:35:48
【问题描述】：

我正在尝试在 Python 上实现最小二乘曲线拟合算法，已经在 Matlab 上编写了它。但是，我无法获得正确的变换矩阵，并且问题似乎发生在求解步骤中。（编辑：我的变换矩阵在 Matlab 中非常准确，但在 Python 中完全不适用。）

我查看了许多在线资源，它们都表明要翻译 Matlab 的“mldivide”，如果矩阵是方形且非奇异的，则必须使用“np.linalg.solve”，以及“np.linalg.lstsq” ' 否则。然而我的结果不匹配。

有什么问题？如果和函数的实现有关，那么在numpy中mldivide的正确翻译是什么？

我附上了下面两个版本的代码。它们本质上是完全相同的实现，除了求解部分。

Matlab 代码：

%% Least Squares Fit

clear, clc, close all

% Calibration Data
scr_calib_pts = [0,0; 900,900; -900,900; 900,-900; -900,-900];
cam_calib_pts = [-1,-1; 44,44; -46,44; 44,-46; -46,-46];
cam_x = cam_calib_pts(:,1);
cam_y = cam_calib_pts(:,2);

% Least Squares Fitting
A_matrix = [];
for i = 1:length(cam_x)
    A_matrix = [A_matrix;1, cam_x(i), cam_y(i), ...
        cam_x(i)*cam_y(i), cam_x(i)^2, cam_y(i)^2];
end
A_star = A_matrix'*A_matrix
B_star = A_matrix'*scr_calib_pts
transform_matrix = mldivide(A_star,B_star)

% Testing Data
test_scr_vec = [200,400; 1600,400; -1520,1740; 1300,-1800; -20,-1600];
test_cam_vec = [10,20; 80,20; -76,87; 65,-90; -1,-80];
test_cam_x = test_cam_vec(:,1);
test_cam_y = test_cam_vec(:,2);

% Coefficients for Transform
coefficients = [];
for i = 1:length(test_cam_x)
    coefficients = [coefficients;1, test_cam_x(i), test_cam_y(i), ...
        test_cam_x(i)*test_cam_y(i), test_cam_x(i)^2, test_cam_y(i)^2];
end

% Mapped Points
results = coefficients*transform_matrix;

% Plotting
test_scr_x = test_scr_vec(:,1)';
test_scr_y = test_scr_vec(:,2)';
results_x = results(:,1)';
results_y = results(:,2)';

figure
hold on
load seamount
s = 50;
scatter(test_scr_x, test_scr_y, s, 'r')
scatter(results_x, results_y, s)

Python 代码：

# Least Squares fit

import numpy as np
import matplotlib.pyplot as plt

# Calibration data
camera_vectors = np.array([[-1,-1], [44,44], [-46,44], [44,-46], [-46,-46]])
screen_vectors = np.array([[0,0], [900,900], [-900,900], [900,-900], [-900,-900]])

# Separate axes
cam_x = np.array([i[0] for i in camera_vectors])
cam_y = np.array([i[1] for i in camera_vectors])

# Initiate least squares implementation
A_matrix = []
for i in range(len(cam_x)):
    new_row = [1, cam_x[i], cam_y[i], \
        cam_x[i]*cam_y[i], cam_x[i]**2, cam_y[i]**2]
    A_matrix.append(new_row)
A_matrix = np.array(A_matrix)
A_star = np.transpose(A_matrix).dot(A_matrix)

B_star = np.transpose(A_matrix).dot(screen_vectors)

print A_star
print B_star

try:
    # Solve version (Implemented)
    transform_matrix = np.linalg.solve(A_star,B_star)
    print "Solve version"
    print transform_matrix
except:
    # Least squares version (implemented)
    transform_matrix = np.linalg.lstsq(A_star,B_star)[0]
    print "Least Squares Version"
    print transform_matrix


# Test data
test_cam_vec = np.array([[10,20], [80,20], [-76,87], [65,-90], [-1,-80]])
test_scr_vec = np.array([[200,400], [1600,400], [-1520,1740], [1300,-1800], [-20,-1600]])

# Coefficients of quadratic equation
test_cam_x = np.array([i[0] for i in test_cam_vec])
test_cam_y = np.array([i[1] for i in test_cam_vec])    
coefficients = []
for i in range(len(test_cam_x)):
    new_row = [1, test_cam_x[i], test_cam_y[i], \
        test_cam_x[i]*test_cam_y[i], test_cam_x[i]**2, test_cam_y[i]**2]
    coefficients.append(new_row)
coefficients = np.array(coefficients)

# Transform camera coordinates to screen coordinates
results = coefficients.dot(transform_matrix)

# Plot points
results_x = [i[0] for i in results]
results_y = [i[1] for i in results]
actual_x = [i[0] for i in test_scr_vec]
actual_y = [i[1] for i in test_scr_vec]

plt.plot(results_x, results_y, 'gs', actual_x, actual_y, 'ro')
plt.show()

编辑（根据建议）：

# Transform matrix with linalg.solve

[[  2.00000000e+01   2.00000000e+01]
 [ -5.32857143e+01   7.31428571e+01]
 [  7.32857143e+01  -5.31428571e+01]
 [ -1.15404203e-17   9.76497106e-18]
 [ -3.66428571e+01   3.65714286e+01]
 [  3.66428571e+01  -3.65714286e+01]]

# Transform matrix with linalg.lstsq:

[[  2.00000000e+01   2.00000000e+01]
 [  1.20000000e+01   8.00000000e+00]
 [  8.00000000e+00   1.20000000e+01]
 [  1.79196935e-15   2.33146835e-15]
 [ -4.00000000e+00   4.00000000e+00]
 [  4.00000000e+00  -4.00000000e+00]]

% Transform matrix with mldivide:

   20.0000   20.0000
   19.9998    0.0002
    0.0002   19.9998
         0         0
   -0.0001    0.0001
    0.0001   -0.0001

【问题讨论】：

如果您打印（并在问题中显示）输入 A_star 和 B_star 以及结果 transform_matrix，这将有助于 matlab 和 python 代码。
已添加。 A_stars 和 B_stars 都是相同的

标签： python matlab numpy

【解决方案1】：

有趣的是，np.linalg.lstsq 和 np.linalg.solve 会得到完全不同的结果。

x1 = np.linalg.lstsq(A_star, B_star)[0]
x2 = np.linalg.solve(A_star, B_star)

两者都应该提供方程 Ax = B 的解。但是，它们给出了两个完全不同的数组：

In [37]: x1    
Out[37]: 
array([[  2.00000000e+01,   2.00000000e+01],
       [  1.20000000e+01,   7.99999999e+00],
       [  8.00000001e+00,   1.20000000e+01],
       [ -1.15359111e-15,   7.94503352e-16],
       [ -4.00000001e+00,   3.99999999e+00],
       [  4.00000001e+00,  -3.99999999e+00]]


In [39]: x2
Out[39]: 
array([[  2.00000000e+01,   2.00000000e+01],
       [ -4.42857143e+00,   2.43809524e+01],
       [  2.44285714e+01,  -4.38095238e+00],
       [ -2.88620104e-18,   1.33158696e-18],
       [ -1.22142857e+01,   1.21904762e+01],
       [  1.22142857e+01,  -1.21904762e+01]])

两者都应该给出线性方程组的准确（直到计算精度）解，并且对于非奇异矩阵，只有一个解。

那么一定有什么地方出错了。让我们看看这两个候选者是否可以成为原始方程的解：

In [41]: A_star.dot(x1)
Out[41]: 
array([[ -1.11249392e-08,   9.86256055e-09],
       [  1.62000000e+05,  -1.65891834e-09],
       [  0.00000000e+00,   1.62000000e+05],
       [ -1.62000000e+05,  -1.62000000e+05],
       [ -3.24000000e+05,   4.47034836e-08],
       [  5.21540642e-08,  -3.24000000e+05]])

In [42]: A_star.dot(x2)
Out[42]: 
array([[ -1.45519152e-11,   1.45519152e-11],
       [  1.62000000e+05,  -1.45519152e-11],
       [  0.00000000e+00,   1.62000000e+05],
       [ -1.62000000e+05,  -1.62000000e+05],
       [ -3.24000000e+05,   0.00000000e+00],
       [  2.98023224e-08,  -3.24000000e+05]])

他们似乎给出了相同的解决方案，这与B_star 应该是相同的。这将我们引向解释。使用简单的线性代数，我们可以预测 A 。 (x1-x2) 应该非常接近于零：

In [43]: A_star.dot(x1-x2)
Out[43]: 
array([[ -1.11176632e-08,   9.85164661e-09],
       [ -1.06228981e-09,  -1.60071068e-09],
       [  0.00000000e+00,  -2.03726813e-10],
       [ -6.72298484e-09,   4.94765118e-09],
       [  5.96046448e-08,   5.96046448e-08],
       [  2.98023224e-08,   5.96046448e-08]])

确实如此。所以，方程 Ax = 0 似乎有一个非平凡解，解是 x = x1-x2，这意味着矩阵是奇异的，因此 Ax = B 有无数种不同的解。

因此问题不在 NumPy 或 Matlab 中，而在矩阵本身。

但是，在这个矩阵的情况下，情况有点棘手。 A_star 根据上面的定义似乎是单数的（对于 x0，Ax=0）。另一方面，它的行列式是非零的，也不是奇异的。

在这种情况下，A_star 是一个数值不稳定但不是奇异矩阵的示例。 solve 方法通过使用简单的逆乘法来解决它。在这种情况下，这是一个糟糕的选择，因为逆包含非常大和非常小的值。这使得乘法容易出现舍入误差。这可以通过查看矩阵的条件数看出：

In [65]: cond(A_star)
Out[65]: 1.3817810855559592e+17

这是一个很高的条件数，矩阵是病态的。

在这种情况下，使用逆来解决问题是一种不好的方法。如您所见，最小二乘法可以提供更好的结果。但是，更好的解决方案是重新调整输入值，使 x 和 x^2 在同一范围内。一种非常好的缩放比例是在 -1 和 1 之间缩放。

您可能会考虑的一件事是尝试使用 NumPy 的索引功能。例如：

cam_x = np.array([i[0] for i in camera_vectors])

相当于：

cam_x = camera_vectors[:,0]

你可以这样构造你的数组A：

A_matrix = np.column_stack((np.ones_like(cam_x), cam_x, cam_y, cam_x*cam_y, cam_x**2, cam_y**2))

无需创建列表列表或任何循环。

【讨论】：

但是 Matlab 是如何给出如此准确的答案的呢？我正在测试很多点，而且都非常准确。
我在调用矩阵单数时有点仓促。它不是看起来像一个。这只是一个病态的，请参阅我的更新答案。 lstsq 是否给了您期望的答案（即我的答案中的 x1）？
哇，缩小实际上让一切变得更好。为什么会这样？ THHAANNKKSS :)
按比例缩小可确保加法和减法具有合理的准确性。如果你加上 x+x^2，如果你有 1+1^2 或 1e-6+1e-12，它会产生很大的数值差异。在后一种情况下，有人只是偷了你 6 位数（或 20 位）的动态范围。线性代数计算有很多减法和加法，所以这对舍入误差有很大的影响。
你确定这一点：solve 方法通过使用简单的乘法来解决它。 ?

【解决方案2】：

矩阵 A_matrix 是一个 6 x 5 矩阵，所以 A_star 是一个奇异矩阵。结果没有唯一解，两个程序的结果都是正确的。这对应于原始问题未确定，而不是过度确定。

【讨论】：