Matlab 和 Python 中 fmin 和 fminsearch 的结果差异答案

【问题标题】：Difference in result between fmin and fminsearch in Matlab and PythonMatlab 和 Python 中 fmin 和 fminsearch 的结果差异
【发布时间】：2019-12-08 02:37:25
【问题描述】：

我的目标是对一些衰减数据（通过 CPMG 的 NMR T2 衰减）执行拉普拉斯逆变换。为此，我们获得了 CONTIN 算法。这个算法是adapted to Matlab by Iari-Gabriel Marino，效果很好。我想将此代码改编成 Python。问题的核心在于scipy.optimize.fmin，它不会以任何类似于 Matlab 的fminsearch 的方式最小化均方偏差 (MSD)。后者产生了很好的最小化，而前者没有。

我已经逐行浏览了我在 Python 和原始 Matlab 中改编的代码。我检查了每个矩阵和每个输出。我用它来确定关键点在fmin。我还尝试了scipy.optimize.minimize 和其他最小化算法，但都没有给出令人满意的结果。

我为 Python 和 Matlab 制作了两个 MWE，以使其对所有人都可重现。示例数据来自 matlab 函数的文档。抱歉，如果这是长代码，但我真的不知道如何在不牺牲可读性和清晰度的情况下缩短它。我试图让线条尽可能地匹配。我在 Windows 8.1 上使用 Python 3.7.3、scipy v1.3.0、numpy 1.16.2、Matlab R2018b。这是一个相对较新的 Anaconda 安装（

我的代码：

import numpy as np
from scipy.optimize import fmin
import matplotlib.pyplot as plt

def msd(g, y, A, alpha, R, w, constraints):
    """ msd: mean square deviation. This is the function to be minimized by fmin"""
    if 'zero_at_extremes' in constraints:
        g[0] = 0
        g[-1] = 0
    if 'g>0' in constraints:
        g = np.abs(g)

    r = np.diff(g, axis=0, n=2)
    yfit = A @ g
    # Sum of weighted square residuals
    VAR = np.sum(w * (y - yfit) ** 2)
    # Regularizor
    REG = alpha ** 2 * np.sum((r - R @ g) ** 2)
    # output to be minimized
    return VAR + REG

# Objective: match this distribution
g0 = np.array([0, 0, 10.1625, 25.1974, 21.8711, 1.6377, 7.3895, 8.736, 1.4256, 0, 0]).reshape((-1, 1))
s0 = np.logspace(-3, 6, len(g0)).reshape((-1, 1))
t = np.linspace(0.01, 500, 100).reshape((-1, 1))
sM, tM = np.meshgrid(s0, t)
A = np.exp(-tM / sM)
np.random.seed(1)
# Creates data from the initial distribution with some random noise.
data = (A @ g0) + 0.07 * np.random.rand(t.size).reshape((-1, 1))

# Parameters and function start
alpha = 1E-2  # regularization parameter
s = np.logspace(-3, 6, 20).reshape((-1, 1)) # x of the ILT
g0 = np.ones(s.size).reshape((-1, 1))        # guess of y of ILT
y = data                                    # noisy data
options = {'maxiter':1e8, 'maxfun':1e8}     # for the fmin function
constraints=['g>0', 'zero_at_extremes']     # constraints for the MSD function
R=np.zeros((len(g0) - 2, len(g0)), order='F')  # Regularizor
w=np.ones(y.reshape(-1, 1).size).reshape((-1, 1)) # Weights
sM, tM = np.meshgrid(s, t, indexing='xy')
A = np.exp(-tM/sM)
g0 = g0 * y.sum() / (A @ g0).sum()  # Makes a "better guess" for the distribution, according to algorithm

print('msd of input data:\n', msd(g0, y, A, alpha, R, w, constraints))

for i in range(5):  # Just for testing. If this is extremely high, ~1000, it's still bad.
    g = fmin(func=msd,
             x0 = g0, 
             args=(y, A, alpha, R, w, constraints), 
             **options, 
             disp=True)[:, np.newaxis]
    msdfit = msd(g, y, A, alpha, R, w, constraints)
    if 'zero_at_extremes' in constraints:
            g[0] = 0
            g[-1] = 0
    if 'g>0' in constraints:
            g = np.abs(g)

    g0 = g

print('New guess', g)
print('Final msd of g', msdfit)

# Visualize the fit
plt.plot(s, g, label='Initial approximation')
plt.plot(np.logspace(-3, 6, 11), np.array([0, 0, 10.1625, 25.1974, 21.8711, 1.6377, 7.3895, 8.736, 1.4256, 0, 0]), label='Distribution to match')
plt.xscale('log')
plt.legend()
plt.show()

Matlab：

% Objective: match this distribution
g0 = [0 0 10.1625 25.1974 21.8711 1.6377 7.3895 8.736 1.4256 0 0]';
s0  = logspace(-3,6,length(g0))';
t = linspace(0.01,500,100)';
[sM,tM] = meshgrid(s0,t);
A = exp(-tM./sM);
rng(1);
% Creates data from the initial distribution with some random noise.
data = A*g0 + 0.07*rand(size(t));

% Parameters and function start
alpha = 1e-2; % regularization parameter
s = logspace(-3,6,20)'; % x of the ILT
g0 = ones(size(s)); % initial guess of y of ILT
y = data; % noisy data
options = optimset('MaxFunEvals',1e8,'MaxIter',1e8); % constraints for fminsearch
constraints = {'g>0','zero_at_the_extremes'}; % constraints for MSD
R = zeros(length(g0)-2,length(g0));
w = ones(size(y(:)));
[sM,tM] = meshgrid(s,t);
A = exp(-tM./sM);
g0 = g0*sum(y)/sum(A*g0); % Makes a "better guess" for the distribution

disp('msd of input data:')
disp(msd(g0, y, A, alpha, R, w, constraints))

for k = 1:5
    [g,msdfit] = fminsearch(@msd,g0,options,y,A,alpha,R,w,constraints);
    if ismember('zero_at_the_extremes',constraints)
        g(1) = 0;
        g(end) = 0;
    end
    if ismember('g>0',constraints)
        g = abs(g);
    end
    g0 = g;
end

disp('New guess')
disp(g)
disp('Final msd of g')
disp(msdfit)

% Visualize the fit
semilogx(s, g)
hold on
semilogx(logspace(-3,6,11), [0 0 10.1625 25.1974 21.8711 1.6377 7.3895 8.736 1.4256 0 0])
legend('First approximation', 'Distribution to match')
hold off
function out = msd(g,y,A,alpha,R,w,constraints)
% msd: The mean square deviation; this is the function
% that has to be minimized by fminsearch

% Constraints and any 'a priori' knowledge
if ismember('zero_at_the_extremes',constraints)
    g(1) = 0;
    g(end) = 0;
end
if ismember('g>0',constraints)
    g = abs(g); % must be g(i)>=0 for each i
end
r = diff(diff(g(1:end))); % second derivative of g
yfit = A*g;
% Sum of weighted square residuals
VAR = sum(w.*(y-yfit).^2);
% Regularizor
REG = alpha^2 * sum((r-R*g).^2);
% Output to be minimized
out = VAR+REG;
end

这是Python中的优化

这是Matlab中的优化

我在开始之前检查了g0的MSD的输出，都给出了2651的值。最小化后，Python上升到4547，Matlab下降到0.1381。

我认为问题是以下之一。它在我的实现中，即我使用了fmin 错误，或者我错了其他一些段落，但我不知道是什么。 MSD 在本应通过最小化函数减少时增加的事实是该死的。阅读文档，scipy 实现与 Matlab 的不同（它们使用 Lagarias 中描述的 Nelder Mead 方法，per their documentation），而 scipy 使用原始的 Nelder Mead）。也许这会影响很大？或者也许我最初的猜测对于 scipy 的算法来说太糟糕了？

【问题讨论】：

标签： python matlab scipy-optimize

【解决方案1】：

所以，自从我发布这篇文章以来已经有很长时间了，但我想分享我最终学习和做的事情。

CPMG 数据的拉普拉斯逆变换有点用词不当，它更恰当地称为逆变换。一般问题是求解第一类 Fredholm 积分。这样做的一种方法是 Tikhonov 正则化方法。事实证明，你可以很容易地使用 numpy 描述这个问题，并使用 scipy 包解决它，所以我不必用这个“重新发明”轮子。

我使用了in this post 显示的解决方案，这里的名称反映了该解决方案。

def tikhonov_regularized_inversion(
    kernel: np.ndarray, alpha: float, data: np.ndarray
) -> np.ndarray:
    data = data.reshape(-1, 1)
    I = alpha * np.eye(*kernel.shape)
    C = np.concatenate([kernel, I], axis=0)
    d = np.concatenate([data, np.zeros_like(data)])
    x, _ = nnls(C, d.flatten())

这里，kernel 是一个包含所有可能的指数衰减曲线的矩阵，我的解决方案判断每条衰减曲线在我收到的数据中的贡献。首先，我将数据堆叠为一列，然后用零填充，创建向量d。然后，我将内核堆叠在一个对角矩阵的顶部，该矩阵包含沿对角线的正则化参数alpha，与内核大小相同。最后，我称之为方便的nnls，scipy.optimize 中的非负最小二乘求解器。这是因为没有理由有负贡献，只有没有贡献。

这解决了我的问题，又快又方便。

【讨论】：