使用 scipy optimize 进行优化控制答案

【问题标题】：optimal control using scipy optimize使用 scipy optimize 进行优化控制
【发布时间】：2019-10-31 07:50:12
【问题描述】：

我正在尝试解决一个最优控制问题，其中成本函数为 J = x^T Q x + u^T R u 取决于 x_dot = A x + B u 以及 x 和 u 的界限。我知道有一些求解器，如 cvxpy、yalimp 等，可以做到这一点，但我想自己做，以便对编码和将来可能添加的一些其他参数有更好的了解。我附上了我写的代码。它运行但对所有时间步返回相同的值。我已将 x 和 u 堆叠为单个向量。我不知道这是否是正确的方法。我认为可以以更好/有效的方式编写代码。欢迎所有建议，非常感谢您提前提供任何帮助

灰

import numpy as np
import sympy as sp
import scipy.optimize as opt
import matplotlib.pyplot as plt

# Optimal Control Problem
# Cost, J = x.transpose() * Q * x + u.transpose() * R * u
# x_dot = A*x + B*u
# x_min < x < x_max
# u_min < x < u_max


class mpc_opt():

    def __init__(self):
        self.Q = sp.diag(0.5, 1, 0)  # state penalty matrix, Q
        self.R = sp.eye(2) # input penalty matrix
        self.A = sp.Matrix([[-0.79, -0.3, -0.1],[0.5, 0.82, 1.23], [0.52, -0.3, -0.5]])  # state matrix 
        self.B = sp.Matrix([[-2.04, -0.21], [-1.28, 2.75], [0.29, -1.41]])  # input matrix

        self.t = np.linspace(0, 1, 30)


    # reference trajectory  ## static!!!
    def ref_trajectory(self, i):  # y = 3*sin(2*pi*omega*t)
        # y = 3 * np.sin(2*np.pi*self.omega*self.t[i])
        x_ref = sp.Matrix([0, 1, 0])
        return x_ref
        # return sp.Matrix(([[self.t[i]], [y], [0]]))

    def cost_function(self, U, *args):
        t = args
        nx, nu = self.A.shape[-1], self.B.shape[-1]
        x0 = U[0:nx]
        u = U[nx:nx+nu]
        u = u.reshape(len(u), -1)
        x0 = x0.reshape(len(x0), -1)
        x1 = self.A * x0 + self.B * u
        # q = [x1[0], x1[1]]
        # pos = self.end_effec_pose(q)
        traj_ref = self.ref_trajectory(t)
        pos_error = x1 - traj_ref
        cost = pos_error.transpose() * self.Q * pos_error + u.transpose() * self.R * u
        return cost

    def cost_gradient(self, U, *args):
        t = args
        nx, nu = self.A.shape[-1], self.B.shape[-1]
        x0 = U[0:nx]
        u = U[nx:nx + nu]
        u = u.reshape(len(u), -1)
        x0 = x0.reshape(len(x0), -1)
        x1 = self.A * x0 + self.B * u
        traj_ref = self.ref_trajectory(t)
        pos_error = x1 - traj_ref
        temp1 = self.Q * pos_error
        cost_gradient = temp1.col_join(self.R * u)
        return cost_gradient


    def optimise(self, u0, t):
        umin = [-2., -3.]
        umax = [2., 3.]
        xmin = [-10., -9., -8.]
        xmax = [10., 9., 8.]
        bounds = ((xmin[0], xmax[0]), (xmin[1], xmax[1]), (xmin[2], xmax[2]), (umin[0], umax[0]), (umin[1], umax[1]))

        U = opt.minimize(self.cost_function, u0, args=(t), method='SLSQP', bounds=bounds, jac=self.cost_gradient,
                         options={'maxiter': 200, 'disp': True})
        U = U.x
        return U


if __name__ == '__main__':
    mpc = mpc_opt()
    x0, u0, = sp.Matrix([[0.1], [0.02], [0.05]]), sp.Matrix([[0.4], [0.2]])
    X, U = sp.zeros(len(x0), len(mpc.t)), sp.zeros(len(u0), len(mpc.t))
    U0 = sp.Matrix([x0, u0])
    nx, nu = mpc.A.shape[-1], mpc.B.shape[-1]
    for i in range(len(mpc.t)):
        print('i = :', i)
        result = mpc.optimise(U0, i)
        x0 = result[0:nx]
        u = result[nx:nx + nu]
        u = u.reshape(len(u), -1)
        x0 = x0.reshape(len(x0), -1)
        U[:, i], X[:, i] = u0, x0
        # x0 = mpc.A * x0 + mpc.B * u
        U0 = result

plt.plot(X[0, :], '--r')
plt.plot(X[1, :], '--b')
plt.plot(X[2, :], '*r')
plt.show()

【问题讨论】：

请确保您的示例代码在发布之前编译并运行。此示例引发了一个 IndexError: SLSQP Error: the length of bounds is not compatible with that of x0.，因为您的 bounds 变量缺少一个元素。
对此感到抱歉。我刚刚编辑了代码，它运行了。
完美。您能否解释一下：x_dot 是什么？是x的梯度吗？
x_dot 是 x 的 d/dt。（时间导数）。离散形式可以写成x(k+1) = A * x(k) + B * u(k)，其中k是采样时间

标签： python python-3.x scipy scipy-optimize scipy-optimize-minimize

【解决方案1】：

在process dynamics and control page for Model Predictive Control 上发布了一个使用Scipy.optimize.minimize 的类似MPC 应用程序（选择Show Python MPC）。它使用也可以以状态空间形式表示的一阶线性系统。这是动画的屏幕截图：

尽管这是我的课程网站，但此应用程序的功劳归于Junho Park。 linear MPC that uses MATLAB and Python Gekko 还有另一个教程。在您开发应用程序时，此源代码可能会对您有所帮助。

【讨论】：

【解决方案2】：

我注意到的几点：

1) 为什么使用sympy 矩阵？在我看来，numpy.matrix 或 numpy.array 在你的情况下就足够了。 sympy 通常在您想要使用符号表达式（例如符号微分）时使用。 注意：使用@ 运算符进行矩阵乘法，例如在 cost_gradient 中，您将使用 temp1 = self.Q @ pos_error

2) 您可以使用staticmethod decorator 将类方法表示为静态

3) 为什么要使用SLSQP的优化方法？它用于约束优化（不适用于您的问题）。如果我们在the notes 中查看scipy.optimize.minimize，我们可以看到有许多算法用于无约束优化（有界），它们可能会收敛得更快。在大多数情况下，您可以不指定任何内容，让算法自动决定。

4) 将绘图命令放在main 方法中，而不是放在外面

否则看起来没问题。不过我还没有测试过这个功能。

【讨论】：

感谢您查看此内容。 1) sympy 矩阵：我的原始作品中有非线性符号表达式。这里我刚刚给出了一个线性时不变模型。我同意 numpy 将是一个更好的选择。 3）SLSQP：再次用于包括未来的约束。我的主要困惑在于方程 x_dot = Ax + Bu 指定的系统动力学规范。上面的代码是跟踪x_ref中给出的引用状态，但不会。