Surface plot for multivariate 5 degree polynomial regression in Python: answers

【Question title】: Surface plot for multivariate 5 degree polynomial regression in Python
【Posted】: 2017-12-11 04:14:41
【Question description】:

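I am implementing a paper in Python that was originally implemented in MATLAB. The paper says that a fifth-degree polynomial was found by curve fitting a set of sampled data points. I did not want to use their polynomial, so I took the sample data points (given in the paper) and tried to find a degree-5 polynomial using sklearn's PolynomialFeatures and linear_model. It is a multivariate function f(x,y), where x and y are the length and width of a pond and f is the initial concentration of the pollutant.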

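My problem is that sklearn's PolynomialFeatures transforms the test and training data points into n polynomial feature columns (as far as I understand). However, I need the clf.predict function (where clf is the trained model) to take only x and y values, because plotting a surface with Matplotlib requires a meshgrid. When I meshgrid my sklearn-transformed test points, the shape becomes NxN, whereas the predict function expects Nxn (where n is the number of polynomial features the data was transformed into and N is the number of rows).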
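
To make the shape mismatch concrete, here is a minimal sketch. The sample points and the names `pts`, `gx`, `gy`, and `grid_points` are made up for illustration and do not come from the paper. For two input variables, a degree-5 PolynomialFeatures expansion produces 21 columns (every monomial x^a*y^b with a+b <= 5, including the constant), so the model expects an array of shape (number of points, 21), while np.meshgrid returns two (N, N) arrays.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical sample points: 4 rows of (x, y)
pts = np.array([[0.0, 0.0], [1.0, 2.0], [3.0, 1.0], [2.0, 2.0]])
poly = PolynomialFeatures(degree=5)
expanded = poly.fit_transform(pts)
print(expanded.shape)  # (4, 21)

# meshgrid returns two 2-D coordinate arrays, not a list of (x, y) rows
gx, gy = np.meshgrid(np.linspace(0, 3, 5), np.linspace(0, 2, 5))
print(gx.shape)  # (5, 5)

# Flatten the grid into one (x, y) row per grid node before expanding it
grid_points = np.column_stack([gx.ravel(), gy.ravel()])
print(grid_points.shape)  # (25, 2)
print(poly.transform(grid_points).shape)  # (25, 21)
```

The expansion should therefore be applied to the flattened (x, y) pairs, not to the meshgrid arrays themselves.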

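Is there any way to plot the meshgrid points for this polynomial?

Paper link: http://www.ajer.org/papers/v5(11)/A05110105.pdf Paper title: Mathematical Model of Biological Oxygen Demand in Facultative Wastewater Stabilization Pond Based on Two-Dimensional Advection-Dispersion Model

If possible, please look at Figures 5 and 6 in the paper (linked above). That is what I am trying to achieve.

Thank you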

from math import exp
import numpy as np
from operator import itemgetter
from sklearn.preprocessing import PolynomialFeatures
from sklearn import linear_model
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
from matplotlib import cm
from matplotlib.ticker import LinearLocator, FormatStrFormatter

fig = plt.figure()
ax = fig.gca(projection='3d')


def model_BOD (cn):
    cnp1 = []
    n = len(cn)
    # variables:
    dmx = 1e-5
    dmy = 1e-5
    u = 2.10e-4
    v = 2.10e-4
    obs_time = 100
    dt = 0.1

    for t in np.arange(0.1,obs_time,dt):
        for i in range(N):
            for j in range(N):
                d = j + (i-1)*N
                dxp1 = d  + N
                dyp1 = d + 1
                dxm1 = d - N
                dym1 = d - 1

                cnp1.append(t*(((-2*dmx/dx**2)+(-2*dmy/dy**2)+(1/t))*cn[dxp1] + (dmx/dx**2)*cn[dyp1] \
                                + (dmy/dy**2)*cn[dym1] - (u/(2*dx))*cn[dxp1] + (u/(2*dx))*cn[dxm1] \
                                - (v/(2*dy))*cn[dyp1] + (v/(2*dy))*cn[dym1]))
        cn = cnp1
        cnp1 = []
    return cn

N = 20
Length = 70
Width = 77
dx = Length/N
dy = Width/N

deg_of_poly = 5

datapoints = np.array([
    [12.5,70,81.32],[25,70,88.54],[37.5,70,67.58],[50,70,55.32],[62.5,70,56.84],[77,70,49.52],
    [0,11.5,71.32],[77,57.5,67.20],
    [0,23,58.54],[25,46,51.32],[37.5,46,49.52],
    [0,34.5,63.22],[25,34.5,48.32],[37.5,34.5,82.30],[50,34.5,56.42],[77,34.5,48.32],[37.5,23,67.32],
    [0,46,64.20],[77,11.5,41.89],[77,46,55.54],[77,23,52.22],
    [0,57.5,93.72],
    [0,70,98.20],[77,0,42.32]
    ])

X = datapoints[:,0:2]
Y = datapoints[:,-1]

predict_x = []
predict_y = []
for i in np.linspace(0,Width,N):
    for j in np.linspace(0,Length,N):
        predict_x.append([i,j])

predict = np.array(predict_x)

poly = PolynomialFeatures(degree=deg_of_poly)
X_ = poly.fit_transform(X)

predict_ = poly.fit_transform(predict)
clf = linear_model.LinearRegression()
clf.fit(X_, Y)
prediction = []

for k,i in enumerate(predict_):
    prediction.append(clf.predict(np.array([i]))[0])

prediction_ = model_BOD(prediction)
print prediction_
XX = []
XX = predict[:,0]
YY = []
YY = predict[:,-1]
XX,YY = np.meshgrid(X,Y)
Z = prediction
##R = np.sqrt(XX**2+YY**2)
##Z = np.tan(R)

surf = ax.plot_surface(XX,YY,Z)
plt.show()

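【Question discussion】:

Tags: python matplotlib scikit-learn non-linear-regression numerical-analysis

【Solution 1】:

If I understand correctly, the key logic here is to generate polynomial features from your meshgrid, make predictions, and plot the predictions using the original meshgrid. Hopefully the following meets your needs: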

import matplotlib.pyplot as plt
from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn import linear_model

# The training set
datapoints = np.array([
    [12.5,70,81.32], [25,70,88.54], [37.5,70,67.58], [50,70,55.32], 
    [62.5,70,56.84], [77,70,49.52], [0,11.5,71.32], [77,57.5,67.20], 
    [0,23,58.54], [25,46,51.32], [37.5,46,49.52], [0,34.5,63.22], 
    [25,34.5,48.32], [37.5,34.5,82.30], [50,34.5,56.42], [77,34.5,48.32], 
    [37.5,23,67.32], [0,46,64.20], [77,11.5,41.89], [77,46,55.54], 
    [77,23,52.22], [0,57.5,93.72], [0,70,98.20], [77,0,42.32]
    ])
X = datapoints[:,0:2]
Y = datapoints[:,-1]
# 5 degree polynomial features
deg_of_poly = 5
poly = PolynomialFeatures(degree=deg_of_poly)
X_ = poly.fit_transform(X)
# Fit linear model
clf = linear_model.LinearRegression()
clf.fit(X_, Y)

# The test set, or plotting set
N = 20
Length = 70
predict_x0, predict_x1 = np.meshgrid(np.linspace(0, Length, N), 
                                     np.linspace(0, Length, N))
predict_x = np.concatenate((predict_x0.reshape(-1, 1), 
                            predict_x1.reshape(-1, 1)), 
                           axis=1)
predict_x_ = poly.transform(predict_x)  # reuse the expansion already fitted on X
predict_y = clf.predict(predict_x_)

# Plot
fig = plt.figure(figsize=(16, 6))
ax1 = fig.add_subplot(121, projection='3d')
surf = ax1.plot_surface(predict_x0, predict_x1, predict_y.reshape(predict_x0.shape), 
                        rstride=1, cstride=1, cmap=cm.jet, alpha=0.5)
ax1.scatter(datapoints[:, 0], datapoints[:, 1], datapoints[:, 2], c='b', marker='o')

ax1.set_xlim((70, 0))
ax1.set_ylim((0, 70))
fig.colorbar(surf, ax=ax1)
ax2 = fig.add_subplot(122)
cs = ax2.contourf(predict_x0, predict_x1, predict_y.reshape(predict_x0.shape))
ax2.contour(cs, colors='k')
fig.colorbar(cs, ax=ax2)
plt.show()
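
The flatten, predict, and reshape step used above can be isolated in a small helper. This is only a sketch: `predict_on_grid` is a hypothetical name, and the training data here is synthetic (a known plane, not the paper's samples) so that the result can be checked against the exact function. Note also that the question defines Width = 77 and the first coordinate of the sample points runs up to 77, so passing different ranges for the two axes may be closer to the pond's actual extent. That is an assumption, not something stated in the answer.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures


def predict_on_grid(poly, model, x_values, y_values):
    """Evaluate a fitted polynomial model on the grid spanned by x_values and y_values."""
    grid_x, grid_y = np.meshgrid(x_values, y_values)
    # Flatten both grids into an (n_points, 2) array of (x, y) pairs
    flat_points = np.column_stack([grid_x.ravel(), grid_y.ravel()])
    # transform (not fit_transform): reuse the expansion fitted on the training data
    flat_z = model.predict(poly.transform(flat_points))
    # Restore the 2-D layout that plot_surface and contourf expect
    return grid_x, grid_y, flat_z.reshape(grid_x.shape)


# Synthetic check data (not from the paper): z = 1 + 2x + 3y
rng = np.random.default_rng(0)
train_xy = rng.uniform(0, 10, size=(30, 2))
train_z = 1 + 2 * train_xy[:, 0] + 3 * train_xy[:, 1]

poly = PolynomialFeatures(degree=2)
model = LinearRegression().fit(poly.fit_transform(train_xy), train_z)

grid_x, grid_y, grid_z = predict_on_grid(
    poly, model, np.linspace(0, 10, 5), np.linspace(0, 10, 4)
)
print(grid_z.shape)  # (4, 5), the same shape as grid_x and grid_y
```

Because the reshape uses the same shape as the meshgrid output, each entry of `grid_z` lines up with the (x, y) pair at the same position in `grid_x` and `grid_y`.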

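【Comments】:

Thanks Luo, this seems right. I believe I was not reshaping the matrix correctly. Thanks again. However, this is actually not related to the question at all: I seem to be getting very different results compared to the paper. Could you tell me how to interpret the coefficients returned by the trained model (clf)? I mean something like a*x^2*y^1 + ... and so on. I know the transformation converts the data points into the related coefficients, but I am unable to interpret the equation.
@arshh I'm sorry that my answer is not related to your question at all. You can check the coefficients with clf.coef_ and the corresponding features with poly.get_feature_names(). Then you should be able to work out something like "a*x^2*y^1+ ... etc." for this particular fit. As for why it differs from the paper and how to interpret the equation, again, I'm sorry, I don't know.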
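@arshh My intention was to answer your question "Is there any possible way to plot the meshgrid points for this polynomial?" and to try to give you something close to "Figures 5 and 6". I am not sure whether I failed to understand your real question. I suggest posting another question, perhaps making it clearer what your problem is...
@Luo, Lou, you misunderstood me. When I said "however this is actually not related to the question at all", I meant that what I was about to ask you (about the coefficients) was unrelated to the question I posted.
You actually gave me the answer. I was asking another, off-topic question, and I was careful to say that it was unrelated to the original question. Thanks again :)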