Trying to implement linear regression in Python: answers

【Question title】: Trying to implement linear regression in python
【Posted】: 2014-11-01 02:06:49
【Question description】:

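I am implementing linear regression in Python. I think I am doing something wrong when converting the matrix to a numpy array, but I can't figure out what. Any help would be appreciated.

I am loading the data from a csv file with 100 columns. y is the last column. I am not using columns 1 and 2 for the regression.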

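import numpy as np

# Load the csv; the first row is skipped below via communities[1:, ...]
communities = np.genfromtxt("communities.csv", delimiter=",", dtype=float)
xdata = communities[1:, 2:99]  # feature columns; the first two columns are left out
x = np.array([np.concatenate((v, [1])) for v in xdata])  # append a constant 1 for the intercept
y = communities[1:, 99]  # last column is the target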

Function definition

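from numpy import mat, linalg

def standRegress(xArr, yArr):
    xMat = mat(xArr); yMat = mat(yArr).T
    xTx = xMat.T*xMat
    # xTx must be invertible for the normal equations to have a unique solution
    if linalg.det(xTx) == 0.0:
        print("singular matrix")
        return
    ws = xTx.I*(xMat.T*yMat)
    return ws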

Calling the function

w = standRegress(x,y)
xMat = mat(x) #shape(1994L,98L)
yMat = mat(y) #shape (1L, 1994L)
yhat = xMat*w #shape (1994L, 1L)

Next I try to calculate the RMSE, and this is where I run into trouble

yMatT = yMat.T #shape(1994L, 1L)
err = yhat - yMatT #shape(1994L, 1L)
error = np.array(err)
total_error = np.dot(error,error)
rmse = np.sqrt(total_error/len(y))

I get an error when taking the dot product, so I cannot calculate the rmse. I would appreciate it if someone could help me find my mistake.

Error: 
 ---> 11 np.dot(error,error)
 12 #test = (error)**2
 13 #test.sum()/len(y)
 ValueError: matrices are not aligned

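【Question comments】:

Can you edit your question and include the specific error message you are getting?
Since you are using numpy, I was just wondering whether there is any particular reason you are not using linalg?
@Anzel, I hadn't thought of using linalg. Could you please guide me on how to use it?
@nasiajaffri, take a look at this numpy doc
@Michael0x2a, I have edited the question. Please take a look now.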
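
The linalg suggestion in the comments above could look roughly like the following. This is only a sketch under stated assumptions: the data is synthetic (random numbers standing in for communities.csv), and the names fit_least_squares, features and true_w are made up for illustration. np.linalg.lstsq solves the least-squares problem directly instead of inverting xTx, so it does not need the singular-matrix check.

```python
import numpy as np

def fit_least_squares(x, y):
    """Return least-squares weights w for y ~ x.dot(w) (hypothetical helper)."""
    # rcond=None selects NumPy's current default cutoff for small singular values
    w, residuals, rank, singular_values = np.linalg.lstsq(x, y, rcond=None)
    return w

# Synthetic stand-in for the real data: 50 rows, 3 features plus a bias column
rng = np.random.default_rng(0)
features = rng.normal(size=(50, 3))
x = np.hstack([features, np.ones((50, 1))])
true_w = np.array([2.0, -1.0, 0.5, 3.0])
y = x.dot(true_w)  # noise-free, so the fit should recover true_w

w = fit_least_squares(x, y)
print(w.shape)
```

Because y is passed as a 1-D array here, w also comes back 1-D, which avoids the row-versus-column bookkeeping that mat requires.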

Tags: python regression

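【Solution 1】:

I'm not quite sure what the last dot is supposed to do, but you can't multiply error with itself this way. dot performs matrix multiplication, so the dimensions must align.

For example, consider the following: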

import numpy as np
A = np.ones((3, 4))
B = np.ones((3, 4))
print(np.dot(A, B))

This raises ValueError: matrices are not aligned.

What does work, however, is:

print(np.dot(A.T, B))

Output:

[[ 3.  3.  3.  3.]
 [ 3.  3.  3.  3.]
 [ 3.  3.  3.  3.]
 [ 3.  3.  3.  3.]]

In your example, error is just a column vector, but it is stored as a 2D array:

A = np.ones((3, 1))
B = np.ones((3, 1))
print(np.dot(A, B))

This gives the same error.

So you can either transpose one argument, as shown above, or extract the column as a 1D array:

print(np.dot(A[:, 0], B[:, 0]))

Output:

3.0
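
Applied to the RMSE calculation in the question, either fix might look like the sketch below. The arrays yhat and y_true are small made-up values rather than the asker's data, and rmse_transpose and rmse_flat are illustrative names.

```python
import numpy as np

# Made-up predictions and targets stored as (4, 1) column vectors,
# the same layout as yhat and yMat.T in the question
yhat = np.array([[1.0], [2.0], [3.0], [4.0]])
y_true = np.array([[1.0], [2.0], [3.0], [6.0]])

error = yhat - y_true  # shape (4, 1)

# Option 1: transpose one argument, (1, 4) dot (4, 1) gives a (1, 1) array
total_error = np.dot(error.T, error)[0, 0]
rmse_transpose = np.sqrt(total_error / len(error))

# Option 2: flatten to 1-D, so dot is an ordinary inner product
flat = error.ravel()
rmse_flat = np.sqrt(np.dot(flat, flat) / flat.size)

print(rmse_transpose, rmse_flat)
```

Only the last pair differs (by 2), so the sum of squared errors is 4, the mean is 1, and both variants give an RMSE of 1.0.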

【Comments】:

Yes, you are right, but err should have 1994 rows and only 1 column. I'm not sure what I did wrong before the dot product.
@nasiajaffri: Oh, I see. I edited my answer accordingly.
Error: matrices are not aligned
Also, from np.info(np.dot): ...Raises ------ ValueError If the last dimension of `a` is not the same size as the second-to-last dimension of `b`....
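
The rule quoted from np.info(np.dot) can be checked against the array shapes before calling dot. A small sketch, where dot_compatible is a hypothetical helper written for this illustration and only covers 2-D inputs:

```python
import numpy as np

def dot_compatible(a, b):
    """True if 2-D arrays a and b can be passed to np.dot (hypothetical helper)."""
    # Last dimension of a must equal the second-to-last dimension of b
    return a.shape[-1] == b.shape[-2]

col = np.ones((1994, 1))  # same shape as err in the question

print(dot_compatible(col, col))    # (1994, 1) with (1994, 1)
print(dot_compatible(col.T, col))  # (1, 1994) with (1994, 1)
```

The first check fails because 1 does not equal 1994, which is the situation in the question. The second passes, and np.dot(col.T, col) returns a (1, 1) array.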