【Posted】: 2018-08-17 05:29:11
【Question】:
This is a bit of a long shot, but I'm wondering if someone could take a look at this. Am I doing batch gradient descent for linear regression correctly here? It gives the expected answer for a single independent variable plus an intercept, but not for multiple independent variables.
/**
 * Performs one batch gradient descent update for linear regression.
 * (using the Colt Matrix library)
 * @param alpha Learning rate
 * @param thetas Current thetas
 * @param independent Design matrix (one example per row)
 * @param dependent Target values
 * @return new thetas
 */
public DoubleMatrix1D descent(double alpha,
                              DoubleMatrix1D thetas,
                              DoubleMatrix2D independent,
                              DoubleMatrix1D dependent) {
    Algebra algebra = new Algebra();

    // ALPHA * (1/M) in one step.
    double modifier = alpha / (double) independent.rows();

    // I think this can just skip the transpose of theta.
    // This is the result of running every example Xi through the hypothesis fn:
    // each feature Xj is multiplied by its theta to get the hypothesis value.
    DoubleMatrix1D hypotheses = algebra.mult(independent, thetas);

    // hypothesis - Y
    // Now we have, for each Xi, the difference between the value predicted
    // by the hypothesis and the actual Yi.
    hypotheses.assign(dependent, Functions.minus);

    // Transpose the examples (MxN) to NxM so we can matrix-multiply by the
    // Mx1 residual vector, giving an Nx1 vector of deltas.
    DoubleMatrix2D transposed = algebra.transpose(independent);
    DoubleMatrix1D deltas = algebra.mult(transposed, hypotheses);

    // Scale the deltas by 1/m and the learning rate alpha (i.e. alpha/m).
    deltas.assign(Functions.mult(modifier));

    // Theta = Theta - Deltas
    thetas.assign(deltas, Functions.minus);
    return thetas;
}
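For what it's worth, here is a minimal sketch of how this method could be driven from a training loop, assuming the first column of the design matrix is all ones for the intercept. The Colt factory calls are standard, but the wrapper class name (GradientDescent), the sample data, the learning rate and the iteration count are illustrative assumptions, not part of the original question:

import cern.colt.matrix.DoubleFactory1D;
import cern.colt.matrix.DoubleFactory2D;
import cern.colt.matrix.DoubleMatrix1D;
import cern.colt.matrix.DoubleMatrix2D;

public class DescentDemo {
    public static void main(String[] args) {
        // Design matrix: first column is 1 for the intercept, second column is x1.
        DoubleMatrix2D x = DoubleFactory2D.dense.make(new double[][] {
            { 1, 1 },
            { 1, 2 },
            { 1, 3 },
            { 1, 4 }
        });
        // y = 10 + 2*x1 with no noise, so the thetas should converge towards [10, 2].
        DoubleMatrix1D y = DoubleFactory1D.dense.make(new double[] { 12, 14, 16, 18 });
        // Start from all-zero thetas (intercept + one slope).
        DoubleMatrix1D thetas = DoubleFactory1D.dense.make(new double[] { 0, 0 });

        GradientDescent gd = new GradientDescent(); // hypothetical class holding descent()
        for (int i = 0; i < 10000; i++) {           // fixed iteration count, purely illustrative
            thetas = gd.descent(0.01, thetas, x, y);
        }
        System.out.println(thetas);
    }
}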
【Comments】:
- As for the algorithm steps and the math, I can't see anything wrong. I'm not familiar with the Colt library, but I think the function names are expressive and the meaning is clear. I assume the first column of your independent matrix is a vector of ones to estimate the intercept. How do the values differ for multiple regression?
- The first column is ones for the intercept. I think the code may actually be correct and I'm running into collinearity in my test data. I created the test data so that I have x1 and x2, where x2 is just 2*x1. I set the dependent variable to y = .5*x1 + (1/3)*x2. It converges, but not to what I expected.
- For example, in the case above I get thetas of .6333 (x1) and .2666 (x2). It does correctly pick out whatever intercept I put into the function (e.g. y = .5*x1 + (1/3)*x2 + 10). If I run WEKA on the same dataset, it handles the collinearity automatically and just fits 1.1666 * x1.
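Based only on the test data described in the comments above (x2 = 2*x1 and y = .5*x1 + (1/3)*x2), the collinearity explanation does check out; a quick derivation:

y = theta0 + theta1*x1 + theta2*x2
  = theta0 + (theta1 + 2*theta2)*x1        (since x2 = 2*x1)

So only the combination theta1 + 2*theta2 is identified by this data, and any pair with theta1 + 2*theta2 = .5 + 2/3 ≈ 1.1667 fits it exactly. The reported result satisfies .6333 + 2*.2666 ≈ 1.1665, and WEKA's 1.1666*x1 (with x2 dropped) is just another point on that same line of solutions, so the descent is converging to a valid minimum; the minimum simply isn't unique.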
Tags: java machine-learning linear-regression