[Posted]: 2014-08-26 18:28:27
[Problem description]:
I have completed several of Professor Andrew Ng's machine learning courses and reviewed the transcript on logistic regression with Newton's method. However, I am running into a problem when implementing logistic regression with gradient descent: the resulting plot is not convex.
My code is below. I am using a vectorized implementation of the equations.
% 1. Load the data files from the current directory into Octave's memory
x = load('ex4x.dat');
y = load('ex4y.dat');
% 2. Prepend a column x0 of ones to the feature matrix.
% First take the number of training examples
m = length(y);
x = [ones(m, 1), x];
alpha = 0.1;                            % learning rate
max_iter = 100;
g = inline('1.0 ./ (1.0 + exp(-z))');   % sigmoid function
theta = zeros(size(x(1, :)))'; % theta must be 3x1 so it can multiply the m x 3 matrix x
j = zeros(max_iter, 1);        % column vector storing the cost J(theta) at each iteration
for num_iter = 1:max_iter
    % The hypothesis h is recomputed inside the loop because theta changes every iteration
    z = x * theta;
    h = g(z);  % apply the inline sigmoid defined above
    j(num_iter) = (1/m) * (-y' * log(h) - (1 - y)' * log(1 - h)); % vectorized cost J(theta)
    j
    grad = (1/m) * x' * (h - y);   % vectorized gradient of J(theta)
    theta = theta - alpha .* grad; % gradient descent update for theta
    theta
end
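For reference, the vectorized formulas the loop is meant to implement (written out from the code above) are:

h = g(X\theta), \qquad g(z) = \frac{1}{1 + e^{-z}}

J(\theta) = \frac{1}{m}\left(-y^{\top}\log(h) - (1-y)^{\top}\log(1-h)\right)

\nabla J(\theta) = \frac{1}{m} X^{\top}(h - y), \qquad \theta := \theta - \alpha\,\nabla J(\theta)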
The code above does not raise any errors, but it does not produce the expected convex plot.
I would be glad if someone could point out the mistake or explain what is causing the problem.
Thanks
[Comments]:
-
Could you show the plot it produces?
-
I have added the plot. If you connect the points, you can see that the cost repeatedly goes up and down with gradient descent, whereas it should decrease and then level off after a while; theta should be determined at the minimum of J. I did not see this behavior when I applied the same approach to J with Newton's method, where I got the correct output in only 10 iterations. Thanks for your help!!
-
About the plot: the X axis is the number of iterations and the Y axis is the cost function J(theta).
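-
The up-and-down cost described above is typical of a learning rate that is too large for unscaled features. As a minimal sketch (in Python, with synthetic data standing in for ex4x.dat/ex4y.dat, which are assumptions here, not the original files), the same vectorized update produces a non-increasing cost once the features are standardized:

```python
import numpy as np

# Synthetic exam-score-like data (assumption: stands in for ex4x.dat / ex4y.dat)
rng = np.random.default_rng(0)
m = 80
raw = rng.uniform(20, 90, size=(m, 2))
y = (raw[:, 0] + raw[:, 1] > 110).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gd_costs(features, alpha, iters):
    """Run vectorized gradient descent; return the cost J(theta) per iteration."""
    X = np.hstack([np.ones((len(features), 1)), features])  # prepend x0 = 1
    theta = np.zeros(X.shape[1])
    costs = []
    for _ in range(iters):
        h = sigmoid(X @ theta)
        eps = 1e-12  # guard against log(0) when h saturates
        costs.append(-(y @ np.log(h + eps) + (1 - y) @ np.log(1 - h + eps)) / m)
        theta -= alpha * (X.T @ (h - y)) / m   # same update as the Octave loop
    return costs

# Standardize the features (zero mean, unit variance), then alpha = 0.1 is safe
scaled = (raw - raw.mean(axis=0)) / raw.std(axis=0)
costs = gd_costs(scaled, alpha=0.1, iters=100)
print(all(a >= b for a, b in zip(costs, costs[1:])))
```

With the raw, unscaled features the same alpha can drive the sigmoid into saturation and make the cost bounce instead of decreasing, which matches the plot described in the comments.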
Tags: machine-learning octave logistic-regression gradient-descent