【Question Title】: Plotting a classification decision boundary line based on perceptron coefficients
【Posted】: 2016-10-27 16:51:39
【Question Description】:

This is effectively a duplicate of this question. However, I want to ask a very specific question about drawing the decision boundary line based on the perceptron coefficients I obtained through a basic, hand-coded experiment. As you can see, the coefficients extracted from logistic regression produce a nice decision boundary line:

Based on the glm() results:

(Intercept)       test1       test2 
   1.718449    4.012903    3.743903 

The coefficients from the perceptron experiment are completely different:

     bias     test1     test2 
 9.131054 19.095881 20.736352 
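The raw magnitudes differ, but only the ratios of the coefficients determine the boundary line theta^T x = 0. As a quick sketch (not part of the original post), dividing each vector by its test2 coefficient puts the two on a comparable scale:

```r
# Only coefficient ratios matter for the line theta^T x = 0; rescaling a
# coefficient vector by any positive constant leaves the boundary unchanged.
glm_coef  <- c(intercept = 1.718449, test1 = 4.012903,  test2 = 3.743903)
perc_coef <- c(bias      = 9.131054, test1 = 19.095881, test2 = 20.736352)
glm_coef  / glm_coef[["test2"]]   # ~ (0.459, 1.072, 1)
perc_coef / perc_coef[["test2"]]  # ~ (0.440, 0.921, 1)
```

The normalized vectors are similar in scale, though not identical, so the two methods draw similar but not coinciding lines.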

To make answering easier, here is the data, and here is the code:

# DATA PRE-PROCESSING:
dat = read.csv("perceptron.txt", header=F)
dat[,1:2] = apply(dat[,1:2], MARGIN = 2, FUN = function(x) scale(x)) # scaling the data
data = data.frame(rep(1,nrow(dat)), dat) # introducing the "bias" column
colnames(data) = c("bias","test1","test2","y")
data$y[data$y==0] = -1 # Turning 0/1 dependent variable into -1/1.
data = as.matrix(data) # Turning the data.frame into a matrix to avoid matrix-multiplication problems.

# PERCEPTRON:
set.seed(62416)
no.iter = 1000                           # Number of loops
theta = rnorm(ncol(data) - 1)            # Starting a random vector of coefficients.
theta = theta/sqrt(sum(theta^2))         # Normalizing the vector.
h = theta %*% t(data[,1:3])              # Performing the first f(theta^T X)

for (i in 1:no.iter){                    # We will recalculate 1,000 times
  for (j in 1:nrow(data)){               # Each time we go through each example.
      if(h[j] * data[j, 4] < 0){         # If the hypothesis disagrees with the sign of y,
      theta = theta + (sign(data[j,4]) * data[j, 1:3]) # We + or - the example from theta.
      }
      else
      theta = theta                      # Else we let it be.
  }
  h = theta %*% t(data[,1:3])            # Calculating h() after iteration.
}
theta                                    # Final coefficients
mean(sign(h) == data[,4])                # Accuracy

Question: given only the perceptron coefficients, how do we draw the boundary line (the way I did above with the logistic regression coefficients)?

【Question Discussion】:

    Tags: r plot machine-learning


    【Solution 1】:

    Well... it turns out it works exactly as in the case of logistic regression, even though the coefficients are very different: take the minimum and maximum of the abscissa (test1), add a small margin, compute the test2 values that lie on the decision boundary (where 0 = theta_0 + theta_1 * test1 + theta_2 * test2), and draw a line between those points:

    palette(c("tan3","purple4"))
    plot(test2 ~ test1, col = as.factor(y), pch = 20,
         data = as.data.frame(data),  # data was coerced to a matrix above; the formula interface needs a data.frame
         main = "College admissions")
    (x = c(min(data[,2]) - .2, max(data[,2]) + .2))     # abscissa range plus a small margin
    (y = c((-1/theta[3]) * (theta[2] * x + theta[1])))  # test2 values on the boundary theta^T x = 0
    lines(x, y, lwd = 3, col = rgb(.7, 0, .2, .5))
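Equivalently, the boundary theta_0 + theta_1 * test1 + theta_2 * test2 = 0 can be rewritten in slope/intercept form, test2 = -(theta_0 + theta_1 * test1) / theta_2. A minimal sketch, using the perceptron coefficients printed in the question:

```r
# Slope/intercept form of the perceptron decision boundary.
theta <- c(bias = 9.131054, test1 = 19.095881, test2 = 20.736352)  # values from the question
slope     <- unname(-theta[["test1"]] / theta[["test2"]])
intercept <- unname(-theta[["bias"]]  / theta[["test2"]])
c(intercept = intercept, slope = slope)
# abline(intercept, slope) would then draw the same line as the lines() call above.
```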
    

    【Discussion】:

      【Solution 2】:

      The perceptron weights are computed so that an example is classified as positive when theta^T X > 0 and as negative when theta^T X < 0; the decision boundary is therefore the set of points where theta^T X = 0.

      The same logic applies to logistic regression, except that the condition is now sigmoid(theta^T X) > 0.5.
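The equivalence of the two thresholds can be checked directly: the sigmoid is monotone with sigmoid(0) = 0.5, so sigmoid(z) > 0.5 exactly when z > 0, and logistic regression's boundary is also the hyperplane theta^T X = 0. A minimal sketch:

```r
# sigmoid(z) > 0.5  <=>  z > 0, since sigmoid is increasing and sigmoid(0) = 0.5.
sigmoid <- function(z) 1 / (1 + exp(-z))
z <- c(-2, -0.5, 0, 0.5, 2)
all((sigmoid(z) > 0.5) == (z > 0))  # both rules classify every point identically
```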

      【Discussion】:

      • The usual threshold for the sigmoid function is 0.5, which corresponds to the zero crossing of Theta^T X.
      • You are right. I forgot that the sigmoid ranges from 0 to 1.
      • I thought my question was quite specific, so I left the answer up rather than deleting the whole post, in case others can benefit from it (in the meantime, I have seen similar questions).