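【Question title】: Questions about xgboost with R
【Posted】: 2016-03-27 07:20:05
【Question description】:

I am using xgboost for logistic regression. I followed the steps from the linked guide, but ran into two problems. The dataset is linked in the original post.

First, when I run the following code: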

bst <- xgboost(data = sparse_matrix, label = output_vector,nrounds = 39,param)

I then get:

 [0]train-rmse:0.350006
 [1]train-rmse:0.245008
 [2]train-rmse:0.171518
 [3]train-rmse:0.120065
 [4]train-rmse:0.084049
 [5]train-rmse:0.058835
 [6]train-rmse:0.041185
 [7]train-rmse:0.028830
 [8]train-rmse:0.020182
 [9]train-rmse:0.014128
[10]train-rmse:0.009890
[11]train-rmse:0.006923
[12]train-rmse:0.004846
[13]train-rmse:0.003392
[14]train-rmse:0.002375
[15]train-rmse:0.001662
[16]train-rmse:0.001164
[17]train-rmse:0.000815
[18]train-rmse:0.000570
[19]train-rmse:0.000399
[20]train-rmse:0.000279
[21]train-rmse:0.000196
[22]train-rmse:0.000137
[23]train-rmse:0.000096
[24]train-rmse:0.000067
[25]train-rmse:0.000047
[26]train-rmse:0.000033
[27]train-rmse:0.000023
[28]train-rmse:0.000016
[29]train-rmse:0.000011
[30]train-rmse:0.000008
[31]train-rmse:0.000006
[32]train-rmse:0.000004
[33]train-rmse:0.000003
[34]train-rmse:0.000002
[35]train-rmse:0.000001
[36]train-rmse:0.000001
[37]train-rmse:0.000001
[38]train-rmse:0.000000

train-rmse finally reaches 0! Is this normal? As far as I know, train-rmse should not normally be able to reach 0. So where is my problem?

Second, when I run

importance <- xgb.importance(sparse_matrix@Dimnames[[2]], model = bst)

I then get an error:

Error in eval(expr, envir, enclos) : object 'Yes' not found

I don't know what this means; perhaps the first problem leads to the second.

library(data.table)
train_x<-fread("train_x.csv")
str(train_x)
train_y<-fread("train_y.csv")
str(train_y)
train<-merge(train_y,train_x,by="uid")
train$uid<-NULL
test<-fread("test_x.csv")
require(xgboost)
require(Matrix)
sparse_matrix <- sparse.model.matrix(y~.-1, data = train)
head(sparse_matrix)
output_vector = train[,y] == "Marked"
param <- list(objective = "binary:logistic", booster = "gblinear",
          nthread = 2, alpha = 0.0001,max.depth = 4,eta=1,lambda = 1)
bst <- xgboost(data = sparse_matrix, label = output_vector,nrounds = 39,param)
importance <- xgb.importance(sparse_matrix@Dimnames[[2]], model = bst)

【Comments】:

    Tags: r regression logistic-regression xgboost


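    【Answer 1】:

    Maybe this helps. I regularly hit the same error when the label has zero variance. This is with the current CRAN release of xgboost, which is already somewhat old (0.4.4). xgb.train happily accepts such a label (reporting an AUC of 0.50), but the error then appears when xgb.importance is called.

    Cheers

    Otto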

    [0] train-auc:0.500000  validate-auc:0.500000
    [1] train-auc:0.500000  validate-auc:0.500000
    [2] train-auc:0.500000  validate-auc:0.500000
    [3] train-auc:0.500000  validate-auc:0.500000
    [4] train-auc:0.500000  validate-auc:0.500000
    
    [1] "XGB error: Error in eval(expr, envir, enclos): object 'Yes' not found\n"
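A short sketch of why a constant label is consistent with the log in the question. It assumes the booster started from xgboost's default `base_score` of 0.5 with the default `eta` of 0.3, and that every label was 0; none of this is confirmed by the asker. Under those assumptions each round removes roughly 30% of the remaining residual, so the training RMSE would be about 0.5 × 0.7^k after k rounds:

```r
# Assumed values (not confirmed by the question): xgboost defaults and an all-zero label.
base_score <- 0.5   # default initial prediction
eta        <- 0.3   # default learning rate
rounds     <- 1:5

# With every label equal to 0, the residual shrinks by (1 - eta) each round.
expected_rmse <- base_score * (1 - eta)^rounds

# First five values copied from the log in the question.
logged_rmse <- c(0.350006, 0.245008, 0.171518, 0.120065, 0.084049)

comparison <- data.frame(round = rounds - 1,
                         expected = expected_rmse,
                         logged = logged_rmse)
print(comparison)

# Largest absolute difference between the two columns.
max_gap <- max(abs(expected_rmse - logged_rmse))
print(max_gap)
```

The close match with `eta = 0.3`, although the question's `param` list sets `eta = 1`, together with the `rmse` metric in the log, hints that the unnamed `param` argument may not have been picked up as the parameter list. Passing it by name as `params = param` would rule that out.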
    

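    【Comments】:

    • I saw the same error when my train-auc was 0.5000, so there was no variation in my predictions.
    【Answer 2】:

    I ran into the same problem (Error in eval(expr, envir, enclos) : object 'Yes' not found.), for the following reason:

    I tried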

    dt = data.table(x = runif(10), y = 1:10, z = 1:10)
    label = as.logical(dt$z)
    train = dt[, z := NULL]
    trainAsMatrix = as.matrix(train)
    label = as.matrix(label)
    
    bst <- xgboost(data = trainAsMatrix, label = label, max.depth = 8,
                   eta = 0.3, nthread = 2, nround = 50, objective = "reg:linear")
    bst$featureNames = names(train)
    xgb.importance(model = bst)
    

    The problem is in the line

    label = as.logical(dt$z)
    

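    I had put that line there because the last time I used xgboost I wanted to predict a categorical variable. Now that I want to do regression, it should be: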

    label = dt$z
    

    Maybe something similar is causing the problem in your case?
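To catch this kind of problem before training, it may help to inspect the label directly. The sketch below uses only base R; `check_label` is a hypothetical helper, not part of xgboost. It reproduces the `as.logical` collapse from this answer and also shows how a comparison like `y == "Marked"` yields a constant label if `y` is coded as 0/1 (an assumption about the asker's data, which is not shown).

```r
# Hypothetical helper: stop early if a label vector has fewer than two distinct values.
check_label <- function(label) {
  n_distinct <- length(unique(label))
  if (n_distinct < 2) {
    stop("label is constant; the model has nothing to learn")
  }
  invisible(n_distinct)
}

# Case from this answer: every non-zero integer becomes TRUE.
z <- 1:10
collapsed <- as.logical(z)
print(table(collapsed))

# Assumed case for the question: y stored as 0/1 and compared
# against a string it never contains, so every element is FALSE.
y <- c(0, 1, 1, 0)
marked <- y == "Marked"
print(table(marked))

# Both collapsed labels are rejected; the untouched numeric values pass.
result_collapsed <- tryCatch(check_label(collapsed), error = function(e) "constant")
result_marked    <- tryCatch(check_label(marked),    error = function(e) "constant")
result_numeric   <- check_label(z)
```

In the asker's script, calling `table(output_vector)` right after the vector is built would show whether both classes are actually present.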

    【Comments】: