【Posted】: 2018-01-15 16:22:32
【Question】:
I want to apply observation weights in caret using the code below:
    model_weights <- ifelse(train$y == 0,
                            (1/table(train$y)[1]) * 0.5,
                            (1/table(train$y)[2]) * 0.5)
    xgbT <- train(x = as.matrix(train[,-21]), y = make.names(as.factor(train$y)),
                  method = "xgbTree",
                  trControl = cctrl1,
                  metric = "MCC",
                  maximize = TRUE,
                  weights = model_weights,
                  preProc = c("center", "scale"),
                  tuneGrid = expand.grid(nrounds = c(150),                 # number of trees
                                         max_depth = c(7),                 # max tree depth
                                         eta = c(0.03),                    # learning rate
                                         gamma = c(0.3),                   # min split loss
                                         colsample_bytree = c(0.7),
                                         min_child_weight = c(10, 1, 5),   # min number of instances in the leaf
                                         subsample = c(0.6)),              # subsample ratio of the training instance
                  early_stop_round = c(3),                                 # if no improvements over specified rounds
                  objective = c("binary:logistic"),
                  silent = 0)
However, it gives me this error:
    Error in model.frame.default(formula = .outcome ~ ., data = dat, weights = wts) :
      variable lengths differ (found for '(weights)')
Yet I have checked that they have the same length, using the code below:
    > table(model_weights)
    model_weights
    0.0000277654375832963  0.000231481481481481
                    18008                  2160
    > table(train$y)
        0     1
    18008  2160
Any idea how to fix this?
Note: I can run the train function without the weights argument.
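For reference, here is a minimal sketch of the weighting scheme and the length check on toy data. It assumes a hypothetical data frame `toy` (90 zeros, 10 ones) in place of the real `train` data, and the names `toy`, `class_counts` and `toy_weights` are illustrative only. `table()` reports only how often each distinct value occurs, so comparing `length()` against `nrow()` is a more direct test that there is one weight per row.

```r
# Toy stand-in for the question's data (hypothetical: 90 zeros, 10 ones).
toy <- data.frame(x1 = seq_len(100), y = rep(c(0, 1), times = c(90, 10)))

# Same weighting scheme as in the question: each class gets a total weight of 0.5.
class_counts <- table(toy$y)
toy_weights <- ifelse(toy$y == 0,
                      0.5 / class_counts[["0"]],
                      0.5 / class_counts[["1"]])

# Direct checks: one weight per row, no missing values, equal total per class.
length(toy_weights) == nrow(toy)
anyNA(toy_weights)
tapply(toy_weights, toy$y, sum)
```

If these checks pass on the full data as well, the weight vector is consistent with the data frame as a whole, which is also what the `table()` output above indicates.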
【Comments】:
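-
@missuse Yes, I checked it here: topepo.github.io/caret/…
-
Which one? Please share the link.
-
Sorry, it's morning... I thought you had linked it. Here it is: the test_class_cv_form_weight example.
-
When I run topepo's code with your model_weights, I still get a result. The problem must be in your data. Maybe try reducing the data to a smaller sample that reproduces the problem, and provide it with dput or as a download link.
-
Try using weights = model_weights/max(model_weights). If I don't, I get output, but the model is nonsensical.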
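The two suggestions from the comments (sharing a smaller sample via `dput`, and rescaling the weights by their maximum) could be sketched roughly as follows. This is only an illustration on hypothetical toy data: `toy`, `keep`, `small`, `small_weights` and `scaled_weights` are made-up names, and the sample size of 10 rows per class is an arbitrary assumption. It does not reproduce the original error.

```r
# Hypothetical toy data standing in for the asker's `train` data frame.
set.seed(1)
toy <- data.frame(x1 = rnorm(200), y = rep(c(0, 1), times = c(180, 20)))
toy_weights <- ifelse(toy$y == 0, 0.5 / 180, 0.5 / 20)

# Draw a small sample per class so that both classes stay represented.
keep <- unlist(lapply(split(seq_len(nrow(toy)), toy$y),
                      function(idx) idx[sample.int(length(idx), 10)]))
small <- toy[keep, ]
small_weights <- toy_weights[keep]

# dput() prints R code that recreates the object, ready to paste into a post.
dput(small)

# Rescale so the largest weight is 1; ratios between weights are unchanged.
scaled_weights <- small_weights / max(small_weights)
range(scaled_weights)
```

Subsetting the weights with the same index as the rows keeps the two aligned, so the reduced example can be passed to `train` with `weights = scaled_weights` to see whether the error still occurs.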