【Question Title】: Rpart R Decision Tree Score [duplicate]
【Posted】: 2021-05-30 04:07:26
【Question】:

In Python with scikit-learn, you can fit a decision tree and get its score as follows:

from sklearn import tree

tr = tree.DecisionTreeClassifier(random_state=rseed, min_samples_split=2, ccp_alpha=0.005)
model_tree = tr.fit(train_features, train_outputs)

print(f'Model Train Accuracy: {model_tree.score(train_features, train_outputs)}')
print(f'Model Test Accuracy: {model_tree.score(test_features, test_outputs)}')

This produces:

Model Train Accuracy: 0.5942
Model Test Accuracy: 0.4933

How can I get a similar score (on both the training and test data) in R using rpart?

【Comments】:

Tags: python r scikit-learn decision-tree rpart


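【Solution 1】:

In short:

  1. Compute the error rate or accuracy from the predictions, as shown below (accuracy = 1 - error rate).
  2. Make sure the same parameters and control parameters are used in Python and R (see https://www.rdocumentation.org/packages/rpart/versions/4.1-15/topics/rpart.control). Note that rpart's cp and scikit-learn's ccp_alpha are not defined on the same scale, so identical numeric values will not necessarily produce the same tree.
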
library(rpart)

# Fit a classification tree; the original elides further control parameters
model_tree <- rpart(Response ~ Predictor1 + PredictorX,
                    data = train, method = "class",
                    control = rpart.control(cp = 0.005, minsplit = 2))

# Predicted classes on the training data (no newdata) and on the test data
pred_train <- predict(model_tree, type = "class")
pred_test <- predict(model_tree, newdata = test, type = "class")

# Accuracy on the training set (the error rate is 1 - accuracy)
mean(pred_train == train$Response)

# Accuracy on the test set
mean(pred_test == test$Response)
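
For comparison with the Python side: scikit-learn's score method for classifiers returns the mean accuracy, that is, the fraction of predictions that match the true labels, and the error rate is its complement. Below is a minimal pure-Python sketch of that relationship. The accuracy helper and the label lists are hypothetical and only illustrate the arithmetic; they do not come from the question's data.

```python
def accuracy(predicted, actual):
    """Return the fraction of predictions that equal the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


# Hypothetical labels, for illustration only
actual = ["a", "b", "a", "c", "b"]
predicted = ["a", "b", "c", "c", "a"]

acc = accuracy(predicted, actual)  # fraction of correct predictions
err = 1 - acc                      # fraction of incorrect predictions

print(f"Accuracy: {acc:.2f}")
print(f"Error rate: {err:.2f}")
```

When comparing numbers between R and Python, check that both sides report the same quantity, since an accuracy of 0.59 corresponds to an error rate of 0.41.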

【Discussion】:
