[Posted]: 2020-05-01 04:08:47
[Question]:
I am experimenting with the parsnip package using the Titanic dataset.
library(titanic)
library(dplyr)
library(tidymodels)
library(rattle)
library(rpart.plot)
library(RColorBrewer)
train <- titanic_train %>%
  mutate(Survived = factor(Survived),
         Sex = factor(Sex),
         Embarked = factor(Embarked))
test <- titanic_test %>%
  mutate(Sex = factor(Sex),
         Embarked = factor(Embarked))
spec_obj <-
  decision_tree(mode = "classification") %>%
  set_engine("rpart")
spec_obj
fit_obj <-
  spec_obj %>%
  fit(Survived ~ Pclass + Sex + Age + SibSp + Parch + Fare + Embarked, data = train)
fit_obj
fancyRpartPlot(fit_obj$fit)
pred <-
  fit_obj %>%
  predict(new_data = test)
pred
Suppose I want to add some parameters to my model specification.
spec_obj <- update(spec_obj, min_n = 50, cost_complexity = 0)
fit_obj <-
  spec_obj %>%
  fit(Survived ~ Pclass + Sex + Age + SibSp + Parch + Fare + Embarked, data = train)
fit_obj
fancyRpartPlot(fit_obj$fit)
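Is there any way to avoid specifying the model formula and the dataset again in the fit() function?
============== EDIT ================
I found that you can save the formula in a variable: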
f <- as.formula("Survived ~ Pclass + Sex + Age + SibSp + Parch + Fare + Embarked")
fit_obj <-
  spec_obj %>%
  fit(f, data = train)
fit_obj
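The same idea could be pushed a little further by wrapping the fit() call in a small closure, so that the formula and the data are supplied only once. The following is only a sketch under assumptions: `make_fitter()` and `fit_titanic()` are hypothetical names made up for illustration and are not part of parsnip; everything else reuses the calls already shown above.

```r
library(titanic)
library(tidymodels)

# Same preprocessing as in the question
train <- titanic_train %>%
  mutate(Survived = factor(Survived),
         Sex = factor(Sex),
         Embarked = factor(Embarked))

# Hypothetical helper (not a parsnip function): captures the formula and the
# data once and returns a function that only needs a model specification.
make_fitter <- function(formula, data) {
  function(spec) fit(spec, formula, data = data)
}

f <- Survived ~ Pclass + Sex + Age + SibSp + Parch + Fare + Embarked
fit_titanic <- make_fitter(f, train)

spec_obj <-
  decision_tree(mode = "classification") %>%
  set_engine("rpart")

# Fit the original spec, then an updated one, without repeating formula/data
fit_obj  <- fit_titanic(spec_obj)
fit_obj2 <- fit_titanic(update(spec_obj, min_n = 50, cost_complexity = 0))
```

With this, changing parameters only requires passing an updated specification to `fit_titanic()`; whether that is cleaner than repeating the fit() call is a matter of taste.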
Is there a better way?
[Comments]:
Tags: r tidymodels