【Question title】: Extracting variable names from decision tree [duplicate]
【Posted】: 2019-10-08 01:10:47
【Question】:

I built a decision tree in R with the tree package, and calling summary() on the tree gave me:

Classification tree:
tree(formula = High temperature ~ ., data = summer.train)
Variables actually used in tree construction:
[1] "Humidity" "Cloudy"   "Airy"     "Dry"      "Windy"
Number of terminal nodes:  12
Residual mean deviance:  0.3874 = 377.7 / 975 
Misclassification error rate: 0.08909 = 89 / 999 

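I want to extract the variables used in tree construction, as listed in the summary output above ("Airy", "Dry", and so on). Is there a way to do that?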

【Comments】:

    Tags: r


    【Solution 1】:

    This is a duplicate of the linked question:

    Used Variables in Tree

    That solution did work for me. I tested it on the well-known spam dataset:

    library(kernlab)
    library(tree)
    
    data(spam)
    
    spam_tree_def <- tree(type~.,data=spam)
    summary(spam_tree_def)
    

    Summary output:

    Classification tree:
    tree(formula = type ~ ., data = spam)
    Variables actually used in tree construction:
     [1] "charDollar"      "remove"          "charExclamation" "hp"              "capitalLong"     "our"            
     [7] "capitalAve"      "free"            "george"          "edu"            
    Number of terminal nodes:  13 
    Residual mean deviance:  0.4879 = 2238 / 4588 
    Misclassification error rate: 0.08259 = 380 / 4601 
    

    To extract the names you want:

    as.character(summary(spam_tree_def)$used)
    
    [1] "charDollar"      "remove"          "charExclamation" "hp"              "capitalLong"     "our"            
     [7] "capitalAve"      "free"            "george"          "edu" 
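
As a cross-check, the same names can also be read from the fitted object itself. The following is a minimal sketch, assuming the fitted object follows the layout documented for tree objects, where `frame$var` holds the split variable of each node and terminal nodes are marked `"<leaf>"`. The helper name `used_split_vars` is hypothetical and not part of the tree package.

```r
# Hypothetical helper (not part of the tree package): collect the distinct
# split variables recorded in a fitted tree's node table.
# Assumes fit$frame$var marks terminal nodes with "<leaf>".
used_split_vars <- function(fit) {
  node_vars <- as.character(fit$frame$var)  # factor -> plain strings
  unique(node_vars[node_vars != "<leaf>"])  # drop leaves, keep first-seen order
}

# Usage with the tree fitted above (requires the tree and kernlab packages):
# used_split_vars(spam_tree_def)
```

This reads the node table directly, so it does not depend on how the summary object is structured, and the result is already a character vector.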
    

    【Comments】:

    • Just the ordinary tree package