【Question Title】: Extract Matrix from C5.0 Model
【Posted】: 2017-02-08 20:04:09
【Question】:

After fitting a C5.0 model to my data,

a <- C5.0(FACTOR~.,data = i_data,trials=10,costs = matrix(c(0,1,4,0), nrow = 2))

and then printing the model summary,

summary(a)

I get output like this:

.
.
.
.

SubTree [S1]

Col_L > 89: N (195.6/6.5)
Col_L <= 89:
:...Col_Q > 4657: Y (66.6/34)
    Col_Q <= 4657:
    :...Col_F > 15: Y (117.6/75)
        Col_F <= 15:
        :...Col_C <= 5.6926: N (2040.5/266.7)
            Col_C > 5.6926: Y (148.7/104.4)

SubTree [S2]

Col_E > 14: N (2523.3/176.8)
Col_E <= 14:
:...Col_G > 5: N (83.4/1.4)
    Col_G <= 5:
    :...Col_O > 6880: Y (41.8/22)
        Col_O <= 6880:
        :...Col_G <= 3: N (1939.9/230.1)
            Col_G > 3: Y (92.7/64.5)


Evaluation on training data (53392 cases):

Trial          Decision Tree       
-----     -----------------------  
  Size      Errors   Cost  

   0        87 16173(30.3%)   0.35
   1        25 14071(26.4%)   0.43
   2        48 15295(28.6%)   0.74
   3        50 14672(27.5%)   0.48
   4        43 16765(31.4%)   0.55
   5        52 16346(30.6%)   0.98
   6        58 18277(34.2%)   0.52
   7        65 13940(26.1%)   0.64
   8        63 14020(26.3%)   0.42
   9        57 13517(25.3%)   0.45
   boost           13284(24.9%)   0.39   <<


   (a)   (b)    <-classified as
  ----  ----
 15848 10848    (a): class N
  2436 24260    (b): class Y


Attribute usage:

100.00% Col_A
100.00% Col_B
100.00% Col_C
100.00% Col_D
100.00% Col_E
 99.79% Col_F
 99.63% Col_G
 76.66% Col_H
 76.55% Col_I
 75.64% Col_J
 70.22% Col_K
 65.15% Col_L
 59.01% Col_M
 58.94% Col_N
 42.54% Col_O
 33.01% Col_P
 21.73% Col_Q
 16.58% Col_R
 12.69% Col_S
  8.43% Col_T

Is there a way to extract this confusion matrix

 (a)   (b)    <-classified as
  ----  ----
 15848 10848    (a): class N
  2436 24260    (b): class Y

from the summary above, so that I can load it in another R instance?

【Comments】:

    Tags: r


    【Solution 1】:

    C5.0 stores the summary as plain text, but you can extract the matrix from that text like this:

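    # Example from ?C5.0
    library(C50)
    data(churn)  # bundled with C50 when this was written; newer releases may not include it
    treeModel <- C5.0(x = churnTrain[, -20], y = churnTrain$churn)
    treeModel
    # Save the summary in b; b$output holds the printed text
    b <- summary(treeModel)
    
    # Start of the matrix: position of the first '(a)'
    pos1 <- regexpr("(a)", b$output, fixed = TRUE)
    # End of the matrix: label of the last class ('class no' here; 'class Y' in your case)
    last_label <- "class no"
    pos2 <- regexpr(last_label, b$output, fixed = TRUE)
    # Extend the end position so the whole label is included, not just its first character
    conf_text <- substr(b$output, pos1, pos2 + nchar(last_label) - 1)
    
    # Print
    cat(conf_text)
    

    Output:

    (a)   (b)    <-classified as
    ----  ----
    365   118    (a): class yes
     18  2832    (b): class no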
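
To load the result in another R instance, it may help to turn the extracted text into a numeric matrix and save it to disk. The following is a minimal sketch in base R, not part of the C50 package: `parse_confusion` is a hypothetical helper, and the sample text is copied from the question rather than produced by a live model. It assumes every cell of the matrix is printed as a number (a cell that C5.0 leaves blank would break the parsing) and that rows are the actual classes and columns the predicted ones.

```r
# Hypothetical helper (not a C50 function): parse the confusion matrix
# out of the summary text, e.g. b$output from summary(treeModel).
parse_confusion <- function(output) {
  lines <- strsplit(output, "\n", fixed = TRUE)[[1]]
  # Matrix rows end in "(a): class N", "(b): class Y", ...
  rows <- grep("\\([a-z]+\\): class ", lines, value = TRUE)
  if (length(rows) == 0) stop("no confusion matrix rows found")
  # Class labels follow ": class "
  classes <- trimws(sub("^.*\\([a-z]+\\): class ", "", rows))
  # Counts are the whitespace-separated numbers before the first "("
  counts <- lapply(strsplit(trimws(sub("\\(.*$", "", rows)), "\\s+"), as.integer)
  # Assumption: a square matrix with no blank cells
  if (any(lengths(counts) != length(rows))) stop("unexpected row layout")
  m <- do.call(rbind, counts)
  dimnames(m) <- list(actual = classes, predicted = classes)
  m
}

# Sample text copied from the question's summary output
sample_output <- paste(
  "   (a)   (b)    <-classified as",
  "  ----  ----",
  " 15848 10848    (a): class N",
  "  2436 24260    (b): class Y",
  sep = "\n"
)

conf <- parse_confusion(sample_output)

# Save the matrix, then read it back (readRDS would run in the other R instance)
path <- file.path(tempdir(), "confusion.rds")
saveRDS(conf, path)
restored <- readRDS(path)
print(restored)
```

With a real model, `parse_confusion(b$output)` would take the place of the sample text, and the file path would point somewhere both R sessions can reach.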
    
