【Question title】: How to add one more measure of interest in the arules package
【Posted】: 2026-02-15 04:45:02
【Question description】:

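I would like to add two extra measures to the output of the inspect function in the arules package: Kulczynski and the imbalance ratio. Could you tell me where to find the code for inspect and how to modify it?

Thanks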

【Question discussion】:

    Tags: arules


    【Solution 1】:

    All you need to do is add extra columns to the quality data frame. inspect picks these up automatically. Here is an example from ?interestMeasure:

    library(arules)
    data("Income")
    rules <- apriori(Income)
    
    ## calculate a single measure and add it to the quality slot
    ## (current arules releases name this argument `measure`; older ones used `method`)
    quality(rules) <- cbind(quality(rules), 
      hyperConfidence = interestMeasure(rules, measure = "hyperConfidence",
         transactions = Income))
    
    inspect(head(sort(rules, by = "hyperConfidence")))
    
      lhs                                 rhs                                support confidence     lift hyperConfidence
    1 {ethnic classification=hispanic} => {education=no college graduate} 0.1096568  0.8636884 1.224731               1
    2 {dual incomes=no}                => {marital status=married}        0.1400524  0.9441176 2.447871               1
    3 {occupation=student}             => {marital status=single}         0.1449971  0.8838652 2.160490               1
    4 {occupation=student}             => {age=14-34}                     0.1592496  0.9707447 1.658345               1
    5 {occupation=student}             => {dual incomes=not married}      0.1535777  0.9361702 1.564683               1
    6 {occupation=student}             => {income=$0-$40,000}             0.1381617  0.8421986 1.353027               1
    

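    【Discussion】:

    • Thanks for the information. To compute Kulczynski and the imbalance ratio, I need the support of the individual itemsets on each side of every rule of interest. For example, the imbalance ratio equals |sup(A) - sup(B)| / (sup(A) + sup(B) - sup(A->B)).
    【Solution 2】:

    Imbalance is straightforward: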

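    library(arules)
    data("Income")
    rules <- apriori(Income)
    
    ## support of each side of the rule, counted in the transactions
    suppA <- support(lhs(rules), transactions = Income)
    suppB <- support(rhs(rules), transactions = Income)
    ## support of the whole rule, i.e. of lhs and rhs together
    suppAB <- quality(rules)$support
    quality(rules)$imbalance <- abs(suppA - suppB)/(suppA + suppB - suppAB)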
    
    inspect(head(rules))
      lhs                                 rhs                               support confidence     lift  imbalance
    1 {}                               => {language in home=english}      0.9128854  0.9128854 1.000000 0.03082862
    2 {occupation=clerical/service}    => {language in home=english}      0.1127109  0.9292566 1.017933 0.69021050
    3 {ethnic classification=hispanic} => {education=no college graduate} 0.1096568  0.8636884 1.224731 0.61395923
    4 {dual incomes=no}                => {marital status=married}        0.1400524  0.9441176 2.447871 0.35210356
    5 {dual incomes=no}                => {language in home=english}      0.1364165  0.9196078 1.007364 0.63837280
    6 {occupation=student}             => {marital status=single}         0.1449971  0.8838652 2.160490 0.34123127
    

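    The Kulczynski measure, 1/2 (P(A|B) + P(B|A)), is a bit trickier. P(B|A) is simply the confidence of A->B. For P(A|B), however, we need the confidence of B->A. So we create a new set of rules with the left- and right-hand sides swapped and compute their confidence: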

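     ## confidence of A -> B is already stored with the rules
     confAB <- quality(rules)$confidence
     ## swap lhs and rhs to get the rules B -> A, then compute their confidence
     BArules <- new("rules", lhs = rhs(rules), rhs = lhs(rules))
     confBA <- interestMeasure(BArules, measure = "confidence", transactions = Income)
     quality(rules)$kulczynski <- .5*(confAB + confBA)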
    
     inspect(head(rules))
        lhs                                 rhs                               support confidence     lift  imbalance kulczynski
      1 {}                               => {language in home=english}      0.9128854  0.9128854 1.000000 0.03082862  0.9564427
      2 {occupation=clerical/service}    => {language in home=english}      0.1127109  0.9292566 1.017933 0.69021050  0.5263616
      3 {ethnic classification=hispanic} => {education=no college graduate} 0.1096568  0.8636884 1.224731 0.61395923  0.5095922
      4 {dual incomes=no}                => {marital status=married}        0.1400524  0.9441176 2.447871 0.35210356  0.6536199
      5 {dual incomes=no}                => {language in home=english}      0.1364165  0.9196078 1.007364 0.63837280  0.5345211
      6 {occupation=student}             => {marital status=single}         0.1449971  0.8838652 2.160490 0.34123127  0.6191456
    

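    【Discussion】:

    • Just an update: imbalance, the Kulczynski measure, and many other measures are now available in the function interestMeasure.
    • Are these measures also available for sequence rules (via the arulesSequences package)?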
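Following the update in the discussion above, here is a minimal sketch of requesting both measures directly from interestMeasure. It assumes a recent arules release in which "kulczynski" and "imbalance" are accepted measure names; check ?interestMeasure for the names your installed version supports.

```r
library(arules)
data("Income")
rules <- apriori(Income)

# Assumption: the installed arules version accepts these two measure names.
# Requesting several measures at once returns a data frame with one column each.
extra <- interestMeasure(rules,
  measure = c("kulczynski", "imbalance"),
  transactions = Income)

# Append the new columns to the quality slot so inspect() shows them
quality(rules) <- cbind(quality(rules), extra)

inspect(head(sort(rules, by = "kulczynski")))
```

This avoids building the reversed rules by hand, since the library computes the required supports from the transactions itself.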