【Question title】: How to count unique values over multiple columns using R?
【Posted】: 2020-05-31 10:59:30
【Question description】:

Suppose I have the following df:

1               2                    3
home, work      work, home           home, work
leisure, work   work, home, leisure  work, home
home, leisure   work, home           home, work

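I want to count all unique values across the whole data.frame (not per column or per row; I am interested in the cell values).

So the output should look like this: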

                    freq
home, work          3
leisure, work       1
home, leisure       1
work, home          3
work, home, leisure 1

I have not found a way to do this yet. The count() function seems to work on a single column only.

Thanks a lot for your help :)

【Question comments】:

    Tags: r dataframe count


    【Solution 1】:

    You can unlist the data frame and use table to count the values in base R:

    stack(table(unlist(df)))
    #Same as
    #stack(table(as.matrix(df)))
    
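If you want the result as an ordinary data frame with a `freq` column, as in the question, the base R idea above can be sketched as follows. The column names `value` and `freq` and the sorting step are assumptions added for illustration; they are not part of the original answer.

```r
# Sample data matching the "Data" section of this answer
df <- data.frame(
  X1 = c("home,work", "leisure,work", "home,leisure"),
  X2 = c("work,home", "work,home,leisure", "work,home"),
  X3 = c("home,work", "work,home", "home,work"),
  stringsAsFactors = FALSE
)

# Flatten every cell into one vector, then tabulate it.
# Naming the argument `value` names the table dimension, which
# becomes the first column name after as.data.frame().
tab <- table(value = unlist(df))

# Convert to a data frame; responseName sets the count column name
counts <- as.data.frame(tab, responseName = "freq", stringsAsFactors = FALSE)

# Most frequent values first
counts <- counts[order(-counts$freq), ]
print(counts)
```

Unlike `stack()`, this keeps the values as plain character strings and lets you choose both column names.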

    If you prefer the tidyverse, use pivot_longer to get the data into long format, then count.

    library(dplyr)  # provides the %>% pipe
    df %>%
      tidyr::pivot_longer(cols = everything()) %>%
      dplyr::count(value)
    
    # A tibble: 5 x 2
    #  value                 n
    #  <chr>             <int>
    #1 home,leisure          1
    #2 home,work             3
    #3 leisure,work          1
    #4 work,home             3
    #5 work,home,leisure     1
    
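To get a column named `freq`, sorted by frequency as in the desired output, a small variation of the pipeline above may help. This is a sketch, assuming a dplyr version that supports the `name` and `sort` arguments of `count()`; the names `column` and `freq` are chosen here for illustration.

```r
library(dplyr)
library(tidyr)

# Sample data matching the "Data" section of this answer
df <- data.frame(
  X1 = c("home,work", "leisure,work", "home,leisure"),
  X2 = c("work,home", "work,home,leisure", "work,home"),
  X3 = c("home,work", "work,home", "home,work"),
  stringsAsFactors = FALSE
)

counts <- df %>%
  # One row per cell: `column` holds the source column, `value` the cell
  pivot_longer(cols = everything(), names_to = "column", values_to = "value") %>%
  # Count each distinct cell value, most frequent first
  count(value, name = "freq", sort = TRUE)

print(counts)
```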

    Data

    df <- structure(list(X1 = c("home,work", "leisure,work", "home,leisure"
    ), X2 = c("work,home", "work,home,leisure", "work,home"), X3 = c("home,work", 
    "work,home", "home,work")), class = "data.frame", row.names = c(NA, -3L))
    

    【Comments】:

    • Wow, that was fast! Thanks! :D I will accept the answer as soon as I can.
    【Solution 2】:

    Using the tidyverse, we can use gather:

    library(dplyr)
    library(tidyr)
    df %>% 
       gather %>% 
       count(value)
    #              value n
    #1      home,leisure 1
    #2         home,work 3
    #3      leisure,work 1
    #4         work,home 3
    #5 work,home,leisure 1
    

    Data

    df <- structure(list(X1 = c("home,work", "leisure,work", "home,leisure"
    ), X2 = c("work,home", "work,home,leisure", "work,home"), X3 = c("home,work", 
    "work,home", "home,work")), class = "data.frame", row.names = c(NA, -3L))
    

    【Comments】:
