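【Question title】: R: count values in rows after dcast
【Posted】: 2020-12-24 10:28:14
【Question】:

After running dcast from the reshape2 package, I want to sum all the values in each row of the data frame. The problem is that every row gets the same value (10), which is the total over all rows. The values should be 4, 2, 4. Sample data with code: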

library(dplyr)     # %>%, group_by(), summarise()
library(reshape2)  # dcast()

df <- data.frame(x = as.factor(c("A","A","A","A","B","B","C","C","C","C")),
                 y = as.factor(c("AA","AB","AA","AC","BB","BA","CC","CC","CC","CD")),
                 z = c("var1","var1","var2","var1","var2","var1","var1","var2","var2","var1"))

df2 <- df %>%
  group_by(x,y) %>%
  summarise(num = n()) %>%
  ungroup()

df3 <- dcast(df2, x ~ y, value.var = "num", fill = 0)

df3$total <- sum(df3$AA,df3$AB,df3$AC,df3$BA,df3$BB,df3$CC,df3$CD)

【Comments】:

    Tags: r dplyr reshape2


    【Solution 1】:

    sum gives you a single combined value, and that one value is then repeated for every row.

    sum(df3$AA,df3$AB,df3$AC,df3$BA,df3$BB,df3$CC,df3$CD)
    #[1] 10
    

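    You need rowSums to get the sum of each row separately. Run it in place of the sum line, before a total column exists, because df3[-1] selects every column except x.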

    df3$total <- rowSums(df3[-1])
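
The difference between the two functions can be seen without reshape2 at all. The following is a minimal base R sketch; the data frame is a hand-typed copy of the wide table from the question (an assumption made so the block runs on its own), and the `recycled` and `grand_total` names are introduced here for illustration only.

```r
# Hand-built copy of the wide table that dcast produces for the question's data
df3 <- data.frame(
  x  = c("A", "B", "C"),
  AA = c(2, 0, 0), AB = c(1, 0, 0), AC = c(1, 0, 0),
  BA = c(0, 1, 0), BB = c(0, 1, 0), CC = c(0, 0, 3), CD = c(0, 0, 1)
)

# sum() collapses all count columns into one scalar
grand_total <- sum(df3[-1])

# Assigning a length-1 value to a column recycles it down every row,
# which is why the question saw the same number in each row
df3$recycled <- grand_total

# rowSums() returns one value per row; name the count columns explicitly
# so the helper column added above is not included
df3$total <- rowSums(df3[c("AA", "AB", "AC", "BA", "BB", "CC", "CD")])

print(df3)
```

Selecting the count columns by name, rather than with `df3[-1]`, keeps the result stable if other columns have already been added to the data frame.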
    

    Here is a simplified tidyverse approach, starting from df:

    library(dplyr)
    library(tidyr)
    
    df %>%
      count(x, y, name = 'num') %>%
      pivot_wider(names_from = y, values_from = num, values_fill = 0) %>%
      mutate(total = rowSums(select(., AA:CD)))
    
    #  x        AA    AB    AC    BA    BB    CC    CD total
    #  <fct> <int> <int> <int> <int> <int> <int> <int> <dbl>
    #1 A         2     1     1     0     0     0     0     4
    #2 B         0     0     0     1     1     0     0     2
    #3 C         0     0     0     0     0     3     1     4
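
The `rowSums(select(., AA:CD))` step relies on the magrittr `.` placeholder. As a hedged alternative, recent dplyr versions (1.1.0 or later, which is an assumption about the installed version) provide `pick()` for selecting columns inside `mutate()`. A sketch under that assumption, with the `wide` name introduced here for illustration:

```r
library(dplyr)
library(tidyr)

# Same sample data as in the question (z is not needed for the counts)
df <- data.frame(
  x = as.factor(c("A","A","A","A","B","B","C","C","C","C")),
  y = as.factor(c("AA","AB","AA","AC","BB","BA","CC","CC","CC","CD"))
)

wide <- df %>%
  count(x, y, name = "num") %>%                       # one row per (x, y) pair
  pivot_wider(names_from = y, values_from = num,
              values_fill = 0) %>%                    # missing pairs become 0
  mutate(total = rowSums(pick(AA:CD)))                # per-row sum, no `.` needed

print(wide)
```

Because `pick()` does not depend on the `.` placeholder, the same `mutate()` call should also work if the pipeline is rewritten with the native `|>` pipe.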
    

    【Comments】:

      【Solution 2】:

      We can specify values_fn in pivot_wider and then use adorn_totals from janitor to add the row totals.

      library(dplyr)
      library(tidyr)
      library(janitor)
      df %>% 
         pivot_wider(names_from = y, values_from = z, values_fill = 0, 
               values_fn = length) %>%
         adorn_totals("col")
      

      -output

      # x AA AB AC BB BA CC CD Total
      # A  2  1  1  0  0  0  0     4
      # B  0  0  0  1  1  0  0     2
      # C  0  0  0  0  0  3  1     4
      

      Or use xtabs together with addmargins from base R:

      addmargins(xtabs(z ~ x + y, transform(df, z = 1)), 2)
      #   y
      #x   AA AB AC BA BB CC CD Sum
      #  A  2  1  1  0  0  0  0   4
      #  B  0  0  0  1  1  0  0   2
      #  C  0  0  0  0  0  3  1   4
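
Since every row of df counts once, the helper column `z = 1` can arguably be skipped by cross-tabulating with `table()` instead of `xtabs()`. This is a sketch of that variation rather than part of the original answer; the `counts` and `with_totals` names are introduced here for illustration.

```r
# Same sample data as in the question (only x and y are needed)
df <- data.frame(
  x = c("A","A","A","A","B","B","C","C","C","C"),
  y = c("AA","AB","AA","AC","BB","BA","CC","CC","CC","CD")
)

# Contingency table of how often each (x, y) pair occurs
counts <- table(x = df$x, y = df$y)

# margin = 2 appends a "Sum" column holding the total of each row
with_totals <- addmargins(counts, margin = 2)

print(with_totals)
```

The `Sum` column here is the same per-row total that `rowSums` computes on the wide data frame.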
      

      【Comments】:
