【Question title】: Group by and conditional summarize in R
【Posted】: 2020-10-02 04:19:40
【Question】:

My code is messy. If a group's total is less than 2, its name should become "unpopular".

df <- data.frame(vote=c("A","A","A","B","B","B","B","B","B","C","D"),
           val=c(rep(1,11))
           )

library(dplyr)
df %>% group_by(vote) %>% summarise(val=sum(val))

Output:

  vote    val
  <fct> <dbl>
1 A         3
2 B         6
3 C         1
4 D         1

But I need

  vote    val
  <fct> <dbl>
1 A         3
2 B         6
3 unpopular 2

My idea is

df2 <- df %>% group_by(vote) %>% summarise(val=sum(val))
df2$vote[df2$val < 2] <- "unpop"
df2 %>% group_by....

That is not elegant.

Do you know a neat, useful function for this?

【Comments】:

    Tags: r dplyr


    【Solution 1】:

    We can group twice: first sum val by vote, then regroup with the small groups relabeled and sum again.

    library(dplyr)
    df %>% 
        group_by(vote) %>% 
        summarise(val=sum(val)) %>%
        group_by(vote = replace(vote, val <2, 'unpop')) %>% 
        summarise(val = sum(val))
    

    -Output

    # A tibble: 3 x 2
    # vote    val
    #  <chr> <dbl>
    #1 A         3
    #2 B         6
    #3 unpop     2
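
The question's output shows vote as `<fct>`, while the output above shows `<chr>`. As a minimal sketch, assuming vote may arrive as a factor (as it would with `stringsAsFactors = TRUE`), converting it to character first avoids `replace()` producing `NA` for a level that does not exist yet. The use of `if_else()` and the name `result` are choices made for this illustration, not part of the original answer.

```r
library(dplyr)

# Same data as in the question, with vote forced to a factor
# (assumption: this mimics the <fct> column in the question's output)
df <- data.frame(
  vote = c("A","A","A","B","B","B","B","B","B","C","D"),
  val  = rep(1, 11),
  stringsAsFactors = TRUE
)

result <- df %>%
  mutate(vote = as.character(vote)) %>%            # factor -> character
  group_by(vote) %>%
  summarise(val = sum(val)) %>%                    # first pass: total per vote
  mutate(vote = if_else(val < 2, "unpop", vote)) %>%  # relabel small groups
  group_by(vote) %>%
  summarise(val = sum(val))                        # second pass: merge relabeled groups

result
```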
    

    Or another option with rowsum

    df %>% 
       group_by(vote = replace(vote, vote %in% 
         names(which((rowsum(val, vote) < 2)[,1])), 'unpopular')) %>% 
       summarise(val = sum(val))
    

    Or use fct_lump_n from forcats, which keeps the 2 most frequent levels and lumps the rest together

    library(forcats)
    df %>% 
      group_by(vote = fct_lump_n(vote, 2, other_level = "unpop")) %>%
      summarise(val = sum(val))
    # A tibble: 3 x 2
    #  vote    val
    #  <fct> <dbl>
    #1 A         3
    #2 B         6
    #3 unpop     2
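
Note that `fct_lump_n` keeps a fixed number of levels rather than applying the "less than 2" threshold from the question. If the threshold itself matters, forcats also provides `fct_lump_min`, which lumps levels whose (optionally weighted) count falls below `min`. The sketch below assumes a forcats version that includes `fct_lump_min` (0.5.0 or later) and passes `w = val` so the threshold applies to the sum of val rather than the number of rows; the name `result` is only for illustration.

```r
library(dplyr)
library(forcats)

df <- data.frame(
  vote = c("A","A","A","B","B","B","B","B","B","C","D"),
  val  = rep(1, 11)
)

result <- df %>%
  # lump every level whose weighted count (sum of val) is below 2
  group_by(vote = fct_lump_min(vote, min = 2, w = val, other_level = "unpop")) %>%
  summarise(val = sum(val))

result
```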
    

    Or use table

    df %>%
       group_by(vote = replace(vote, 
          vote %in% names(which(table(vote) < 2)), 'unpop'))  %>%
       summarise(val = sum(val))
    

    【Comments】:

      【Solution 2】:

      If you want to do this in base R, based on the sum of val for each vote, you can do:

      aggregate(val~vote, transform(aggregate(val~vote, df, sum), 
                vote = replace(vote, val < 2, 'unpop')), sum)
      
      #   vote val
      #1     A   3
      #2     B   6
      #3 unpop   2
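
The nested call above can be hard to read at a glance. The same steps, unpacked one at a time, might look like the sketch below. It assumes vote is a character column (set explicitly here with `stringsAsFactors = FALSE`), and the intermediate names `totals` and `result` are introduced only for this illustration.

```r
df <- data.frame(
  vote = c("A","A","A","B","B","B","B","B","B","C","D"),
  val  = rep(1, 11),
  stringsAsFactors = FALSE
)

# Step 1: total val per vote
totals <- aggregate(val ~ vote, data = df, FUN = sum)

# Step 2: relabel votes whose total is below 2
totals$vote <- replace(totals$vote, totals$val < 2, "unpop")

# Step 3: aggregate again so the relabeled rows are merged
result <- aggregate(val ~ vote, data = totals, FUN = sum)

result
```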
      

      【Comments】:
