【Question title】: How to group by multiple columns in R?
【Posted】: 2021-12-24 09:47:27
【Question description】:

I need to group my data by three columns: gender, year, and employment status.

Here is my data:

ID <- c(1000, 1000, 1000, 1001, 1001, 1001, 1001, 1001, 1002, 1002, 1002, 1002, 1002)
Gender <- as.factor(c("M","M","M","M","M","M","M","M","F","F","F","F","F"))
Employment_status <- as.factor(c("Other","Other","Other","Employed","Employed","Employed","Employed","Employed","Employed","Employed","Employed","Employed","Unemployed"))
Year <- c(2016, 2017, 2018, 2016, 2017, 2018, 2019, 2020, 2016, 2017, 2018, 2019, 2020)

my_data <- data.frame(ID, Gender, Employment_status, Year, stringsAsFactors = FALSE)

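I would like my final result to be a table of employment rates broken down by gender and year. How can I achieve this in R?

The expected output would look like this:

Thanks!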

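【Question discussion】:

  • Could you include the expected output?
  • I have added the expected output! It could take a different form, but these are the percentages I am looking for!

Tags: r dplyr grouping crosstab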


【Solution 1】:

In base R you can do this:

ftable(prop.table(table(my_data[-1]), c(1, 3)), col.vars = c("Gender", "Employment_status"))


     Gender                   F                         M                 
     Employment_status Employed Other Unemployed Employed Other Unemployed
Year                                                                      
2016                        1.0   0.0        0.0      0.5   0.5        0.0
2017                        1.0   0.0        0.0      0.5   0.5        0.0
2018                        1.0   0.0        0.0      0.5   0.5        0.0
2019                        1.0   0.0        0.0      1.0   0.0        0.0
2020                        0.0   0.0        1.0      1.0   0.0        0.0
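
The one-liner above chains three base R steps: count, normalise, flatten. A minimal sketch that unpacks them, rebuilding the question's data so it runs on its own; the intermediate names `counts` and `shares` are illustrative and not part of the original answer:

```r
# Rebuild the example data from the question
ID <- c(1000, 1000, 1000, 1001, 1001, 1001, 1001, 1001, 1002, 1002, 1002, 1002, 1002)
Gender <- as.factor(c("M","M","M","M","M","M","M","M","F","F","F","F","F"))
Employment_status <- as.factor(c("Other","Other","Other","Employed","Employed","Employed",
                                 "Employed","Employed","Employed","Employed","Employed",
                                 "Employed","Unemployed"))
Year <- c(2016, 2017, 2018, 2016, 2017, 2018, 2019, 2020, 2016, 2017, 2018, 2019, 2020)
my_data <- data.frame(ID, Gender, Employment_status, Year)

# Step 1: three-way table of counts, dropping the ID column.
# Dimensions follow the column order: Gender x Employment_status x Year.
counts <- table(my_data[-1])

# Step 2: divide by the totals over dimensions 1 and 3, so the
# statuses sum to 1 within every Gender x Year combination.
shares <- prop.table(counts, margin = c(1, 3))

# Step 3: flatten the 3-d array into a 2-d display with Year in rows.
ftable(shares, col.vars = c("Gender", "Employment_status"))

# Individual values can be pulled out by dimension name:
shares["M", "Employed", "2016"]  # 0.5
```

Choosing `margin = c(1, 3)` is what makes each row of the flattened table sum to 1 within a gender; a different margin would normalise over a different grouping.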

【Discussion】:

    【Solution 2】:

    Is this roughly what you are after?

    library(dplyr)
    
    
    my_data %>% 
      group_by(Gender, Year) %>% 
      count(Employment_status) %>% 
      summarise(sum(n)) %>% 
      arrange(Year)
    

    Output:

       Gender  Year `sum(n)`
       <fct>  <dbl>    <int>
     1 F       2016        1
     2 M       2016        2
     3 F       2017        1
     4 M       2017        2
     5 F       2018        1
     6 M       2018        2
     7 F       2019        1
     8 M       2019        1
     9 F       2020        1
    10 M       2020        1
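
The pipeline above returns the number of rows per Gender and Year rather than a rate. If "employment rate" is taken to mean the share of rows with `Employment_status == "Employed"` within each Gender and Year (an assumption, since the question's expected output is not reproduced here), a dplyr sketch could look like the following. The names `rates` and `shares` are hypothetical:

```r
library(dplyr)

# Rebuild the example data from the question
ID <- c(1000, 1000, 1000, 1001, 1001, 1001, 1001, 1001, 1002, 1002, 1002, 1002, 1002)
Gender <- as.factor(c("M","M","M","M","M","M","M","M","F","F","F","F","F"))
Employment_status <- as.factor(c("Other","Other","Other","Employed","Employed","Employed",
                                 "Employed","Employed","Employed","Employed","Employed",
                                 "Employed","Unemployed"))
Year <- c(2016, 2017, 2018, 2016, 2017, 2018, 2019, 2020, 2016, 2017, 2018, 2019, 2020)
my_data <- data.frame(ID, Gender, Employment_status, Year)

# Share of "Employed" rows per Gender x Year.
# Assumption: this share is what the question means by "employment rate".
rates <- my_data %>%
  group_by(Gender, Year) %>%
  summarise(
    n = n(),                                                  # rows in the group
    employment_rate = mean(Employment_status == "Employed"),  # fraction employed
    .groups = "drop"
  ) %>%
  arrange(Year, Gender)

# Full breakdown: share of every status within each Gender x Year.
shares <- my_data %>%
  count(Gender, Year, Employment_status) %>%
  group_by(Gender, Year) %>%
  mutate(share = n / sum(n)) %>%
  ungroup()

rates
```

Taking the mean of a logical vector gives the fraction of `TRUE` values, which avoids a separate count-and-divide step when only one status is of interest.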
    

    【Discussion】:

    • If the question is unclear, please do not answer it; flag it for closure instead.