【Question title】: How to summarize a variable based on another column in R?
【Posted】: 2022-07-12 21:52:13
【Question】:

I have a dataset that looks like this:

  study_id weight gender
1      100     55   Male
2      200     65 Female
3      300     84 Female
4      400     59   Male
5      500     62 Female
6      600     75   Male
7      700     70   Male

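I want to find the mean, median, etc. of the weight variable (everything the summary() function gives), but separately for males and females.

In other words, I want summary statistics of the weight variable computed for males and for females separately.

How can I do this?

Reproducible data: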

data <- data.frame(
    study_id = c("100", "200", "300", "400", "500", "600", "700"),
    weight   = c("55", "65", "84", "59", "62", "75", "70"),
    gender   = c("Male", "Female", "Female", "Male", "Female", "Male", "Male")
)

【Comments】:

Tags: r mean median summary


【Solution 1】:

Although harre has a reasonable suggestion, I prefer doing it this way:

library(dplyr)

data  |>
    group_by(gender)  |>
    mutate(weight = as.numeric(weight))  |>
    summarise(
        across(weight, list(mean = mean, median = median))
    )
# # A tibble: 2 x 3
#   gender weight_mean weight_median
#   <chr>        <dbl>         <dbl>
# 1 Female        70.3          65
# 2 Male          64.8          64.5

The advantage of across() is that if you have 2 or 5 columns you can easily extend it, e.g. across(weight:height). The docs have more examples like this.
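As a rough sketch of that extension, the example below applies across() to two columns and computes the six statistics that summary() reports for a numeric vector. The height column and its values are made up for illustration and are not part of the original data, and the stats list is just one possible way to name the functions.

```r
library(dplyr)

# Data from the question, plus a hypothetical `height` column
# (invented values, only here to show across() on more than one column).
data <- data.frame(
    study_id = c("100", "200", "300", "400", "500", "600", "700"),
    weight   = c("55", "65", "84", "59", "62", "75", "70"),
    height   = c("170", "160", "165", "180", "158", "175", "172"),
    gender   = c("Male", "Female", "Female", "Male", "Female", "Male", "Male")
)

# The six statistics that summary() reports for a numeric vector.
stats <- list(
    min    = min,
    q1     = function(x) quantile(x, 0.25, names = FALSE),
    median = median,
    mean   = mean,
    q3     = function(x) quantile(x, 0.75, names = FALSE),
    max    = max
)

result <- data |>
    # Convert every column from weight to height (stored as character) to numeric.
    mutate(across(weight:height, as.numeric)) |>
    group_by(gender) |>
    # One output column per (input column, statistic) pair, e.g. weight_mean.
    summarise(across(weight:height, stats))

result
```

The result has one row per gender and columns named like weight_min, weight_q1, and so on up to height_max, so adding more columns only means widening the weight:height selection.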

【Discussion】:
