【Posted】: 2019-03-06 17:38:35
【Question】:
I have a dataset like this:
df.in <-structure(list(id = c(1, 1, 2, 3), x1 = c(0, 1, NA, 0), x2 = c("Lorem ipsum dolor sit amet",
"dolore eu fugiat nulla pariatur", "Sed ut perspiciatis unde omnis",
"Nemo enim ipsam voluptatem"), x3 = c("Donec ullamcorper elit quis risus",
"Donec ullamcorper elit quis risus", "Curabitur euismod", "Mauris felis orci"
)), row.names = c(NA, -4L), class = c("tbl_df", "tbl", "data.frame"
))
> df.in
# A tibble: 4 x 4
id x1 x2 x3
<dbl> <dbl> <chr> <chr>
1 1 0 Lorem ipsum dolor sit amet Donec ullamcorper elit quis risus
2 1 1 dolore eu fugiat nulla pariatur Donec ullamcorper elit quis risus
3 2 NA Sed ut perspiciatis unde omnis Curabitur euismod
4 3 0 Nemo enim ipsam voluptatem Mauris felis orci
I am trying to use dplyr::group_by() to get this:
df.out <- structure(list(id = c(1, 2, 3), x1 = c(1, NA, 0), x2 = c("dolore eu fugiat nulla pariatur",
"Sed ut perspiciatis unde omnis", "Nemo enim ipsam voluptatem"
), x3 = c("Donec ullamcorper elit quis risus", "Curabitur euismod",
"Mauris felis orci")), row.names = c(NA, -3L), class = c("tbl_df",
"tbl", "data.frame"))
> df.out
# A tibble: 3 x 4
id x1 x2 x3
<dbl> <dbl> <chr> <chr>
1 1 1 dolore eu fugiat nulla pariatur Donec ullamcorper elit quis risus
2 2 NA Sed ut perspiciatis unde omnis Curabitur euismod
3 3 0 Nemo enim ipsam voluptatem Mauris felis orci
I can do this:
library(dplyr)
df.in %>%
  group_by(id) %>%
  summarise(x1 = max(x1))
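But how do I:
- summarise x2 and x3 so that they keep the values from the row where max(x1) occurs?
- apply the same logic to several other x columns? Is there a way to do this with a summarize_all?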
【Discussion】:
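One possible approach, offered as a sketch rather than a definitive answer: instead of summarising each column separately, keep the whole row in which x1 is largest within each id. The sketch below sorts by id and descending x1, then takes the first row per group. It relies on arrange() placing NA values last, so a group whose only x1 is NA (id 2 here) still keeps its row. It assumes that when several rows tie for the maximum, keeping any one of them is acceptable.

```r
library(dplyr)

# Same data as df.in in the question, rebuilt with tibble() for brevity
df.in <- tibble(
  id = c(1, 1, 2, 3),
  x1 = c(0, 1, NA, 0),
  x2 = c("Lorem ipsum dolor sit amet", "dolore eu fugiat nulla pariatur",
         "Sed ut perspiciatis unde omnis", "Nemo enim ipsam voluptatem"),
  x3 = c("Donec ullamcorper elit quis risus", "Donec ullamcorper elit quis risus",
         "Curabitur euismod", "Mauris felis orci")
)

df.out <- df.in %>%
  arrange(id, desc(x1)) %>%  # largest x1 first within each id; NA sorts last
  group_by(id) %>%
  slice(1) %>%               # keep the first (max-x1) row of each group
  ungroup()

df.out
```

Because entire rows are kept, x2, x3, and any further columns come along without being listed, so no summarize_all is needed. Note that slice(which.max(x1)) would be shorter but drops groups where x1 is all NA, since which.max() returns an empty result for them.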
Tags: r group-by dplyr tidyverse