【Posted】: 2023-03-12 16:55:01
【Question】:
In the starwars dataset, I have two variables, gender and sex, whose values are repeated across rows.
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
data(starwars)
A <- starwars %>% select(gender, sex) %>% arrange(gender, sex)
A %>% group_by(gender, sex) %>% count()
# A tibble: 6 x 3
# Groups:   gender, sex [6]
  gender    sex                n
  <chr>     <chr>          <int>
1 feminine  female            16
2 feminine  none               1
3 masculine hermaphroditic     1
4 masculine male              60
5 masculine none               5
6 NA        NA                 4
A <- starwars %>% select(gender, sex) %>% arrange(gender, sex); print(A)
#> # A tibble: 87 x 2
#>    gender   sex
#>    <chr>    <chr>
#>  1 feminine female
#>  2 feminine female
#>  3 feminine female
#>  4 feminine female
#>  5 feminine female
#>  6 feminine female
#>  7 feminine female
#>  8 feminine female
#>  9 feminine female
#> 10 feminine female
#> # ... with 77 more rows
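In the table above, I want to number the distinct sex values within each gender. Every "feminine-female" pair should get a count of 1 and every "feminine-none" pair a count of 2; every masculine-hermaphroditic pair should get 1, masculine-male 2, and masculine-none 3; and the NA-NA pairs should get 1.
The following is not a solution, and it is not what I want.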
A %>%
  group_by(gender, sex) %>%
  mutate(n_dupe = seq(n()))
# Groups:   gender, sex [6]
   gender   sex    n_dupe
   <chr>    <chr>   <int>
 1 feminine female      1
 2 feminine female      2
 3 feminine female      3
 4 feminine female      4
 5 feminine female      5
 6 feminine female      6
 7 feminine female      7
 8 feminine female      8
 9 feminine female      9
10 feminine female     10
A %>%
  group_by(gender, sex) %>%
  mutate(n_dupe = seq(n())) %>%
  summarize(min(n_dupe), max(n_dupe))
`summarise()` has grouped output by 'gender'. You can override using the `.groups` argument.
# A tibble: 6 x 4
# Groups:   gender [3]
  gender    sex            `min(n_dupe)` `max(n_dupe)`
  <chr>     <chr>                  <int>         <int>
1 feminine  female                     1            16
2 feminine  none                       1             1
3 masculine hermaphroditic             1             1
4 masculine male                       1            60
5 masculine none                       1             5
6 NA        NA                         1             4
Update
Instead, I want the data to look like this:
   gender    sex            count
   <chr>     <chr>
 1 feminine  female             1
 2 feminine  female             1
 3 feminine  female             1
 4 feminine  female             1
 5 feminine  female             1
 6 feminine  female             1
 7 feminine  female             1
 8 feminine  female             1
 9 feminine  female             1
10 feminine  female             1
11 feminine  female             1
12 feminine  female             1
13 feminine  female             1
14 feminine  female             1
15 feminine  female             1
16 feminine  female             1
17 feminine  none               2
18 masculine hermaphroditic     1
19 masculine male               2
20 masculine male               2
... ...
76 masculine male               2
77 masculine male               2
78 masculine male               2
79 masculine none               3
80 masculine none               3
81 masculine none               3
82 masculine none               3
83 masculine none               3
84 NA        NA                 1
85 NA        NA                 1
86 NA        NA                 1
87 NA        NA                 1
The summary of that data should look like this:
# Groups:   gender [3]
  gender    sex            `min(count)` `max(count)`
  <chr>     <chr>                 <int>        <int>
1 feminine  female                    1            1
2 feminine  none                      2            2
3 masculine hermaphroditic            1            1
4 masculine male                      2            2
5 masculine none                      3            3
6 NA        NA                        1            1
Partially created on 2021-06-02 by the reprex package (v1.0.0)
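One possible way to get the numbering described above, offered as a sketch rather than a definitive answer: group by gender only, then give each row the position of its sex value among the distinct sex values of that gender. The names `numbered`, `min_count` and `max_count` are illustrative and do not come from the question. The sketch assumes the rows are already sorted with `arrange(gender, sex)`, so that "order of first appearance" matches the sorted order of sex.

```r
library(dplyr)

# Same starting point as in the question: two columns, sorted.
A <- starwars %>%
  select(gender, sex) %>%
  arrange(gender, sex)

# Within each gender, number the distinct sex values in order of appearance.
# match() treats NA as equal to NA, so rows with a missing sex are numbered
# like any other value instead of ending up as NA.
numbered <- A %>%
  group_by(gender) %>%
  mutate(count = match(sex, unique(sex))) %>%
  ungroup()

# One row per gender-sex pair, for comparison with the desired summary.
numbered %>%
  group_by(gender, sex) %>%
  summarise(min_count = min(count), max_count = max(count), .groups = "drop")
```

`dense_rank(sex)` inside the same `group_by(gender)` would be a natural alternative, but dplyr's ranking functions leave missing values as `NA`, so the NA-NA rows would not receive a 1 without extra handling.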
【Comments】: