[Posted]: 2018-11-30 15:17:51
[Question]:
I have a data frame dd (input at the bottom of the question):
# A tibble: 6 x 2
# Groups: Date [5]
Date keeper
<chr> <lgl>
1 1/1/2018 TRUE
2 2/1/2018 TRUE
3 3/1/2018 FALSE
4 4/1/2018 FALSE
5 3/1/2018 TRUE
6 5/1/2018 TRUE
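Note that it is already grouped by Date. I am trying to create another column that is TRUE when the group contains only one row and otherwise keeps the value of keeper. This seems simple, but when I try it I get the following result: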
dd %>% mutate(moose=ifelse(n()==1,TRUE,keeper))
# A tibble: 6 x 3
# Groups: Date [5]
Date keeper moose
<chr> <lgl> <lgl>
1 1/1/2018 TRUE TRUE
2 2/1/2018 TRUE TRUE
3 3/1/2018 FALSE FALSE
4 4/1/2018 FALSE TRUE
5 3/1/2018 TRUE FALSE
6 5/1/2018 TRUE TRUE
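Note that rows 3 and 5 share the same date, so the new column should simply carry over whatever is in keeper for them, but both came out FALSE. What am I missing?
Expected output: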
Date keeper moose
<chr> <lgl> <lgl>
1 1/1/2018 TRUE TRUE
2 2/1/2018 TRUE TRUE
3 3/1/2018 FALSE FALSE
4 4/1/2018 FALSE TRUE
5 3/1/2018 TRUE TRUE
6 5/1/2018 TRUE TRUE
(Note row 5.)
Here is the data frame as structure() output:
dd<-structure(list(Date = c("1/1/2018", "2/1/2018", "3/1/2018", "4/1/2018",
"3/1/2018", "5/1/2018"), keeper = c(TRUE, TRUE, FALSE, FALSE,
TRUE, TRUE)), class = c("grouped_df", "tbl_df", "tbl", "data.frame"
), row.names = c(NA, -6L), vars = "Date", drop = TRUE, indices = list(
0L, 1L, c(2L, 4L), 3L, 5L), group_sizes = c(1L, 1L, 2L, 1L,
1L), biggest_group_size = 2L, labels = structure(list(Date = c("1/1/2018",
"2/1/2018", "3/1/2018", "4/1/2018", "5/1/2018")), class = "data.frame", row.names = c(NA,
-5L), vars = "Date", drop = TRUE, indices = list(0L, 1L, 2L,
4L, 3L, 5L), group_sizes = c(1L, 1L, 1L, 1L, 1L, 1L), biggest_group_size = 1L, labels = structure(list(
Date = c("1/1/2018", "2/1/2018", "3/1/2018", "3/1/2018",
"4/1/2018", "5/1/2018"), keeper = c(TRUE, TRUE, FALSE, TRUE,
FALSE, TRUE)), class = "data.frame", row.names = c(NA, -6L
), vars = c("Date", "keeper"), drop = TRUE, .Names = c("Date",
"keeper")), .Names = "Date"), .Names = c("Date", "keeper"))
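Addendum:
As I kept working with this data frame, I found that if I first use add_count to create a column n and refer to that column in my ifelse instead of n(), I get the result I am looking for. What causes this? Why doesn't n() give me the same result?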
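For reference, here is a minimal sketch of the add_count workaround described in the addendum. It rebuilds the sample data with data.frame() rather than the structure() dump (my assumption, so that it runs on a recent dplyr), and the names single and res are just illustrative. The first statement also shows that base R's ifelse() returns a result shaped like its test argument, which may be relevant because n() == 1 is a single value per group while the column n has one value per row.

```r
library(dplyr)

# Base R: ifelse() returns a value with the same length as `test`.
# With a length-1 test, only the first element of `no` is used.
single <- ifelse(FALSE, TRUE, c(FALSE, TRUE))
# single is FALSE (length 1), not c(FALSE, TRUE)

# Sample data from the question, rebuilt without the old grouped_df attributes.
dd <- data.frame(
  Date = c("1/1/2018", "2/1/2018", "3/1/2018", "4/1/2018", "3/1/2018", "5/1/2018"),
  keeper = c(TRUE, TRUE, FALSE, FALSE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Workaround from the addendum: materialise the group size as a column n,
# so the test passed to ifelse() has one element per row.
res <- dd %>%
  add_count(Date) %>%
  group_by(Date) %>%
  mutate(moose = ifelse(n == 1, TRUE, keeper)) %>%
  ungroup()

res$moose
# TRUE TRUE FALSE TRUE TRUE TRUE
```

With the column n, rows 3 and 5 keep their own keeper values, which matches the expected output above.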
[Discussion]:
Tags: r if-statement dplyr