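Agree with @Michael Dewar: you may want to consider using NA for missing data, unless your coding scheme or data collection gives you a compelling reason to do otherwise.
Also using the tidyverse, you can use fill from tidyr to fill in missing diet values within rows that share the same day and species.
For example: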
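library(dplyr)
library(tidyr)
# Recode the literal string 'na' as a real missing value
df$diet <- replace(df$diet, df$diet == 'na', NA)
# Within each day/species group, fill missing diet values down, then up
df %>%
  group_by(day, species) %>%
  fill(diet, .direction = "downup")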
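The recoding step can also be done inside the pipeline. The following is only a minimal sketch, assuming diet is a character column in which missing values are coded as the literal string 'na'. The toy data frame is made up for illustration and is not the data from the question; na_if comes from dplyr.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data (not from the question): 'na' marks a missing diet
toy <- data.frame(
  day     = c(1, 1, 2, 2),
  species = c("a", "a", "b", "b"),
  diet    = c("na", "red", "blue", "na")
)

filled <- toy %>%
  mutate(diet = na_if(diet, "na")) %>%    # recode the 'na' string to a real NA
  group_by(day, species) %>%
  fill(diet, .direction = "downup") %>%   # fill within each day/species group
  ungroup()                               # drop the grouping for later steps
```

The trailing ungroup() is optional; it just keeps later operations from silently running per group.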
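The sample data does not include a case where this would happen. Here is an example with different data to demonstrate, which also creates a new column, daily.diet: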
df %>%
  group_by(day, species) %>%
  mutate(daily.diet = diet) %>%
  fill(daily.diet, .direction = "downup")
Output
    day  time species diet  daily.diet
  <dbl> <dbl> <chr>   <chr> <chr>
1     1     5 a       NA    NA
2     1     6 b       NA    NA
3     1     7 c       green green
4     1     9 c       NA    green
5     2     5 c       NA    NA
6     2     7 b       blue  blue
7     3     9 a       red   red
8     3     5 a       NA    red
9     3     9 b       NA    NA
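Rows 1, 2, 5 and 9 stay NA because no row in their day/species group has a diet value to copy from. One rough way to list such groups is sketched below; the data frame restates the values from the Data section so the snippet runs on its own, and the names all_missing and unfilled are made up for this illustration.

```r
library(dplyr)

# Same values as the answer's sample data, restated so the snippet is standalone
df <- data.frame(
  day     = c(1, 1, 1, 1, 2, 2, 3, 3, 3),
  time    = c(5, 6, 7, 9, 5, 7, 9, 5, 9),
  species = c("a", "b", "c", "c", "c", "b", "a", "a", "b"),
  diet    = c(NA, NA, "green", NA, NA, "blue", "red", NA, NA)
)

# Groups in which every diet value is missing: fill() has nothing to copy there
unfilled <- df %>%
  group_by(day, species) %>%
  summarise(all_missing = all(is.na(diet)), .groups = "drop") %>%
  filter(all_missing)
```

If unfilled has any rows, those day/species combinations will still be NA after fill.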
Data
df <- structure(list(day = c(1, 1, 1, 1, 2, 2, 3, 3, 3), time = c(5,
6, 7, 9, 5, 7, 9, 5, 9), species = c("a", "b", "c", "c", "c",
"b", "a", "a", "b"), diet = c(NA, NA, "green", NA, NA, "blue",
"red", NA, NA)), row.names = c(NA, -9L), class = "data.frame")