【Posted】:2019-08-08 12:22:54
【Problem description】:
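I have a medical dataset in which certain condition indicators (i.e., columns) are recorded only for some rows, although the same condition should apply to every observation belonging to the same treatment (i.e., program). Filling the NAs therefore looks straightforward, since all rows in a program are assumed to share the same value. In practice it is not: when I apply the methods recommended in earlier threads (e.g., here and here), they seem to have trouble filling string values, as the code below shows.
Is there a way to solve this?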
df_example <- data.frame(patient = c("A", "B", "C", "A", "B", "C", "A", "B", "C"),
                         status = c("Active", NA, NA, NA, "Non-Active", NA, NA, NA, "Active"),
                         condition = c(NA, "I", NA, NA, "II", "II", NA, NA, "III"),
                         program = c(1, 1, 1, 2, 2, 2, 3, 3, 3))
# I want to fill all the NA cells in columns "status" and "condition" within each program;
# the values should be the same for all observations belonging to the same program
library("dplyr")
library("zoo")
df_example %>% group_by(program) %>% transmute(status=na.locf(status, na.rm=FALSE))
# A tibble: 9 x 2
# Groups: program [3]
  program status
    <dbl> <fct>
1       1 Active
2       1 Active
3       1 Active
4       2 NA
5       2 Non-Active
6       2 Non-Active
7       3 NA
8       3 NA
9       3 Active
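The remaining NAs in the output above may come from the fill direction rather than from the string values: na.locf carries the last observation forward only, so a missing value that precedes the first non-missing value in its group has nothing to copy from. Note also that transmute keeps only the grouping column and the newly computed column, which is why patient and condition are absent from the result. A minimal sketch of one possible approach follows, assuming tidyr 1.0.0 or later is available (the tidyr package and the name df_filled are not part of the original post) and assuming each program has at most one distinct non-missing value per column:

```r
library(dplyr)
library(tidyr)  # assumption: tidyr >= 1.0.0, which supports .direction = "downup"

# Same data as in the question; stringsAsFactors = FALSE keeps the
# columns as plain character vectors regardless of the R version.
df_example <- data.frame(
  patient   = c("A", "B", "C", "A", "B", "C", "A", "B", "C"),
  status    = c("Active", NA, NA, NA, "Non-Active", NA, NA, NA, "Active"),
  condition = c(NA, "I", NA, NA, "II", "II", NA, NA, "III"),
  program   = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  stringsAsFactors = FALSE
)

# Within each program, fill downward first, then upward, so that
# leading NAs are filled as well as trailing ones.
df_filled <- df_example %>%
  group_by(program) %>%
  fill(status, condition, .direction = "downup") %>%
  ungroup()

print(df_filled)
```

Because fill modifies the named columns in place, the other columns (patient, program) are kept. If a program could contain conflicting non-missing values, this sketch would silently propagate whichever value is nearest, so that case would need separate handling.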
【Discussion】: