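[Posted]: 2021-06-17 10:58:40
[Question]:
I want to replace the NA values in the columns of df with the imputed values from df2, to get df3.
I can do this with left_join and coalesce, but I don't think this approach generalizes well. Is there a better way?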
library(tidyverse)
df <- tibble(c = c("a", "a", "a", "b", "b", "b"),
             d = c(1, 2, 3, 1, 2, 3),
             x = c(1, NA, 3, 4, 5, 6),
             y = c(1, 2, NA, 4, 5, 6),
             z = c(1, 2, 7, 4, 5, 6))
# I want to replace NA in df by df2
df2 <- tibble(c = c("a", "a", "a"),
              d = c(1, 2, 3),
              x = c(1, 2, 3),
              y = c(1, 2, 2))
# to get
df3 <- tibble(c = c("a", "a", "a", "b", "b", "b"),
              d = c(1, 2, 3, 1, 2, 3),
              x = c(1, 2, 3, 4, 5, 6),
              y = c(1, 2, 2, 4, 5, 6),
              z = c(1, 2, 7, 4, 5, 6))
# is there a better solution than coalesce?
df3 <- df %>%
  left_join(df2, by = c("c", "d")) %>%
  mutate(x = coalesce(x.x, x.y),
         y = coalesce(y.x, y.y)) %>%
  select(-x.x, -x.y, -y.x, -y.y)
Created on 2021-06-17 by the reprex package (v2.0.0)
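[Comments]:
-
So you are looking for a function that says "join two data frames on the specified columns, and fill in the missing entries of the first data frame from the second".
-
Better: a function that "joins two data frames on the specified columns and uses the second data frame to replace the missing entries in the first data frame's columns".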
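The behaviour described in these comments looks close to what `rows_patch()` in dplyr (1.0.0 and later) does: it matches rows on key columns and overwrites only the NA values in the first data frame. The following is a minimal sketch on the question's data, not a confirmed answer from the thread. It assumes that each `c`/`d` combination in `df2` is unique and also present in `df`, and that every column of `df2` exists in `df`. The function was marked experimental when it was introduced, so its behaviour may differ between dplyr versions.

```r
library(dplyr)  # dplyr re-exports tibble()

df <- tibble(c = c("a", "a", "a", "b", "b", "b"),
             d = c(1, 2, 3, 1, 2, 3),
             x = c(1, NA, 3, 4, 5, 6),
             y = c(1, 2, NA, 4, 5, 6),
             z = c(1, 2, 7, 4, 5, 6))

df2 <- tibble(c = c("a", "a", "a"),
              d = c(1, 2, 3),
              x = c(1, 2, 3),
              y = c(1, 2, 2))

# Match rows on the key columns c and d, then fill only the NA cells
# of df with the corresponding values from df2. Non-missing values in
# df are kept as they are.
df3 <- rows_patch(df, df2, by = c("c", "d"))
```

Because `df2` has no `z` column, `z` is left untouched, and no suffixed columns such as `x.x` or `x.y` are created, so the call does not need to change when more value columns are added to `df2`.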