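If it is always the first name that causes the problem, you can use a regular expression to strip it. Note that I build the data frames with stringsAsFactors = FALSE, so the names are characters rather than factors.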
df1 <- data.frame(name = "RANDI FIRAT CAYLIOGLU", correct = 30, stringsAsFactors = FALSE)
df2 <- data.frame(name = "FIRAT CAYLIOGLU", id = 1, stringsAsFactors = FALSE)
library(dplyr)
df1 %>%
  # drop the leading word (the first name) and the space after it
  mutate(name2 = sub("^[A-Za-z]+ ", "", name)) %>%
  # match the shortened name against df2$name
  full_join(df2, by = c("name2" = "name"))
                   name correct           name2 id
1 RANDI FIRAT CAYLIOGLU      30 FIRAT CAYLIOGLU  1
If the extra word can also be a middle name, you can create an additional column, name3, that contains only the first and last name:
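library(dplyr)
df1 %>%
  mutate(name2 = sub("^[A-Za-z]+ ", "", name),      # middle + last name
         name3 = sub(" [A-Za-z]+ ", " ", name)) %>% # first + last name
  left_join(df2, by = c("name2" = "name")) %>%
  # df2's id is joined a second time, so the result has id.x and id.y
  left_join(df2, by = c("name3" = "name"))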
Here, name2 is the middle and last name, and name3 contains the first and last name.
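To see what the two patterns actually do, here is a minimal base R sketch that applies them to the sample name from the example. The variable `full_name` is introduced only for illustration, and the sketch assumes names consist of ASCII letters separated by single spaces.

```r
# Sample name taken from the example data (full_name is an illustrative variable)
full_name <- "RANDI FIRAT CAYLIOGLU"

# name2 pattern: drop the first word and the space that follows it
name2 <- sub("^[A-Za-z]+ ", "", full_name)

# name3 pattern: replace the first word that sits between two spaces
# (the middle name) with a single space
name3 <- sub(" [A-Za-z]+ ", " ", full_name)

print(name2)  # "FIRAT CAYLIOGLU"
print(name3)  # "RANDI CAYLIOGLU"
```

Because `[A-Za-z]` only matches unaccented letters, names containing accented characters or hyphens would probably need a broader class, for example `[^ ]+` in place of `[A-Za-z]+`.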