【Posted】: 2014-04-21 15:19:39
【Question】:
Say I have two tables, name and age, that look like this:
> name
    key   name
1 a,b,c   jack
2     d daniel
3     e    foo
4   f,g    bar
> age
  key age
1   b  13
2   d  21
3   e  24
4   k  34
5   f 100
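I am trying to join the two tables on the key column, which exists in both. The challenge is that the key column in the name table is not normalized: a single cell can hold several comma-separated keys. What is the best way to combine the two tables so that every row of the name table is present, and unchanged, in the joined table (as in the "res" table below)?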
> res
    key   name age
1 a,b,c   jack  13
2     d daniel  21
3     e    foo  24
4   f,g    bar 100
Here is the necessary table information:
> dput(name)
structure(list(key = structure(1:4, .Label = c("a,b,c", "d",
"e", "f,g"), class = "factor"), name = structure(c(4L, 2L, 3L,
1L), .Label = c("bar", "daniel", "foo", "jack"), class = "factor")), .Names = c("key",
"name"), class = "data.frame", row.names = c(NA, -4L))
> dput(age)
structure(list(key = structure(c(1L, 2L, 3L, 5L, 4L), .Label = c("b",
"d", "e", "f", "k"), class = "factor"), age = c(13L, 21L, 24L,
34L, 100L)), .Names = c("key", "age"), class = "data.frame", row.names = c(NA,
-5L))
> dput(res)
structure(list(key = structure(1:4, .Label = c("a,b,c", "d",
"e", "f,g"), class = "factor"), name = structure(c(4L, 2L, 3L,
1L), .Label = c("bar", "daniel", "foo", "jack"), class = "factor"),
age = c(13L, 21L, 24L, 100L)), .Names = c("key", "name",
"age"), class = "data.frame", row.names = c(NA, -4L))
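For illustration, here is a minimal base R sketch of one way the desired join could be approached. It is not a definitive answer. It assumes that when a composite key matches more than one row of age, the first match wins (the question does not say how that case should be resolved), and it rebuilds the two tables with character columns instead of factors. The names parts, long, and first_age are hypothetical helpers introduced only for this example.

```r
# Rebuild the two example tables (same values as the dput output above,
# but with character columns instead of factors).
name <- data.frame(key = c("a,b,c", "d", "e", "f,g"),
                   name = c("jack", "daniel", "foo", "bar"),
                   stringsAsFactors = FALSE)
age <- data.frame(key = c("b", "d", "e", "k", "f"),
                  age = c(13L, 21L, 24L, 34L, 100L),
                  stringsAsFactors = FALSE)

# Split each composite key into its individual keys.
parts <- strsplit(as.character(name$key), ",", fixed = TRUE)

# Long lookup table: one row per (original row index, single key) pair.
long <- data.frame(row = rep(seq_along(parts), lengths(parts)),
                   key = unlist(parts),
                   stringsAsFactors = FALSE)

# Look up the age for every single key; keys absent from `age` give NA.
long$age <- age$age[match(long$key, age$key)]

# For each original row, keep the first non-missing age (NA if none match).
first_age <- vapply(split(long$age, long$row),
                    function(a) a[!is.na(a)][1],
                    integer(1))

# Attach the result to an untouched copy of `name`, so the original
# key strings and row order are preserved.
res <- name
res$age <- unname(first_age[as.character(seq_len(nrow(name)))])
print(res)
```

With the example data, res$age comes out as 13, 21, 24, 100, which matches the desired res table. Row 4 of age (key k) is dropped because no row of name refers to it.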
【Discussion】: