【Posted】: 2021-03-25 12:18:39
【Question】:
Below are a sample data frame and a list that describes which columns should be encoded.
# Dataframe
DF <- data.frame(
  "genres" = c("pop", "pop", "jazz", "rock", "jazz", "blues", "rock", "pop", "blues", "pop"),
  "colors" = c("orange", "red", "red", "orange", "green", "blue", "orange", "red", "blue", "green"),
  "values" = c(12, 15, 24, 33, 47, 2, 9, 6, 89, 75),
  "genres number 12" = c("r", "r", "?", "l", "?", "r", "l", "r", "r", "r"),
  "genres number 17" = c("l", "l", "?", "r", "?", "l", "r", "l", "l", "l"),
  "colors number 3" = c("r", "l", "l", "r", "?", "r", "r", "l", "r", "?"),
  "colors number 10" = c("r", "l", "l", "r", "l", "r", "r", "l", "r", "l"),
  check.names = FALSE
)
# Encoding list
EncodingList <- list("genres number 17", "colors number 3")
names(EncodingList) <- c("colors number 3", "genres number 12")
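When a specific value is observed in one column, I want to fill it in from another column. For example, the first element of EncodingList is "genres number 17" and its name is "colors number 3". Wherever the "genres number 17" column of DF contains ?, that row should be filled with whatever value "colors number 3" holds in the same row ("r", "l", or "?"). The expected output is shown below. EncodingList is long, so it should be traversed in a loop.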
expectedDF <- data.frame(
  "genres" = c("pop", "pop", "jazz", "rock", "jazz", "blues", "rock", "pop", "blues", "pop"),
  "colors" = c("orange", "red", "red", "orange", "green", "blue", "orange", "red", "blue", "green"),
  "values" = c(12, 15, 24, 33, 47, 2, 9, 6, 89, 75),
  "genres number 12" = c("r", "r", "?", "l", "?", "r", "l", "r", "r", "r"),
  "genres number 17" = c("l", "l", "l", "r", "?", "l", "r", "l", "l", "l"),
  "colors number 3" = c("r", "l", "l", "r", "?", "r", "r", "l", "r", "r"),
  "colors number 10" = c("r", "l", "l", "r", "l", "r", "r", "l", "r", "l"),
  check.names = FALSE
)
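For illustration, the loop described above might be sketched as follows. This is only one possible reading: it assumes, based on the example list and the expected output, that each element of the list names the column to fill and that the element's name gives the column to copy from. The helper `fill_from_source`, its `missing` argument, and the small `demo` data frame are hypothetical names introduced here, not part of the original post.

```r
# Hypothetical helper (not from the original post).
# Assumption: for each entry of `encoding`, the element VALUE is the column to
# fill (target) and the element NAME is the column to copy from (source).
fill_from_source <- function(df, encoding, missing = "?") {
  for (i in seq_along(encoding)) {
    target <- encoding[[i]]
    source <- names(encoding)[i]

    # Work on character vectors so that factor columns (the default for
    # strings in R versions before 4.0) do not produce NA on assignment.
    tgt <- as.character(df[[target]])
    src <- as.character(df[[source]])

    # which() drops NA comparisons, so NA cells are left untouched.
    rows <- which(tgt == missing)
    tgt[rows] <- src[rows]

    # Note: the target column is stored back as a character vector.
    df[[target]] <- tgt
  }
  df
}

# Small made-up data frame for demonstration only.
demo <- data.frame(
  "genres number 12" = c("r", "?", "r"),
  "genres number 17" = c("?", "?", "l"),
  "colors number 3" = c("l", "?", "?"),
  check.names = FALSE
)
encoding <- list("genres number 17", "colors number 3")
names(encoding) <- c("colors number 3", "genres number 12")

result <- fill_from_source(demo, encoding)
print(result)
```

Entries are applied in list order, so a later entry sees any values filled in by an earlier one. A row stays at `?` when the source column also holds `?` in that row, which is consistent with row 5 of the expected output.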
【Discussion】: