【Posted】: 2017-06-07 09:39:03
【Question】:
I have a large dataset that looks like this:
  V1    V2       V3   V4
1 Sleep Domestic Eat  Child Care
2 Sleep Domestic Eat  Paid
3 Sleep Domestic Eat  Child Care
4 Sleep Eat      Paid <NA>
What I would like to do is reorder the values in each row based on a "template"
["Sleep", "Eat", "Domestic", "Paid", "Child care"]
to get (output)
V1    V2  V3       V4   V5
Sleep Eat Domestic NA   Child Care
Sleep Eat Domestic Paid NA
Sleep Eat Domestic NA   Child Care
Sleep Eat NA       Paid NA
So column 1 holds Sleep, column 2 holds Eat, and so on, with NA wherever a row lacks that activity.
I don't know where to start. Any ideas?
Data
x = structure(list(V1 = c("Sleep", "Sleep", "Sleep", "Sleep"), V2 = c("Domestic",
"Domestic", "Domestic", "Eat"), V3 = c("Eat", "Eat", "Eat", "Paid"
), V4 = c("Child Care", "Paid", "Child Care", NA)), .Names = c("V1",
"V2", "V3", "V4"), row.names = c(NA, 4L), class = "data.frame")
template = c('Sleep', 'Eat', 'Domestic', 'Paid', 'Child care')
【Comments】:
-
Your capitalization doesn't match: "Child care" in the template vs. "Child Care" in the data.
-
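I'm having trouble understanding your question, so let me state what I think you're asking, and you tell me where I'm wrong, OK? Basically, each column should represent whether a value is present or not. For example,
[4,'V5'] should be "Child Care" (meaning "yes" for child care) or NA (meaning "no" for child care), and these yes/no values should be ordered within each row according to the template. Is that right?
-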
@TravisHeeter Hi, yes, that is actually another way of looking at it. I hadn't thought of it that way, but yes.
-
Expanding on @TravisHeeter's comment, something like
table(row(x), factor(as.matrix(x), template)) might be useful
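The table() idea from the comment above can be carried one step further to reach the output the question asks for. The following is only a minimal sketch, not an answer from the thread. It assumes the template's capitalization is first corrected to "Child Care" so that it matches the data (with the original 'Child care' that column would come out all NA), and the names counts, present, m and res are made up for illustration.

```r
# Data from the question, rebuilt with data.frame() for readability
x <- data.frame(
  V1 = c("Sleep", "Sleep", "Sleep", "Sleep"),
  V2 = c("Domestic", "Domestic", "Domestic", "Eat"),
  V3 = c("Eat", "Eat", "Eat", "Paid"),
  V4 = c("Child Care", "Paid", "Child Care", NA),
  stringsAsFactors = FALSE
)

# Assumption: capitalization fixed ("Child Care", not "Child care")
template <- c("Sleep", "Eat", "Domestic", "Paid", "Child Care")

# Rows of x by template entries; each cell counts how often the entry occurs
counts <- unclass(table(row(x), factor(as.matrix(x), levels = template)))

# Start from an all-NA character matrix, then write the template label
# into every cell whose count is positive
present <- counts > 0
m <- matrix(NA_character_, nrow = nrow(x), ncol = length(template))
m[present] <- template[col(counts)[present]]

# Back to a data frame; columns are named V1..V5 by default
res <- as.data.frame(m, stringsAsFactors = FALSE)
print(res)
```

Each column of res corresponds to one template entry, so the column order always follows the template regardless of where the value sat in the original row.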