Data rearrangement / similar to a pivot table? (answers)

【Question title】: Data rearrangement / similar to a pivot table?
【Posted】: 2016-04-10 18:42:56
【Question description】:

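I am struggling with a data rearrangement problem. The data below contain agreements (rows) that either collapsed or remained stable (column "collapse"), and provisions that were reduced, kept, added, or absent (columns "diff.pps_leadership", "diff.pps_cabinet", etc.).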

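I want to rearrange the data to get an overview of how many of the agreements in which a particular provision was reduced, kept, or added have collapsed. The rows should be the provisions (diff.pps_leadership ...), and the columns should be "reduced", "kept", and "added". Each cell should contain the percentage of agreements that collapsed, relative only to the agreements in which that provision was reduced, kept, or added, not relative to the total.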

In Excel I would do this with a pivot table, but I cannot get there with R. I tried the cast, aggregate, melt, and transpose commands, without success.

Ultimately, the result should look similar to this: https://docs.google.com/spreadsheets/d/1yhIbvTQTYkkwSFVxWEnPwvSvwTc0vuTYZxa15Eh1lT8/edit?usp=sharing

I hope my question is not too specific. Thanks for any hints or suggestions.

example <- structure(list(Agreement = structure(c(8L, 4L, 6L, 9L, 2L, 3L, 
7L, 10L, 5L, 1L), .Label = c("Abuja Agreement", "Accra Peace Agreement", 
"Arusha Agreement", "Arusha/Global Ceasefire Agreement", "Comprehensive Peace Agreement", 
"InterabsentCongolese Dialogue", "Lome Agreement", "Lusaka Protocol", 
"Ouagadougou Agreement", "Tansitional Constituion"), class = "factor"), 
    diff.pps_cabinet = structure(c(2L, 1L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L), .Label = c("kept", "reduced"), class = "factor"), 
    diff.pps_leadership = structure(c(1L, 2L, 3L, 3L, 3L, 3L, 
    3L, 3L, 2L, 3L), .Label = c("absent", "kept", "reduced"), class = "factor"), 
    diff.mps_milcmd = structure(c(3L, 2L, 3L, 3L, 3L, 3L, 1L, 
    3L, 2L, 3L), .Label = c("absent", "kept", "reduced"), class = "factor"), 
    diff.mps_armyint = structure(c(3L, 2L, 2L, 3L, 3L, 3L, 1L, 
    3L, 2L, 3L), .Label = c("absent", "kept", "reduced"), class = "factor"), 
    diff.eps_commission = structure(c(1L, 1L, 1L, 1L, 3L, 1L, 
    3L, 1L, 2L, 3L), .Label = c("absent", "kept", "reduced"), class = "factor"), 
    diff.eps_company = structure(c(1L, 2L, 1L, 1L, 3L, 1L, 1L, 
    1L, 2L, 3L), .Label = c("absent", "kept", "reduced"), class = "factor"), 
    diff.veto_leg = structure(c(1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L), .Label = c("absent", "added"), class = "factor"), 
    diff.tps_devolution = structure(c(2L, 1L, 2L, 3L, 1L, 1L, 
    1L, 2L, 2L, 1L), .Label = c("absent", "kept", "reduced"), class = "factor"), 
    diff.ca.psh = structure(c(3L, 2L, 1L, 1L, 4L, 1L, 1L, 1L, 
    4L, 1L), .Label = c("absent", "added", "kept", "reduced"), class = "factor"), 
    collapse = structure(c(1L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 
    1L), .Label = c("collapse", "stable"), class = "factor")), .Names = c("Agreement", 
"diff.pps_cabinet", "diff.pps_leadership", "diff.mps_milcmd", 
"diff.mps_armyint", "diff.eps_commission", "diff.eps_company", 
"diff.veto_leg", "diff.tps_devolution", "diff.ca.psh", "collapse"
), class = "data.frame", row.names = c(NA, -10L))

【Question discussion】:

@akrun, it is just the hyphen they used in <- that caused the error.

Tags: r data.table aggregate reshape2 melt

【Solution 1】:

The following gets the job done.

library(data.table)
setDT(example)

mvs <- c("diff.pps_cabinet", "diff.pps_leadership", 
         "diff.mps_milcmd", "diff.mps_armyint")

vls <- c("reduced", "kept", "added", "absent")

melt(example, c("Agreement", "collapse"), mvs
     )[ , setNames(vapply(
       vls, function(vv) list(paste0(
         s <- sum(collapse[idx <- value == vv] == "collapse"), 
         " out of ", sum(idx), " = ", floor(100 * s / sum(idx)), "% collapsed"),
         paste(Agreement[idx], collapse = "\n")),
       vector("list", 2)),
       paste0(rep(vls, each = 2),
              c(".percent", ".names"))), by = variable]

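The code above currently prints NaN when no agreement has a given status, because the denominator is then zero; to fix this, replace sum(idx) in the denominator with (if (!any(idx)) 1 else sum(idx)).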
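To see why the guard helps, here is a minimal base R sketch of the empty case. The vector `idx` and the count `s` are hypothetical stand-ins for what the answer's code computes inside one group; they are not taken from the question's data.

```r
# Hypothetical case: no agreement in the group has the status being counted
idx <- c(FALSE, FALSE, FALSE)
s <- 0  # collapsed agreements among the (zero) matching ones

# Without the guard the denominator is 0, and 0 / 0 is NaN in R
unguarded <- floor(100 * s / sum(idx))

# With the guard the denominator falls back to 1, so the result is 0
guarded <- floor(100 * s / (if (!any(idx)) 1 else sum(idx)))

print(unguarded)
print(guarded)
```

Whether an empty cell should read 0% or be left blank is a presentation choice; returning NA instead of dividing by 1 would be another reasonable option.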

【Discussion】:

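Thank you very much for your effort! This is already very close to what I am looking for. Unfortunately, the percentages and the numbers of observations in the cells are not what I wanted. For example, the cell diff.pps_cabinet / reduced.percent currently reads "9 / 10" but should read "5 / 9": 9 (out of all 10) were reduced, and 5 of those 9 collapsed.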
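The "5 out of 9" figure can be checked by hand with a short base R sketch. The two vectors below are the decoded values of diff.pps_cabinet and collapse from the dput() output in the question, in row order; the variable names `cabinet`, `n_status`, `n_collapsed`, and `cell` are only illustrative.

```r
# Decoded from the question's data: only the second agreement kept the provision
cabinet <- c("reduced", "kept", rep("reduced", 8))
collapse <- c("collapse", "stable", "stable", "collapse", "stable",
              "collapse", "collapse", "stable", "stable", "collapse")

idx <- cabinet == "reduced"
n_status <- sum(idx)                             # agreements where the provision was reduced
n_collapsed <- sum(collapse[idx] == "collapse")  # of those, how many collapsed

# Same cell format as in the answer: count, denominator, floored percentage
cell <- paste0(n_collapsed, " out of ", n_status, " = ",
               floor(100 * n_collapsed / n_status), "% collapsed")
print(cell)
```

The denominator is the number of agreements with that status, not the total number of rows, which is the distinction the comment is asking for.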
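Very nice, thank you very much. The only remaining issue is that I only want the names of the agreements that collapsed, not all of them. If I understand correctly, this relates to the expression paste(Agreement[idx], collapse = "\n"), which needs a condition that keeps only the agreements that collapsed. I think Agreement[idx
At this point, the fix for that should be clear. I suggest you work it out from here.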
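One possible way to restrict the names to collapsed agreements, sketched in base R, is to combine the status index with a second condition on the collapse column. This is an assumption about what the commenter had in mind, not the answerer's confirmed fix. The vectors hold the first four rows of the question's data; note that the `collapse` argument of paste() is unrelated to the `collapse` vector.

```r
# First four rows of the question's data, decoded from the dput() output
Agreement <- c("Lusaka Protocol", "Arusha/Global Ceasefire Agreement",
               "InterabsentCongolese Dialogue", "Ouagadougou Agreement")
cabinet <- c("reduced", "kept", "reduced", "reduced")
collapse <- c("collapse", "stable", "stable", "collapse")

idx <- cabinet == "reduced"

# All agreements with the status, as in the answer's code
all_names <- paste(Agreement[idx], collapse = "\n")

# Only those that also collapsed: add the second condition with &
collapsed_names <- paste(Agreement[idx & collapse == "collapse"], collapse = "\n")

cat(collapsed_names, "\n")
```

Inside the data.table call the same logical expression would go in place of idx within paste(), since Agreement, value, and collapse are all columns of the melted table there.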