【Question Title】:data rearrangement / similar to pivot table?
【Posted】:2016-04-10 18:42:56
【Question Description】:

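I am struggling with a data rearrangement problem. The data below contain agreements (rows) that either collapsed or remained stable (column "collapse") and provisions that were reduced, kept, added, or absent (columns "diff.pps_leadership", "diff.pps_cabinet", etc.).

I would like to rearrange the data to get an overview of how many of the agreements in which a particular provision was reduced, kept, or added have collapsed. The rows should be the provisions (diff.pps_leadership ...), and the columns should be "reduced", "kept", and "added". Each cell should contain the percentage of collapsed agreements, computed only among the agreements in which that provision was reduced, kept, or added, not among all agreements.

In Excel, I would do this with a pivot table, but I cannot get there with R. I tried the cast, aggregate, melt, and transpose commands, without success.

Ultimately, the result should look similar to this: https://docs.google.com/spreadsheets/d/1yhIbvTQTYkkwSFVxWEnPwvSvwTc0vuTYZxa15Eh1lT8/edit?usp=sharing

I hope my question is not too specific. Thanks for any hints or suggestions.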
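To make the intended calculation concrete, here is a minimal sketch on made-up data. The vectors `provision` and `outcome` are hypothetical and not part of the real data set. It uses base R's `table` and `prop.table` for a single provision only, so it is not the full rearrangement asked for.

```r
# Hypothetical toy data: one provision column and the outcome column.
provision <- factor(c("reduced", "kept", "reduced", "reduced", "added"),
                    levels = c("reduced", "kept", "added"))
outcome   <- c("collapse", "stable", "stable", "collapse", "stable")

# Row-wise shares: each row (reduced / kept / added) sums to 1, so the
# "collapse" column is the share collapsed within that category only.
shares <- prop.table(table(provision, outcome), margin = 1)

# Percentage collapsed per category, rounded to whole percent.
pct <- round(100 * shares[, "collapse"])
pct
```

Using `margin = 1` is what makes the percentage relative to the agreements in each category rather than to the total.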

example <- structure(list(Agreement = structure(c(8L, 4L, 6L, 9L, 2L, 3L, 
7L, 10L, 5L, 1L), .Label = c("Abuja Agreement", "Accra Peace Agreement", 
"Arusha Agreement", "Arusha/Global Ceasefire Agreement", "Comprehensive Peace Agreement", 
"InterabsentCongolese Dialogue", "Lome Agreement", "Lusaka Protocol", 
"Ouagadougou Agreement", "Tansitional Constituion"), class = "factor"), 
    diff.pps_cabinet = structure(c(2L, 1L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L), .Label = c("kept", "reduced"), class = "factor"), 
    diff.pps_leadership = structure(c(1L, 2L, 3L, 3L, 3L, 3L, 
    3L, 3L, 2L, 3L), .Label = c("absent", "kept", "reduced"), class = "factor"), 
    diff.mps_milcmd = structure(c(3L, 2L, 3L, 3L, 3L, 3L, 1L, 
    3L, 2L, 3L), .Label = c("absent", "kept", "reduced"), class = "factor"), 
    diff.mps_armyint = structure(c(3L, 2L, 2L, 3L, 3L, 3L, 1L, 
    3L, 2L, 3L), .Label = c("absent", "kept", "reduced"), class = "factor"), 
    diff.eps_commission = structure(c(1L, 1L, 1L, 1L, 3L, 1L, 
    3L, 1L, 2L, 3L), .Label = c("absent", "kept", "reduced"), class = "factor"), 
    diff.eps_company = structure(c(1L, 2L, 1L, 1L, 3L, 1L, 1L, 
    1L, 2L, 3L), .Label = c("absent", "kept", "reduced"), class = "factor"), 
    diff.veto_leg = structure(c(1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L), .Label = c("absent", "added"), class = "factor"), 
    diff.tps_devolution = structure(c(2L, 1L, 2L, 3L, 1L, 1L, 
    1L, 2L, 2L, 1L), .Label = c("absent", "kept", "reduced"), class = "factor"), 
    diff.ca.psh = structure(c(3L, 2L, 1L, 1L, 4L, 1L, 1L, 1L, 
    4L, 1L), .Label = c("absent", "added", "kept", "reduced"), class = "factor"), 
    collapse = structure(c(1L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 
    1L), .Label = c("collapse", "stable"), class = "factor")), .Names = c("Agreement", 
"diff.pps_cabinet", "diff.pps_leadership", "diff.mps_milcmd", 
"diff.mps_armyint", "diff.eps_commission", "diff.eps_company", 
"diff.veto_leg", "diff.tps_devolution", "diff.ca.psh", "collapse"
), class = "data.frame", row.names = c(NA, -10L))

【Question Comments】:

  • @akrun, it is only the hyphen they used in <- that causes the error.

Tags: r data.table aggregate reshape2 melt


【Solution 1】:

The following does the job.

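library(data.table)
setDT(example)  # convert the data.frame to a data.table by reference

# provision columns to summarise
mvs <- c("diff.pps_cabinet", "diff.pps_leadership", 
         "diff.mps_milcmd", "diff.mps_armyint")

# categories that become the output columns
vls <- c("reduced", "kept", "added", "absent")

# melt to long format, then for each provision (variable) and each category
# build two cells: a "collapsed out of total" summary and the agreement names
melt(example, c("Agreement", "collapse"), mvs
     )[ , setNames(vapply(
       vls, function(vv) list(paste0(
         s <- sum(collapse[idx <- value == vv] == "collapse"), 
         " out of ", sum(idx), " = ", floor(100 * s / sum(idx)), "% collapsed"),
         paste(Agreement[idx], collapse = "\n")),
       vector("list", 2)),
       paste0(rep(vls, each = 2),
              c(".percent", ".names"))), by = variable]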

This currently prints NaN when no agreement falls into a category; to fix that, replace the sum(idx) in the denominator with (if (!any(idx)) 1 else sum(idx))
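The effect of that guard can be seen in isolation with a small sketch. The helper name `pct_collapsed` and the toy vectors `status` and `value` are assumptions made for illustration; they do not appear in the answer's code.

```r
# Hypothetical helper: floored percentage of collapsed agreements among
# those selected by the logical vector `idx`.
pct_collapsed <- function(status, idx) {
  s <- sum(status[idx] == "collapse")
  denom <- if (!any(idx)) 1 else sum(idx)  # guard against 0 / 0
  floor(100 * s / denom)
}

# Made-up data, not the real data set.
status <- c("collapse", "stable", "stable", "collapse")
value  <- c("reduced", "kept", "reduced", "reduced")

pct_collapsed(status, value == "reduced")  # 2 of 3 matching rows collapsed
pct_collapsed(status, value == "added")    # no matching rows: 0 instead of NaN
```

With an empty category the numerator is already 0, so substituting 1 for the denominator yields 0% rather than NaN.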

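【Comments】:

  • Thank you very much for your effort! This is already very close to what I am looking for. Unfortunately, the percentage and the number of observations in the cells are not what I want. For example, the cell diff.pps_cabinet / reduced.percent currently shows "9 / 10" but should show "5 / 9": 9 (out of all 10) were reduced, and 5 of those collapsed.
  • Very nice, thank you very much. The only remaining issue is that I want only the names of the agreements that collapsed, not all of them. If I understand correctly, this relates to the expression paste(Agreement[idx], collapse = "\n") and needs a condition that keeps only the collapsed ones. I think Agreement[idx
  • At this point, the fix for that should be clear. I suggest you keep working on it.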
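One possible way to restrict the names to collapsed agreements, as the second comment asks, is to combine `idx` with a second logical condition on the outcome. This is a sketch on made-up vectors (`agreement`, `status`, `value` are hypothetical), not the answerer's own fix. Inside the data.table expression above, the outcome column is named `collapse`, so the analogous subset would presumably read `Agreement[idx & collapse == "collapse"]`.

```r
# Made-up data, not the real data set.
agreement <- c("A", "B", "C", "D")
status    <- c("collapse", "stable", "stable", "collapse")
value     <- c("reduced", "kept", "reduced", "reduced")

idx <- value == "reduced"

# Keep only agreements that are in the category AND collapsed.
collapsed_names <- paste(agreement[idx & status == "collapse"], collapse = "\n")
collapsed_names
```

Note that the `collapse = "\n"` argument of `paste` is unrelated to the data column of the same name; it only sets the separator between names.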