【Question title】: Aggregate sum and duplicates in R [duplicate]
【Posted】: 2016-06-27 21:52:37
【Question description】:

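I have seen similar questions asked, but I could not apply the answers to my own data. I am trying to aggregate Value, Value2 and Value3 by product_id and Revenue, where Value should be summed within each group. However, for Value2 and Value3 I only want a single value shown for the duplicates, not their sum.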

Here is my code:

aggregate(Value, Value2, Value3 ~product_id + Revenue, dat,sum)

Data:

dat <-structure(list(product_id = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L),
               Date = c("January", "February", "March", "January", "February", "March", "January", "February", "March", "January", "February", "March"),
               Revenue = c("in", "in", "in", "out", "out", "out", "in", "in", "in", "out", "out", "out"),
               Value = c(0L, 1L, 0L, 0L, 0L, 0L, 1L, 2L, 3L, 0L, 0L, 0L),
           Value2 = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L),
           Value3 = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L) 
             ),
          .Names = c("product_id",  "Date", "Revenue", "Value", "Value2", "Value3"),
          class = "data.frame", row.names = c(NA, -12L))

The result should look like this:

product_id Revenue Value Value2 Value3
1          in      1     1      1
2          in      6     3      3
1          out     0     2      2
2          out     0     4      4

【Question comments】:

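  • There is a typo in your definition of dat. You have no Value2, but Value3 is defined twice.
  • Once that is fixed, the following will work: aggregate(cbind(Value, Value2, Value3) ~ product_id + Revenue, data = dat, sum). You can use cbind to include several variables to be computed.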
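
To illustrate the comment above, here is a minimal sketch of the cbind() formula form. The data frame toy is a made-up, cut-down example (an assumption for illustration, not the asker's full data). Note that this form applies sum to every column named in cbind(), so repeated Value2 entries are added up, not collapsed.

```r
# Hypothetical toy data: two products, two rows each, all with Revenue "in"
toy <- data.frame(
  product_id = c(1L, 1L, 2L, 2L),
  Revenue    = c("in", "in", "in", "in"),
  Value      = c(0L, 1L, 1L, 2L),
  Value2     = c(1L, 1L, 3L, 3L)
)

# cbind() on the left-hand side lets a single formula aggregate several columns;
# the same FUN (here sum) is applied to each of them
summed <- aggregate(cbind(Value, Value2) ~ product_id + Revenue,
                    data = toy, FUN = sum)
summed
```

With this toy data, Value2 for product 1 comes out as 2 (1 + 1), which shows why the formula form alone does not give one value per group for the duplicated columns.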

Tags: r sum duplicates aggregate


【Solution 1】:
dat <-structure(list(product_id = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L),
                     Date = c("January", "February", "March", "January", "February", "March", "January", "February", "March", "January", "February", "March"),
                     Revenue = c("in", "in", "in", "out", "out", "out", "in", "in", "in", "out", "out", "out"),
                     Value = c(0L, 1L, 0L, 0L, 0L, 0L, 1L, 2L, 3L, 0L, 0L, 0L),
                     Value2 = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L),
                     Value3 = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L) 
),
.Names = c("product_id",  "Date", "Revenue", "Value", "Value2", "Value3"),
class = "data.frame", row.names = c(NA, -12L))

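# Sum the three value columns within each product_id/Revenue group
res <- aggregate(dat[, c("Value", "Value2", "Value3")], by = list(dat$product_id, dat$Revenue), FUN = sum)

# aggregate() names the grouping columns Group.1 and Group.2, so rename all columns
colnames(res) <- c("product_id", "Revenue", "Value", "Value 2", "Value 3")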
res
  product_id Revenue Value Value 2 Value 3
1          1      in     1       3       3
2          2      in     6       9       9
3          1     out     0       6       6
4          2     out     0      12      12
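
The output above sums Value2 and Value3 as well, whereas the question asks for a single value of each per group. One possible way to get that in base R, sketched here under the assumption that Value2 and Value3 are constant within every product_id/Revenue group, is to sum Value on its own and merge the result with the de-duplicated remaining columns. The names sums and firsts are illustrative, and dat is rebuilt compactly with rep() so the block runs on its own.

```r
# Compact re-creation of the question's data (Date omitted, it is not used)
dat <- data.frame(
  product_id = rep(1:2, each = 6),
  Revenue    = rep(rep(c("in", "out"), each = 3), times = 2),
  Value      = c(0L, 1L, 0L, 0L, 0L, 0L, 1L, 2L, 3L, 0L, 0L, 0L),
  Value2     = rep(1:4, each = 3),
  Value3     = rep(1:4, each = 3)
)

# Sum Value within each product_id/Revenue group
sums <- aggregate(Value ~ product_id + Revenue, data = dat, FUN = sum)

# Keep one row per group for the columns that merely repeat
# (assumption: Value2 and Value3 do not vary within a group)
firsts <- unique(dat[, c("product_id", "Revenue", "Value2", "Value3")])

# Join the two pieces back together on the grouping keys
res <- merge(sums, firsts, by = c("product_id", "Revenue"))
res
```

If Value2 or Value3 did vary within a group, unique() would return more than one row for that group and the merge would produce extra rows, so it is worth checking nrow(res) against the number of groups.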

【Comments】:
