【Question title】: Stack Based Column Sum in a data frame using R [duplicate]
【Posted】: 2020-05-31 22:04:37
【Question】:

My existing data frame looks like the one shown below:

NAV_Date    NAV    Year  Day       Units   Amount  Balance_Units
2013-06-01  282.5  2013  Saturday  3.540   1000    3.540
2013-06-08  279.3  2013  Saturday  3.581   1000    3.581
2013-06-15  276.0  2013  Saturday  3.623   1000    3.623
2013-06-22  261.6  2013  Saturday  3.822   1000    3.822
2013-06-29  273.3  2013  Saturday  3.659   1000    3.659

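I want the Balance_Units column of my new data frame to contain the entries shown below, computed in R. That is, each Balance_Units entry should be the sum of the previous value and the current value. This needs to be done on a list of data frames.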

NAV_Date    NAV    Year  Day       Units   Amount  Balance_Units
2013-06-01  282.5  2013  Saturday  3.540   1000    3.540
2013-06-08  279.3  2013  Saturday  3.581   1000    7.121
2013-06-15  276.0  2013  Saturday  3.623   1000    10.744
2013-06-22  261.6  2013  Saturday  3.822   1000    14.566
2013-06-29  273.3  2013  Saturday  3.659   1000    18.225

I tried the following, but it does not work:

for (i in 1:length(W)) {
  W[[i]]$Units  = 1000/W[[i]]$NAV
  W[[i]]$Amount = 1000
  W[[i]]$Balance_Units = 0
  W[[i]]$Balance_Units = W[[i]]$Units + W[[i]]$Balance_Units
}

【Comments】:

  • Do you want the cumulative sum (as shown in your current post), or the sum of the current row and the previous row?

Tags: r column-sum


【Solution 1】:

This can be done with the convenience function cumsum in base R:

df1$Balance_Units <- cumsum(df1$Balance_Units)

With dplyr, the column can be created inside mutate:

library(dplyr)
df1 %>%
    mutate(Balance_Units = cumsum(Balance_Units))

If 'W' is a list of data.frames, we can use lapply:

W <- lapply(W, transform, Balance_Units = cumsum(Balance_Units))
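
For context, the loop in the question adds Units to a Balance_Units column that has just been set to 0, so it only copies Units and never accumulates anything. The following is a minimal sketch of the whole per-data-frame step with lapply, assuming the running balance should be built from Units. The list contents and the names fund_a and fund_b are hypothetical stand-ins, not the asker's data.

```r
# Hypothetical stand-in for the asker's list W: data frames that have a NAV column.
W <- list(
  fund_a = data.frame(NAV = c(282.5, 279.3, 276.0, 261.6, 273.3)),
  fund_b = data.frame(NAV = c(100, 125, 200))
)

# Add the per-row columns, then the running balance, to every data frame.
W <- lapply(W, function(d) {
  d$Amount <- 1000
  d$Units <- d$Amount / d$NAV
  d$Balance_Units <- cumsum(d$Units)  # running total of units bought so far
  d
})

W$fund_b$Balance_Units
# 10 18 23
```

Because lapply returns a new list, the result has to be assigned back to W, as in the one-liner above.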

Data

df1 <- structure(list(NAV_Date = c("2013-06-01", "2013-06-08", "2013-06-15", 
"2013-06-22", "2013-06-29"), NAV = c(282.5, 279.3, 276, 261.6, 
273.3), Year = c(2013L, 2013L, 2013L, 2013L, 2013L), Day = c("Saturday", 
"Saturday", "Saturday", "Saturday", "Saturday"), Units = c(3.54, 
3.581, 3.623, 3.822, 3.659), Amount = c(1000L, 1000L, 1000L, 
1000L, 1000L), Balance_Units = c(3.54, 3.581, 3.623, 3.822, 3.659
)), class = "data.frame", row.names = c(NA, -5L))

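【Comments】:

  • Hello @akrun, just one question: how do you take the table posted by the OP so that you avoid typing the test data in by hand?
  • @Alexis Just use soread() from overflow after copying the table to the clipboard
【Solution 2】: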
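
As a base R alternative to the clipboard approach mentioned in the comment (this is not soread itself, only a sketch of a different route), the posted table can be pasted into a string and parsed with read.table. The variable name txt is an assumption made for illustration.

```r
# Paste the table from the question into a string; the first line is the header.
txt <- "NAV_Date    NAV    Year  Day       Units   Amount  Balance_Units
2013-06-01  282.5  2013  Saturday  3.540   1000    3.540
2013-06-08  279.3  2013  Saturday  3.581   1000    3.581
2013-06-15  276.0  2013  Saturday  3.623   1000    3.623
2013-06-22  261.6  2013  Saturday  3.822   1000    3.822
2013-06-29  273.3  2013  Saturday  3.659   1000    3.659"

# Columns are whitespace-separated, which is read.table's default separator.
df1 <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)
str(df1)
```

Calling dput(df1) afterwards prints a structure(...) expression like the one in the Data section, which is convenient for sharing a reproducible example.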

Here is a data.table solution. I have also added something for the sequential sum (current row plus previous row).

library(data.table) 
> dat
     NAV_Date   NAV Year      Day Units Amount Balance_Units
1: 2013-06-01 282.5 2013 Saturday 3.540   1000         3.540
2: 2013-06-08 279.3 2013 Saturday 3.581   1000         3.581
3: 2013-06-15 276.0 2013 Saturday 3.623   1000         3.623
4: 2013-06-22 261.6 2013 Saturday 3.822   1000         3.822
5: 2013-06-29 273.3 2013 Saturday 3.659   1000         3.659

# Cumulative sum
> dat[, cumulative_sum := cumsum(Balance_Units)]
> dat
     NAV_Date   NAV Year      Day Units Amount Balance_Units cumulative_sum
1: 2013-06-01 282.5 2013 Saturday 3.540   1000         3.540          3.540
2: 2013-06-08 279.3 2013 Saturday 3.581   1000         3.581          7.121
3: 2013-06-15 276.0 2013 Saturday 3.623   1000         3.623         10.744
4: 2013-06-22 261.6 2013 Saturday 3.822   1000         3.822         14.566
5: 2013-06-29 273.3 2013 Saturday 3.659   1000         3.659         18.225

# Sequential sum
> dat[, sequential_sum := Balance_Units + shift(x = Balance_Units, fill = 0)]
> dat
     NAV_Date   NAV Year      Day Units Amount Balance_Units cumulative_sum sequential_sum
1: 2013-06-01 282.5 2013 Saturday 3.540   1000         3.540          3.540          3.540
2: 2013-06-08 279.3 2013 Saturday 3.581   1000         3.581          7.121          7.121
3: 2013-06-15 276.0 2013 Saturday 3.623   1000         3.623         10.744          7.204
4: 2013-06-22 261.6 2013 Saturday 3.822   1000         3.822         14.566          7.445
5: 2013-06-29 273.3 2013 Saturday 3.659   1000         3.659         18.225          7.481
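
The two sums in the output above can also be reproduced without data.table. This is a minimal base R sketch, assuming a plain numeric vector; the variable names x, previous, cumulative and sequential are chosen here for illustration.

```r
x <- c(3.540, 3.581, 3.623, 3.822, 3.659)  # Balance_Units from the question

cumulative <- cumsum(x)            # each row plus all rows before it
previous   <- c(0, head(x, -1))    # x shifted down by one row, padded with 0
sequential <- x + previous         # each row plus only the row before it

round(cumulative, 3)  # 3.540 7.121 10.744 14.566 18.225
round(sequential, 3)  # 3.540 7.121 7.204 7.445 7.481
```

The two results agree on the first two rows and diverge from the third row on, which is why the five-row example makes the difference visible.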

【Comments】:
