【Question title】: Collapse and summarize while keeping unique values as new variables
【Posted】: 2022-07-18 21:30:00
【Question】:

I have a data frame:

df <- data.frame(resource = c("gold", "gold", "gold", "silver", "silver", "gold", "silver", "bronze"), amount = c(500, 2000, 4, 8, 100, 2000, 3, 5), unit = c("g", "g", "kg", "ton", "kg", "g", "ton", "kg"), price = c(10, 10, 10000, 50000, 50, 10, 50000, 20))

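I want to calculate the total value of each resource while keeping the distinct units and prices as new variables. The result should look like this: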

resource value  price1 unit1 price2 unit2
bronze   100    20     kg    NA     NA
gold     85000  10     g     10000  kg
silver   555000 50000  ton   50     kg

The first two columns are the result of:

library(dplyr)
df %>% group_by(resource) %>% summarize(value = sum(amount * price))

But I don't know how to keep the other columns.

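【Comments】:

  • What are you actually trying to do? As the data grows, the number of columns in the output will increase, and there is no obvious order for mapping rows to columns. Perhaps you would rather keep your original table, sorted by resource?

Tags: r collapse summarize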


【Solution 1】:

I doubt that the format you want is really useful (as PeterK pointed out in the comments), but here we go:

library(data.table)

df <- data.frame(resource = c("gold", "gold", "gold", "silver", "silver", "gold", "silver", "bronze"), amount = c(500, 2000, 4, 8, 100, 2000, 3, 5), unit = c("g", "g", "kg", "ton", "kg", "g", "ton", "kg"), price = c(10, 10, 10000, 50000, 50, 10, 50000, 20))

# calculate total value
DT <- setDT(df)[, .(value = sum(amount * price)), by = resource]

# create wide data
#  variables we want to cast wide
cols <- c("amount", "unit")
#  cast to wide
DT.wide <- dcast(setDT(df), resource ~ rowid(resource), value.var = cols)
new_colorder <- CJ(unique(rowid(df$resource)), cols, sorted = FALSE)[, paste(cols, V1, sep = "_")]
#  reorder the relevant columns
setcolorder(DT.wide, c(setdiff(names(DT.wide), new_colorder), new_colorder))

# join together
DT[DT.wide, on = .(resource)]

#    resource  value amount_1 unit_1 amount_2 unit_2 amount_3 unit_3 amount_4 unit_4
# 1:   bronze    100        5     kg       NA   <NA>       NA   <NA>       NA   <NA>
# 2:     gold  85000      500      g     2000      g        4     kg     2000      g
# 3:   silver 555000        8    ton      100     kg        3    ton       NA   <NA>
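
Note that the code above spreads every row's amount and unit, not the distinct price/unit pairs shown in the question. If only the distinct pairs are wanted, a variation along the following lines might get closer to the requested layout. This is a sketch and not part of the original answer: the names `pairs`, `idx`, `ord` and `result` are introduced here for illustration, and it assumes "unique" means distinct (price, unit) combinations per resource, numbered in order of first appearance.

```r
library(data.table)

df <- data.frame(
  resource = c("gold", "gold", "gold", "silver", "silver", "gold", "silver", "bronze"),
  amount   = c(500, 2000, 4, 8, 100, 2000, 3, 5),
  unit     = c("g", "g", "kg", "ton", "kg", "g", "ton", "kg"),
  price    = c(10, 10, 10000, 50000, 50, 10, 50000, 20)
)
dt <- as.data.table(df)

# total value per resource
value <- dt[, .(value = sum(amount * price)), by = resource]

# distinct (price, unit) pairs per resource, kept in order of first appearance
pairs <- unique(dt[, .(resource, price, unit)])
pairs[, idx := rowid(resource)]

# cast wide: one price/unit column pair per distinct combination
wide <- dcast(pairs, resource ~ idx, value.var = c("price", "unit"))

# interleave the columns as price_1, unit_1, price_2, unit_2, ...
n <- max(pairs$idx)
ord <- as.vector(rbind(paste0("price_", seq_len(n)), paste0("unit_", seq_len(n))))
setcolorder(wide, c("resource", ord))

# drop the underscore so the names match the question (price1, unit1, ...)
setnames(wide, ord, sub("_", "", ord))

# join the totals onto the wide table
result <- value[wide, on = "resource"]
result
```

On the sample data this should give one row per resource (bronze, gold, silver) with values 100, 85000 and 555000, and with `price2`/`unit2` set to NA for bronze. The concern raised in the comments still applies: the number of columns grows with the number of distinct pairs in the largest group.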

【Comments】:
