【Posted】: 2023-03-29 12:40:01
【Question】:
I want to reshape the following data.table:
library(data.table)
myfun <- function() sample(c(NA,round(runif(9)*10)),prob=c(0.2,rep(0.1,9)))
cheeze <- myfun()
bottle <- myfun()
df <- as.data.table(data.frame(ID=LETTERS[1:10],
bottle_qty=bottle,
bottle_price=bottle*c(1,3,5),
cheeze_qty=cheeze,
cheeze_price=cheeze*c(5,4,2),
cheeze_cam = 1*(cheeze>4) ,
cheeze_brie = 1*(cheeze<=4),
bottle_wine = 1*(bottle>5),
bottle_beer = 1*(bottle<=5))
)
# ID bottle_qty bottle_price cheeze_qty cheeze_price cheeze_cam cheeze_brie
# 1: A 7 7 9 45 1 0
# 2: B 4 12 6 24 1 0
# 3: C NA NA NA NA NA NA
# 4: D 7 7 2 10 0 1
# 5: E 3 9 9 36 1 0
# 6: F 9 45 4 8 0 1
# 7: G 6 6 3 15 0 1
# 8: H 2 6 6 24 1 0
# 9: I 5 25 8 16 1 0
# 10: J 7 7 3 15 0 1
# bottle_wine bottle_beer
# 1: 1 0
# 2: 0 1
# 3: NA NA
# 4: 1 0
# 5: 0 1
# 6: 1 0
# 7: 1 0
# 8: 0 1
# 9: 0 1
# 10: 1 0
into the following form:
| ID | type | qty | price |
| A | cheeze_cam | 9 | 45 |
| A | bottle_wine | 7 | 7 |
| B | bottle_beer | 4 | 12 |
| B | cheeze_cam | 6 | 24 |
Edit: here is the full expected output.
| ID | type | qty | price |
|----+-------------+-----+-------|
| A | bottle_wine | 7 | 7 |
| A | cheeze_cam | 9 | 45 |
| B | bottle_beer | 4 | 12 |
| B | cheeze_cam | 6 | 24 |
| C | bottle_wine | NA | NA |
| C | cheeze_brie | NA | NA |
| D | bottle_wine | 7 | 7 |
| D | cheeze_brie | 2 | 10 |
| E | bottle_beer | 3 | 9 |
| E | cheeze_cam | 9 | 36 |
| F | bottle_wine | 9 | 45 |
| F | cheeze_brie | 4 | 8 |
| G | bottle_wine | 6 | 6 |
| G | cheeze_brie | 3 | 15 |
| H | bottle_beer | 2 | 6 |
| H | cheeze_cam | 6 | 24 |
| I | bottle_beer | 5 | 25 |
| I | cheeze_cam | 8 | 16 |
| J | bottle_wine | 7 | 7 |
| J | cheeze_brie | 3 | 15 |
But object x is not found. Any help?
【Comments】:
-
Try
melt(melt(df, measure=patterns("qty$", "price$"), value.name=c('qty', 'price'), variable.name="var", na.rm=TRUE), id.var=c('ID','var', 'qty', 'price'), na.rm=TRUE)[order(ID)]
-
Nice, thanks. Your initial suggestion was actually interesting. Building on it, I came up with this:
melt(df, id.var="ID",measure=patterns("cheeze_qty$", "cheeze_price$"), na.rm=TRUE)
However, lapply does not seem to work right away.
-
What does not work in @akrun's solution? I don't see where the lapply issue comes from... Can you state your problem precisely?
-
@Akrun's solution is not quite right, because it produces duplicates that cannot be removed automatically.
-
@Akrun You were closer to the answer. I wrongly assumed your result could not be refined. This is what I was looking for:
melt(melt(df, measure=patterns("qty$", "price$"), value.name=c('qty', 'price'), variable.name="var", na.rm=TRUE), id.var=c('ID','var', 'qty', 'price'), na.rm=TRUE)[order(ID)][value==1,][like(variable,"bottle")&var==1|like(variable,"cheeze")&var==2,]
Very nice solution.
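For readers following the thread, here is a minimal sketch of the nested melt plus filter, split into steps. It uses a small fixed table (hypothetical values copied from rows A and B of the printed output, not the random draw from the question), and the names step1, step2 and res are only illustrative. It assumes the column order of the question's df: patterns() numbers the matched columns in the order they appear, so var 1 should correspond to bottle_* and var 2 to cheeze_*.

```r
library(data.table)

# Hypothetical fixed data with the same column layout as the question's df.
df <- data.table(
  ID           = c("A", "B"),
  bottle_qty   = c(7, 4),
  bottle_price = c(7, 12),
  cheeze_qty   = c(9, 6),
  cheeze_price = c(45, 24),
  cheeze_cam   = c(1, 1),
  cheeze_brie  = c(0, 0),
  bottle_wine  = c(1, 0),
  bottle_beer  = c(0, 1)
)

# Step 1: melt the qty/price pairs together. var is 1 for bottle_* and
# 2 for cheeze_* because the columns are matched in df's column order.
step1 <- melt(df, measure.vars = patterns("qty$", "price$"),
              value.name = c("qty", "price"), variable.name = "var")

# Step 2: melt the four 0/1 indicator columns into variable/value.
# Every (ID, var) row is now repeated once per indicator column,
# which is where the duplicates come from.
step2 <- melt(step1, id.vars = c("ID", "var", "qty", "price"))

# Step 3: keep only the indicator that is set AND whose prefix
# matches the product encoded in var.
res <- step2[value == 1 &
               ((var == "1" & variable %like% "^bottle") |
                (var == "2" & variable %like% "^cheeze"))]
res <- res[order(ID, var), .(ID, type = as.character(variable), qty, price)]
print(res)
```

On this toy input the result should be the four A and B rows of the expected output. A row whose indicators are all NA (such as ID C in the question) has no indicator equal to 1, so it drops out at the filter step; keeping it would need separate handling that this sketch does not attempt.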
Tags: r data.table