【Question title】: Using the melt function [duplicate]
【Posted】: 2019-09-27 20:48:48
【Question】:

I am trying to reshape my data table in R.

I have tried using the melt function, but I can't seem to get the data into the format I need.

Here is my input:

structure(list(Name = c("Fred", "Peter"), first.sale = c("3/01/2019", 
"10/08/2018"), first.result = c(352L, 209L), second.sale = c("5/12/2018", 
"20/06/2018"), second.result = c(953L, 987L), third.sale = c("2/10/2018", 
"21/02/2018"), third.result = c(965L, 618L), fourth.sale = c("29/08/2018", 
"16/07/2018"), fourth.result = c(125L, 902L), fifth.sale = c("26/04/2018", 
"5/07/2018"), fifth.result = c(264L, 71L)), .Names = c("Name", 
"first.sale", "first.result", "second.sale", "second.result", 
"third.sale", "third.result", "fourth.sale", "fourth.result", 
"fifth.sale", "fifth.result"), row.names = c(NA, -2L), class = c("data.table", 
"data.frame"))

This is the output I want:

structure(list(Name = c("Fred", "Fred", "Fred", "Fred", "Fred", 
"Peter", "Peter", "Peter", "Peter", "Peter", "Frank", "Frank"
), Sale = c("first.sale", "second.sale", "third.sale", "fourth.sale", 
"fifth.sale", "first.sale", "second.sale", "third.sale", "fourth.sale", 
"fifth.sale", "first.sale", "second.sale"), Result = c(352L, 
953L, 965L, 125L, 264L, 209L, 987L, 618L, 902L, 71L, 848L, 410L
), SaleDate = c("3/01/2019", "5/12/2018", "2/10/2018", "29/08/2018", 
"26/04/2018", "10/08/2018", "20/06/2018", "21/02/2018", "16/07/2018", 
"5/07/2018", "10/08/2018", "5/12/2018")), .Names = c("Name", 
"Sale", "Result", "SaleDate"), class = "data.frame", row.names = c(NA, 
-12L))

But this is what I get when I try to use melt:

structure(list(Name = c("Fred", "Peter", "Fred", "Peter", "Fred", 
"Peter", "Fred", "Peter", "Fred", "Peter"), first.sale = c("3/01/2019", 
"10/08/2018", "3/01/2019", "10/08/2018", "3/01/2019", "10/08/2018", 
"3/01/2019", "10/08/2018", "3/01/2019", "10/08/2018"), second.sale = c("5/12/2018", 
"20/06/2018", "5/12/2018", "20/06/2018", "5/12/2018", "20/06/2018", 
"5/12/2018", "20/06/2018", "5/12/2018", "20/06/2018"), third.sale = c("2/10/2018", 
"21/02/2018", "2/10/2018", "21/02/2018", "2/10/2018", "21/02/2018", 
"2/10/2018", "21/02/2018", "2/10/2018", "21/02/2018"), fourth.sale = c("29/08/2018", 
"16/07/2018", "29/08/2018", "16/07/2018", "29/08/2018", "16/07/2018", 
"29/08/2018", "16/07/2018", "29/08/2018", "16/07/2018"), fifth.sale = c("26/04/2018", 
"5/07/2018", "26/04/2018", "5/07/2018", "26/04/2018", "5/07/2018", 
"26/04/2018", "5/07/2018", "26/04/2018", "5/07/2018"), variable = structure(c(1L, 
1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L), class = "factor", .Label = c("first.result", 
"second.result", "third.result", "fourth.result", "fifth.result"
)), value = c(352L, 209L, 953L, 987L, 965L, 618L, 125L, 902L, 
264L, 71L)), .Names = c("Name", "first.sale", "second.sale", 
"third.sale", "fourth.sale", "fifth.sale", "variable", "value"
), row.names = c(NA, -10L), class = c("data.table", "data.frame"
))

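If anyone can point me in the right direction, I would be forever grateful.

I think my problem is that each variable has two values (a sale date and a result), but I don't know how to group them.

【Comments】:

  • I've worked through those examples, but I still can't get the shape right. I think it's because I have 2 value variables? But I can't figure out how to group them together. Any hints would be great.

Tags: r data.table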


【Solution 1】:

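You can use melt like this:

library(data.table)
# Each pattern selects one group of columns; value.name names the groups in the same order.
melt(setDT(df), id.vars = "Name", measure.vars = patterns("sale$", "result$"),
     value.name = c("SaleDate", "Result"))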


#     Name variable   SaleDate Result
# 1:  Fred        1  3/01/2019    352
# 2: Peter        1 10/08/2018    209
# 3:  Fred        2  5/12/2018    953
# 4: Peter        2 20/06/2018    987
# 5:  Fred        3  2/10/2018    965
# 6: Peter        3 21/02/2018    618
# 7:  Fred        4 29/08/2018    125
# 8: Peter        4 16/07/2018    902
# 9:  Fred        5 26/04/2018    264
#10: Peter        5  5/07/2018     71
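
To see why this groups the two values together, here is a minimal sketch on a trimmed copy of the question's input (only the first two sale/result pairs, which is my own shortening). The helper names `sale_cols`, `result_cols` and `long` are just for illustration. Each regular expression picks out one group of columns, and the k-th column of every group lands in the same output row, with `variable` recording k.

```r
library(data.table)

# Trimmed copy of the question's input: first two sale/result pairs only.
df <- data.table(
  Name          = c("Fred", "Peter"),
  first.sale    = c("3/01/2019", "10/08/2018"),
  first.result  = c(352L, 209L),
  second.sale   = c("5/12/2018", "20/06/2018"),
  second.result = c(953L, 987L)
)

# Columns matched by each regex, in the order they appear in df.
sale_cols   <- grep("sale$",   names(df), value = TRUE)
result_cols <- grep("result$", names(df), value = TRUE)
print(sale_cols)
print(result_cols)

# One output column per pattern; `variable` is the position within each group.
long <- melt(df, id.vars = "Name",
             measure.vars = patterns("sale$", "result$"),
             value.name = c("SaleDate", "Result"))
print(long)
```

Because `variable` is only a position, the pairing depends on the sale and result columns appearing in the same order in the input.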

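To get the proper variable names, following this answer, we can do

# Drop everything from the first "." onward to recover the prefixes ("first", "second", ...).
suff <- unique(sub('\\..*', '', names(df)[-1]))

B2 <- melt(setDT(df), id.vars = "Name", measure.vars = patterns("sale$", "result$"),
           value.name = c("SaleDate", "Result"))
# Relabel the factor levels 1..5 with those prefixes, by reference.
setattr(B2$variable, "levels", suff)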

B2
#     Name variable   SaleDate Result
# 1:  Fred    first  3/01/2019    352
# 2: Peter    first 10/08/2018    209
# 3:  Fred   second  5/12/2018    953
# 4: Peter   second 20/06/2018    987
# 5:  Fred    third  2/10/2018    965
# 6: Peter    third 21/02/2018    618
# 7:  Fred   fourth 29/08/2018    125
# 8: Peter   fourth 16/07/2018    902
# 9:  Fred    fifth 26/04/2018    264
#10: Peter    fifth  5/07/2018     71
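
If your data.table is recent enough to have the `measure()` helper (1.14.2 or later, as far as I know), the relabelling step may not be needed at all. The following is a sketch under that assumption, again on a trimmed two-pair copy of the input: `measure()` splits each column name on the dot, sends the first piece to a new `Sale` column, and uses the second piece (the special `value.name` keyword) to decide which value column a cell goes to.

```r
library(data.table)

# Trimmed copy of the question's input: first two sale/result pairs only.
df <- data.table(
  Name          = c("Fred", "Peter"),
  first.sale    = c("3/01/2019", "10/08/2018"),
  first.result  = c(352L, 209L),
  second.sale   = c("5/12/2018", "20/06/2018"),
  second.result = c(953L, 987L)
)

# "first.sale" -> Sale = "first", value column = "sale", and so on.
long <- melt(df, id.vars = "Name",
             measure.vars = measure(Sale, value.name, sep = "."))

# Rename the value columns to the names used in the question.
setnames(long, c("sale", "result"), c("SaleDate", "Result"))
print(long)
```

Here the prefix comes straight from the column name, so the result does not rely on the levels being in the same order as `suff`.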

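Or, the tidyverse way would be

library(tidyverse)
df %>%
  gather(key, value, -Name) %>%              # stack every non-Name column
  group_by(key = sub(".*\\.", "", key)) %>%  # keep only the suffix: "sale" or "result"
  mutate(row = row_number()) %>%             # number rows within each suffix so spread() can pair them
  spread(key, value) %>%
  select(-row)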
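
With tidyr 1.0.0 or later, `pivot_longer()` can do the same reshape in one call. This is a sketch rather than part of the original answer, shown on a trimmed two-pair copy of the input. The `".value"` entry in `names_to` tells tidyr that the part of the name after the dot ("sale" or "result") names the output column, while the part before the dot goes into `Sale`.

```r
library(tidyr)

# Trimmed copy of the question's input: first two sale/result pairs only.
df <- data.frame(
  Name          = c("Fred", "Peter"),
  first.sale    = c("3/01/2019", "10/08/2018"),
  first.result  = c(352L, 209L),
  second.sale   = c("5/12/2018", "20/06/2018"),
  second.result = c(953L, 987L),
  stringsAsFactors = FALSE
)

# Split "first.sale" into Sale = "first" and the value column "sale".
long <- pivot_longer(df, cols = -Name,
                     names_to = c("Sale", ".value"),
                     names_sep = "\\.")
print(long)
```

Unlike the `gather()`/`spread()` route, the values are never stacked into a single column, so `result` stays integer.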
