【Question title】: Creating new data table based on ifelse on old data table
【Posted】: 2017-02-03 06:28:38
【Question】:

I am trying to subset my data table using an ifelse statement, but I am not getting the result I want.

My initial data table looks like this:

head(Data_copy, n = 18)

    Company       Date       DOW variable value Year Month End_of_Month
 1:   ASXRI 1991-09-06    Friday       RI    NA 1991   Sep            0
 2:   ASXRI 1991-09-09    Monday       RI    NA 1991   Sep            0
 3:   ASXRI 1991-09-10   Tuesday       RI    NA 1991   Sep            0
 4:   ASXRI 1991-09-11 Wednesday       RI    NA 1991   Sep            0
 5:   ASXRI 1991-09-12  Thursday       RI    NA 1991   Sep            0
 6:   ASXRI 1991-09-13    Friday       RI    NA 1991   Sep            0
 7:   ASXRI 1991-09-16    Monday       RI    NA 1991   Sep            0
 8:   ASXRI 1991-09-17   Tuesday       RI    NA 1991   Sep            0
 9:   ASXRI 1991-09-18 Wednesday       RI    NA 1991   Sep            0
10:   ASXRI 1991-09-19  Thursday       RI    NA 1991   Sep            0
11:   ASXRI 1991-09-20    Friday       RI    NA 1991   Sep            0
12:   ASXRI 1991-09-23    Monday       RI    NA 1991   Sep            0
13:   ASXRI 1991-09-24   Tuesday       RI    NA 1991   Sep            0
14:   ASXRI 1991-09-25 Wednesday       RI    NA 1991   Sep            0
15:   ASXRI 1991-09-26  Thursday       RI    NA 1991   Sep            0
16:   ASXRI 1991-09-27    Friday       RI    NA 1991   Sep            0
17:   ASXRI 1991-09-30    Monday       RI    NA 1991   Sep            1
18:   ASXRI 1991-10-01   Tuesday       RI    NA 1991   Oct            0

These are 18 of 250,000 rows.

What I want is to split this data table based on an ifelse call, like this:

Data1 <- ifelse("Weekly" == "Weekly", Data_copy[End_of_Month ==1,], Data_copy)

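*The "Weekly" == "Weekly" part will later be used inside a function.

I want Data1 to be a new data table that contains only the rows where End_of_Month == 1.

When I run the code above, I get a list of company names and nothing else.

Here is what the output looks like: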

Data1[[1]]
    [1] "ASXRI" "ASXRI" "ASXRI" "ASXRI" "ASXRI" "ASXRI" "ASXRI" "ASXRI" "ASXRI" "ASXRI" "ASXRI"

If I scroll further down, I get:

[1387] "AANRI" "AANRI" "AANRI" "AANRI" "AANRI" "AANRI" "APARI" "APARI" "APARI" "APARI" "APARI"
 [1398] "APARI" "APARI" "APARI" "APARI" "APARI" "APARI" "APARI" "APARI" "APARI" "APARI" "APARI"

Each of these entries is just one of the company names.

If I do this instead, I get the result I want:

Data2 <- Data_copy[End_of_Month == 1, ]

   Company       Date      DOW variable value Year Month End_of_Month
1:   ASXRI 1991-09-30   Monday       RI    NA 1991   Sep            1
2:   ASXRI 1991-10-31 Thursday       RI    NA 1991   Oct            1
3:   ASXRI 1991-11-29   Friday       RI    NA 1991   Nov            1
4:   ASXRI 1991-12-31  Tuesday       RI    NA 1991   Dec            1
5:   ASXRI 1992-01-31   Friday       RI    NA 1992   Jan            1
6:   ASXRI 1992-02-28   Friday       RI    NA 1992   Feb            1

Basically, I want to reproduce Data2, but using an ifelse statement.

Here are the first 100 rows:

dput(head(Data_copy, n = 100))
structure(list(Company = c("ASXRI", "ASXRI", "ASXRI", "ASXRI", 
"ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", 
"ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", 
"ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", 
"ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", 
"ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", 
"ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", 
"ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", 
"ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", 
"ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", 
"ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", 
"ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", 
"ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", 
"ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI", 
"ASXRI", "ASXRI", "ASXRI", "ASXRI", "ASXRI"), Date = structure(c(7918, 
7921, 7922, 7923, 7924, 7925, 7928, 7929, 7930, 7931, 7932, 7935, 
7936, 7937, 7938, 7939, 7942, 7943, 7944, 7945, 7946, 7949, 7950, 
7951, 7952, 7953, 7956, 7957, 7958, 7959, 7960, 7963, 7964, 7965, 
7966, 7967, 7970, 7971, 7972, 7973, 7974, 7977, 7978, 7979, 7980, 
7981, 7984, 7985, 7986, 7987, 7988, 7991, 7992, 7993, 7994, 7995, 
7998, 7999, 8000, 8001, 8002, 8005, 8006, 8007, 8008, 8009, 8012, 
8013, 8014, 8015, 8016, 8019, 8020, 8021, 8022, 8023, 8026, 8027, 
8028, 8029, 8030, 8033, 8034, 8035, 8036, 8037, 8040, 8041, 8042, 
8043, 8044, 8047, 8048, 8049, 8050, 8051, 8054, 8055, 8056, 8057
), class = "Date"), DOW = c("Friday", "Monday", "Tuesday", "Wednesday", 
"Thursday", "Friday", "Monday", "Tuesday", "Wednesday", "Thursday", 
"Friday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", 
"Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Monday", 
"Tuesday", "Wednesday", "Thursday", "Friday", "Monday", "Tuesday", 
"Wednesday", "Thursday", "Friday", "Monday", "Tuesday", "Wednesday", 
"Thursday", "Friday", "Monday", "Tuesday", "Wednesday", "Thursday", 
"Friday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", 
"Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Monday", 
"Tuesday", "Wednesday", "Thursday", "Friday", "Monday", "Tuesday", 
"Wednesday", "Thursday", "Friday", "Monday", "Tuesday", "Wednesday", 
"Thursday", "Friday", "Monday", "Tuesday", "Wednesday", "Thursday", 
"Friday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", 
"Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Monday", 
"Tuesday", "Wednesday", "Thursday", "Friday", "Monday", "Tuesday", 
"Wednesday", "Thursday", "Friday", "Monday", "Tuesday", "Wednesday", 
"Thursday", "Friday", "Monday", "Tuesday", "Wednesday", "Thursday"
), variable = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("RI", 
"VO", "MV", "TD", "ND"), class = "factor"), value = c(NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_), Year = c("1991", "1991", "1991", "1991", "1991", "1991", 
"1991", "1991", "1991", "1991", "1991", "1991", "1991", "1991", 
"1991", "1991", "1991", "1991", "1991", "1991", "1991", "1991", 
"1991", "1991", "1991", "1991", "1991", "1991", "1991", "1991", 
"1991", "1991", "1991", "1991", "1991", "1991", "1991", "1991", 
"1991", "1991", "1991", "1991", "1991", "1991", "1991", "1991", 
"1991", "1991", "1991", "1991", "1991", "1991", "1991", "1991", 
"1991", "1991", "1991", "1991", "1991", "1991", "1991", "1991", 
"1991", "1991", "1991", "1991", "1991", "1991", "1991", "1991", 
"1991", "1991", "1991", "1991", "1991", "1991", "1991", "1991", 
"1991", "1991", "1991", "1991", "1991", "1992", "1992", "1992", 
"1992", "1992", "1992", "1992", "1992", "1992", "1992", "1992", 
"1992", "1992", "1992", "1992", "1992", "1992"), Month = c("Sep", 
"Sep", "Sep", "Sep", "Sep", "Sep", "Sep", "Sep", "Sep", "Sep", 
"Sep", "Sep", "Sep", "Sep", "Sep", "Sep", "Sep", "Oct", "Oct", 
"Oct", "Oct", "Oct", "Oct", "Oct", "Oct", "Oct", "Oct", "Oct", 
"Oct", "Oct", "Oct", "Oct", "Oct", "Oct", "Oct", "Oct", "Oct", 
"Oct", "Oct", "Oct", "Nov", "Nov", "Nov", "Nov", "Nov", "Nov", 
"Nov", "Nov", "Nov", "Nov", "Nov", "Nov", "Nov", "Nov", "Nov", 
"Nov", "Nov", "Nov", "Nov", "Nov", "Nov", "Dec", "Dec", "Dec", 
"Dec", "Dec", "Dec", "Dec", "Dec", "Dec", "Dec", "Dec", "Dec", 
"Dec", "Dec", "Dec", "Dec", "Dec", "Dec", "Dec", "Dec", "Dec", 
"Dec", "Jan", "Jan", "Jan", "Jan", "Jan", "Jan", "Jan", "Jan", 
"Jan", "Jan", "Jan", "Jan", "Jan", "Jan", "Jan", "Jan", "Jan"
), End_of_Month = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0)), .Names = c("Company", "Date", "DOW", "variable", "value", 
"Year", "Month", "End_of_Month"), class = c("data.table", "data.frame"
), row.names = c(NA, -100L), .internal.selfref = <pointer: 0x00000000001f0788>)

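【Comments】:

  • I would like to reproduce your initial data.table.
  • @jangorecki I have added some data if you want to give it a try.
  • Going out of your way to use ifelse is usually a bad idea. Although the function's syntax is nice, it has many drawbacks and limitations, so I would stick with your approach, Data_copy[End_of_Month == 1]. Maybe I am missing something, since you haven't said here why you want to use ifelse.
  • I want to create a function where the input determines whether the data is shown at monthly, weekly, or daily frequency. For example, the first chunk of my function looks like this: equal_weight <- function(Data, start, end, frequency){ Data1 <- ifelse(frequency == "Weekly", Data[End_of_Month == 1, ], Data) return(Data1)}. So with this code I would enter equal_weight(Data_copy, Weekly). *My start and end arguments are not used in this chunk. I hope eventually to be able to set the frequency argument to weekly/monthly/daily. Does this help?
  • ifelse is the wrong function. Use if and else, or switch.

Tags: r if-statement data.table subset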


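【Solution 1】:

Other users have noted that ifelse is not suited to your purpose. It may be useful to explain why. From ?ifelse, ifelse(test, yes, no) returns

A vector of the same length and attributes (including dimensions and "class") as test and data values from the values of yes or no.

In other words, if your test vector has length 1, ifelse(...) returns a vector of length 1. For example,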

> ifelse(TRUE, 1:3, 7:9)
[1] 1
> ifelse(c(TRUE, FALSE), 1:3, 7:9)
[1] 1 8

In your case,

ifelse("Weekly" == "Weekly", Data_copy[End_of_Month ==1,], Data_copy)

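will return a vector of length 1. More precisely, since the test returns TRUE, ifelse returns the first element of the yes argument; because that argument is a data frame (a kind of list), ifelse returns the first element of the data frame, i.e. its first column. That is why you get a list of company names. If you really want to use the ifelse construct, try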

ifelse("Weekly" == "Weekly", list(Data_copy[End_of_Month ==1,]), list(Data_copy))
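To make the difference concrete, here is a minimal sketch on a toy table. The name toy and its values are made up for illustration, and a plain data.frame is used as a stand-in so the snippet runs without data.table; the behaviour of ifelse should be the same for a data.table, since both are lists of columns.

```r
# Hypothetical stand-in for Data_copy (values invented for illustration)
toy <- data.frame(
  Company      = c("ASXRI", "ASXRI", "AANRI"),
  End_of_Month = c(0, 1, 1),
  stringsAsFactors = FALSE
)
month_end <- toy[toy$End_of_Month == 1, ]

# Unwrapped: the result has the length of the test (1), so only the
# first element of `yes` (the Company column) is kept.
res_plain <- ifelse(TRUE, month_end, toy)
str(res_plain)

# Wrapped in list(): `yes` now has length 1, so the whole table survives,
# but it comes back inside a length-1 list.
res_wrapped <- ifelse(TRUE, list(month_end), list(toy))
subset_tbl  <- res_wrapped[[1]]  # pull the table back out of the list
str(subset_tbl)
```

Note the extra [[1]] that is needed to get the table itself back out of the list.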

Although, as others have said, you are better off using if {} else {}.
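As a rough sketch of that advice applied to the equal_weight function from the comments: the version below keeps the asker's condition (frequency == "Weekly" selects the End_of_Month == 1 rows) and leaves start and end unused, as in the original snippet. The default values, the toy table, the "Daily" branch, and the pick_rows helper are assumptions added for illustration. Rows are selected with Data[Data$End_of_Month == 1, ] so that the same code should work on a data.frame or a data.table; with a data.table, Data[End_of_Month == 1] is the usual shorthand.

```r
# Hypothetical stand-in for Data_copy (values invented for illustration)
toy <- data.frame(
  Company      = c("ASXRI", "ASXRI", "AANRI"),
  End_of_Month = c(0, 1, 1),
  stringsAsFactors = FALSE
)

# if / else evaluates a single condition and returns a whole object,
# which is what is needed here (ifelse works element by element).
equal_weight <- function(Data, start = NULL, end = NULL, frequency = "Daily") {
  if (frequency == "Weekly") {
    Data[Data$End_of_Month == 1, ]  # same filter as in the question
  } else {
    Data                            # assumption: any other value keeps all rows
  }
}

# switch() variant for when more frequencies are added later.
# The final unnamed argument is the fallback for unrecognised values.
pick_rows <- function(Data, frequency) {
  switch(frequency,
    Weekly = Data[Data$End_of_Month == 1, ],
    Daily  = Data,
    stop("unknown frequency: ", frequency)
  )
}

weekly <- equal_weight(toy, frequency = "Weekly")
daily  <- equal_weight(toy, frequency = "Daily")
```

Both functions return the table itself rather than a list, so no [[1]] is needed afterwards.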

