【Question title】: Create a new column based on two conditions
【Posted】: 2019-07-17 23:37:14
【Question description】:

I have this dataset:

yDF = structure(list(Date = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 8L, 9L, 10L),
                                      # factor levels must be unique, so "4/8/2018" is listed once
                                      .Label = c("3/31/2018", "4/1/2018", "4/2/2018", "4/3/2018", "4/4/2018",
                                                 "4/5/2018", "4/6/2018", "4/8/2018", "4/9/2018", "4/10/2018"),
                                      class = "factor"),
                     Group = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
                                         2L), .Label = c("Decrease", "Increase"), class = "factor"),
                     ID = c(5002, 5002, 5002, 5002, 5002, 5002, 5002, 5002,
                            5002, 5002, 5002), Week = c(13, 13, 14, 14, 14, 14, 14,
                                                        14, 14, 15, 15)),
                # 11 rows, matching the length of every column
                row.names = 1:11, class = "data.frame")

It should look like this:


    Date        Group       ID      Week
1   3/31/2018   Increase    5002    13
2   4/1/2018    Increase    5002    13
3   4/2/2018    Increase    5002    14
4   4/3/2018    Increase    5002    14
5   4/4/2018    Increase    5002    14
6   4/5/2018    Increase    5002    14
7   4/6/2018    Increase    5002    14
8   4/8/2018    Increase    5002    14
9   4/8/2018    Increase    5002    14
10  4/9/2018    Increase    5002    15

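I want to add a new column called "Session":

  • If the ID and Week values are new, or if the ID is the same and the week is within 2 weeks of that ID's original week, the row should be labeled "Pre".
  • If the week for the same ID (e.g. 5002) is the original week plus 2 weeks or more (e.g. week 15), the row should be labeled "Post".

I have played around with if/else functions but can't get the right output. Ideally, it would look like this:

(Keep in mind that I have over 100 unique subject IDs.)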

```
        Date    Group   ID          Week   Session
30 3/31/2018 Increase 5002           13    Pre
31  4/1/2018 Increase 5002           13    Pre
32  4/2/2018 Increase 5002           14    Pre
33  4/3/2018 Increase 5002           14    Pre
34  4/4/2018 Increase 5002           14    Pre
35  4/5/2018 Increase 5002           14    Pre
36  4/6/2018 Increase 5002           14    Pre
37  4/8/2018 Increase 5002           14    Pre
38  4/8/2018 Increase 5002           14    Pre
39  4/9/2018 Increase 5002           15    Post
40 4/10/2018 Increase 5002           15    Post
```

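【Comments】:

  • It would be great if you could give us code that reproduces this data.
  • How do I add the code? Should I paste a subset into the question so it can be copied and pasted as a data frame?
  • @shai73 Use dput(head(your_df, 10)) and add its output to your question. If you specifically want rows 30 to 40 as shown in the post, use dput(your_df[30:40, ])
  • Thanks, but it doesn't seem to work. Are there any other options? I tried both suggestions, and it keeps building the entire CSV file rather than just the first 10 rows or rows 30:40.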
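
One possible explanation for the last comment, offered as a guess rather than a diagnosis: when a column is a factor, subsetting rows keeps every factor level, so dput() still prints the full list of levels even for 10 rows. The sketch below uses a made-up data frame (your_df and its contents are assumptions for illustration) and shows how droplevels() trims the output.

```r
# Hypothetical stand-in for the full data set: a factor column with 30 levels
your_df <- data.frame(
  Date = factor(paste0("4/", 1:30, "/2018")),
  Week = rep(14:18, each = 6)
)

# Taking the first 10 rows does not drop the unused levels,
# so dput(head(your_df, 10)) would still list all 30 dates
n_levels_before <- nlevels(head(your_df, 10)$Date)

# droplevels() removes levels that no longer occur in the subset
small <- droplevels(head(your_df, 10))
n_levels_after <- nlevels(small$Date)

# The output of this call is compact enough to paste into a question
dput(small)
```

The same idea applies to a row range, e.g. dput(droplevels(your_df[30:40, ])).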

Tags: r tidyverse


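【Solution 1】:

We can use mutate, ifelse, and first to accomplish this. Within each ID, first(Week) is the week of that ID's first row, so this assumes the rows are ordered by date within each ID.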

library(tidyverse)

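yDF2 <- yDF %>%
  # Evaluate the rule separately for each subject ID
  group_by(ID) %>%
  # "Pre" up to one week after the ID's first week, "Post" from two weeks on
  mutate(Session = ifelse(Week <= first(Week) + 1, "Pre", "Post")) %>%
  ungroup()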
yDF2
# # A tibble: 10 x 5
#    Date      Group       ID  Week Session
#    <chr>     <chr>    <int> <int> <chr>  
#  1 3/31/2018 Increase  5002    13 Pre    
#  2 4/1/2018  Increase  5002    13 Pre    
#  3 4/2/2018  Increase  5002    14 Pre    
#  4 4/3/2018  Increase  5002    14 Pre    
#  5 4/4/2018  Increase  5002    14 Pre    
#  6 4/5/2018  Increase  5002    14 Pre    
#  7 4/6/2018  Increase  5002    14 Pre    
#  8 4/8/2018  Increase  5002    14 Pre    
#  9 4/8/2018  Increase  5002    14 Pre    
# 10 4/9/2018  Increase  5002    15 Post  

Data

yDF <- read.table(text = "    Date        Group       ID      Week
1   '3/31/2018'   Increase    5002    13
2   '4/1/2018'    Increase    5002    13
3   '4/2/2018'    Increase    5002    14
4   '4/3/2018'    Increase    5002    14
5   '4/4/2018'    Increase    5002    14
6   '4/5/2018'    Increase    5002    14
7   '4/6/2018'    Increase    5002    14
8   '4/8/2018'    Increase    5002    14
9   '4/8/2018'    Increase    5002    14
10  '4/9/2018'    Increase    5002    15",
                  stringsAsFactors = FALSE, header = TRUE)
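
Since the question mentions more than 100 subject IDs, here is a minimal base R sketch of the same rule on a small made-up data frame with two IDs. The data frame toy and its values are invented for illustration, and the sketch assumes that the "original week" means the smallest week recorded for each ID, which removes the dependence on row order.

```r
# Made-up example: two IDs with rows deliberately out of order
toy <- data.frame(
  ID   = c(5002, 5002, 5002, 6001, 6001, 6001),
  Week = c(15, 13, 14, 21, 20, 22)
)

# Baseline per ID: the smallest week for that ID (assumption: this is the "original week")
baseline <- ave(toy$Week, toy$ID, FUN = min)

# Same rule as the answer above: "Post" once the week is baseline + 2 or later
toy$Session <- ifelse(toy$Week <= baseline + 1, "Pre", "Post")
toy
```

With dplyr, replacing first(Week) by min(Week) inside the grouped mutate gives the same effect.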

【Comments】:

  • Ah, first was what I was missing! Thanks!