【Question title】: Add a specific new row using dplyr based on some conditions in R
【Posted】: 2019-05-22 08:03:41
【Question description】:

I have a df like the one below, and I want to add a new row for each combination of ID and semester_num. What I have so far with dplyr is:

df %>%
 group_by(ID, semester_num) %>%
 # add new row here

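I want the new row to contain the same values as the previous row, except that the third column (subject_result2) should take the value of the previous row's fourth column (Success).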

tibble::tribble(
      ~ID, ~semester_num,   ~subject_result2,    ~Success,
  100000L,             1, "OTHERPassedTerm1", "Grad_ENSC",
  100000L,             1, "OTHERPassedTerm1", "Grad_ENSC",
  100000L,             1, "OTHERPassedTerm1", "Grad_ENSC",
  100000L,             2, "MATH1PassedTerm1", "Grad_ENSC",
  100000L,             2, "OTHERPassedTerm1", "Grad_ENSC",
  100000L,             2, "OTHERPassedTerm1", "Grad_ENSC",
  200000L,             1, "OTHERPassedTerm2", "fail",
  200000L,             1, "MATH1PassedTerm2", "fail",
  200000L,             2, "MATH1PassedTerm2", "fail",
  200000L,             2, "OTHERPassedTerm2", "fail"
  )

Result (>> marks the newly added rows):

          ~ID, ~semester_num,   ~subject_result2,    ~Success,
      100000L,             1, "OTHERPassedTerm1", "Grad_ENSC",
      100000L,             1, "OTHERPassedTerm1", "Grad_ENSC",
      100000L,             1, "OTHERPassedTerm1", "Grad_ENSC",
 >>   100000L,             1, "Grad_ENSC",        "Grad_ENSC",
      100000L,             2, "MATH1PassedTerm1", "Grad_ENSC",
      100000L,             2, "OTHERPassedTerm1", "Grad_ENSC",
      100000L,             2, "OTHERPassedTerm1", "Grad_ENSC",
 >>   100000L,             2, "Grad_ENSC",        "Grad_ENSC",
      200000L,             1, "OTHERPassedTerm2", "fail",
      200000L,             1, "MATH1PassedTerm2", "fail",
 >>   200000L,             1, "fail",             "fail",
      200000L,             2, "MATH1PassedTerm2", "fail",
      200000L,             2, "OTHERPassedTerm2", "fail",
 >>   200000L,             2, "fail",             "fail"

Please help me achieve this in R. (Other packages are fine too.)

【Comments】:

    Tags: r dplyr


    【Solution 1】:

    You can achieve this by combining do with tibble::add_row. This answer is based on the answers to the question "Add row in each group using dplyr and add_row()", in particular the comment by @JasonWang.

    df %>%
        dplyr::group_by(ID, semester_num) %>%
        do(tibble::add_row(.,
                           ID = .$ID[1],
                           semester_num = .$semester_num[1],
                           subject_result2 = .$Success[nrow(.)], #Get the last row of the group
                           Success = .$Success[nrow(.)]))
    
    # A tibble: 14 x 4
    # Groups:   ID, semester_num [4]
           ID semester_num subject_result2  Success  
        <int>        <dbl> <chr>            <chr>    
     1 100000            1 OTHERPassedTerm1 Grad_ENSC
     2 100000            1 OTHERPassedTerm1 Grad_ENSC
     3 100000            1 OTHERPassedTerm1 Grad_ENSC
     4 100000            1 Grad_ENSC        Grad_ENSC
     5 100000            2 MATH1PassedTerm1 Grad_ENSC
     6 100000            2 OTHERPassedTerm1 Grad_ENSC
     7 100000            2 OTHERPassedTerm1 Grad_ENSC
     8 100000            2 Grad_ENSC        Grad_ENSC
     9 200000            1 OTHERPassedTerm2 fail     
    10 200000            1 MATH1PassedTerm2 fail     
    11 200000            1 fail             fail     
    12 200000            2 MATH1PassedTerm2 fail     
    13 200000            2 OTHERPassedTerm2 fail     
    14 200000            2 fail             fail  
    

    Normally tibble::add_row does not work on grouped data frames, but by using do we can apply it to each group separately without leaving the pipe.
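
    As a possible alternative, newer dplyr versions provide group_modify(), which also applies a function to each group. The sketch below is not part of the original answer and assumes dplyr 0.8.1 or later. Inside group_modify() the grouping columns are not part of .x, so only the two non-grouping columns are set; the names df and result are just illustrative.

```r
library(dplyr)
library(tibble)

# Sample data from the question
df <- tribble(
      ~ID, ~semester_num,   ~subject_result2,    ~Success,
  100000L,             1, "OTHERPassedTerm1", "Grad_ENSC",
  100000L,             1, "OTHERPassedTerm1", "Grad_ENSC",
  100000L,             1, "OTHERPassedTerm1", "Grad_ENSC",
  100000L,             2, "MATH1PassedTerm1", "Grad_ENSC",
  100000L,             2, "OTHERPassedTerm1", "Grad_ENSC",
  100000L,             2, "OTHERPassedTerm1", "Grad_ENSC",
  200000L,             1, "OTHERPassedTerm2", "fail",
  200000L,             1, "MATH1PassedTerm2", "fail",
  200000L,             2, "MATH1PassedTerm2", "fail",
  200000L,             2, "OTHERPassedTerm2", "fail"
)

result <- df %>%
  group_by(ID, semester_num) %>%
  # .x holds one group's rows without the grouping columns;
  # append a row whose two columns both take the group's last Success value
  group_modify(~ add_row(.x,
                         subject_result2 = last(.x$Success),
                         Success = last(.x$Success))) %>%
  ungroup()  # drop the grouping once the rows are added
```

    The grouping keys (ID and semester_num) are added back to each group's result automatically, so they do not need to be filled in by hand as in the do version.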

    【Comments】:
