R Data.Table Check Group Condition

【Question title】: R Data.Table Check Group Condition
【Posted】: 2020-03-21 17:49:11
【Question】:

data=data.frame("StudentID" = c(1,1,1,2,2,2,3,3,3),
"Grade"=c(1,2,3,1,2,3,1,2,3),
"Score" = c(1,2,5,2,4,3,1,2,2))

I have 'data' and want to create 'data1' from it:

data1=data.frame("StudentID" = c(1,1,1,2,2,2,3,3,3),
"Grade"=c(1,2,3,1,2,3,1,2,3),
"Score" = c(1,2,5,2,4,3,1,2,2),
"Flag"=c(0,0,0,1,1,1,2,2,2))

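Flag indicates, for each StudentID, whether the Score at Grade G is higher than the Score at Grade G-1. In other words, we expect scores only to go up as Grade increases.

If any Score goes down as Grade goes up, then Flag == 1. Tied scores should be indicated with 2.
For example, if a student has a Score of 2 in both Grade 2 and Grade 3, then Flag == 2.
If the Scores only go up as Grade goes up, then Flag == 0.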

Using @akron's perfect answer:

library(data.table)
setDT(data)[, flag := fifelse(any(diff(Score) < 0), 1,
      fifelse(anyDuplicated(Score) > 0, 2, 0)), .(StudentID)]

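Now say I have a student with Flag == 2. How can I update their SECOND consecutive (tied) score by adding 1?

Using data1 from above, the desired result is: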

data1=data.frame("StudentID" = c(1,1,1,2,2,2,3,3,3),
"Grade"=c(1,2,3,1,2,3,1,2,3),
"Score" = c(1,2,5,2,4,3,1,2,2),
"Flag"=c(0,0,0,1,1,1,2,2,2),
"Score2" = c(1,2,5,2,4,3,1,2,3))

【Comments on the question】:

There is a discrepancy between your explanation and data1: I don't see where the 2 should be.
You need library(dplyr); data1 %>% group_by(StudentID) %>% mutate(ind = case_when(any(diff(Score) < 0) ~ 1, all(2:3 %in% Grade[Score == 2]) ~ 2, TRUE ~ 0 ))
Or with data.table: setDT(data)[, flag := fifelse(any(diff(Score)

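Tags: r data.table data-manipulation

【Solution 1】:

We convert to a 'data.table' (setDT), group by 'StudentID', and use fifelse to create 'flag'. If any difference (diff) between consecutive 'Score' values is less than 0, meaning the value decreases somewhere, the flag is 1. Otherwise, if there are any duplicates, it is 2. All remaining cases get 0.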

library(data.table)
setDT(data)[, flag := fifelse(any(diff(Score) < 0), 1, 
      fifelse(anyDuplicated(Score) > 0, 2, 0)) , .(StudentID)]
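
To see what the nested fifelse encodes, the same rule can be written for a single student's score vector in base R. This is only a sketch: flag_scores is a hypothetical helper name, not part of data.table, and it assumes the scores are already ordered by Grade.

```r
# Hypothetical helper (not part of data.table): flag one student's scores,
# assuming they are already sorted by Grade.
flag_scores <- function(score) {
  if (any(diff(score) < 0)) {
    1  # at least one score dropped from one grade to the next
  } else if (anyDuplicated(score) > 0) {
    2  # no drop, but at least one score is repeated
  } else {
    0  # scores strictly increase
  }
}

data <- data.frame(StudentID = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                   Grade     = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
                   Score     = c(1, 2, 5, 2, 4, 3, 1, 2, 2))

# ave() applies the helper within each StudentID and recycles the
# single result across that student's rows.
data$flag <- ave(data$Score, data$StudentID, FUN = flag_scores)
data$flag
# [1] 0 0 0 1 1 1 2 2 2
```

Because the decrease check runs first, any duplicate that reaches the second branch sits in a non-decreasing sequence, so it must be a tie between consecutive grades.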

For the updated case:

setDT(data1)[, Score2 := Score][Flag == 2, Score2 := seq(Score[1], 
          length.out = .N, by = 1), StudentID]
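
The seq() expression can be checked in isolation on plain vectors. The sketch below only illustrates what it returns; renumber is a hypothetical helper name, v1 is the vector raised in the comment thread, and whether a run of consecutive integers is the desired result for such longer inputs is an assumption that the question does not confirm.

```r
# Stand-ins for one student's Score column within a Flag == 2 group.
s3 <- c(1, 2, 2)               # student 3 in data1
v1 <- c(1, 2, 2, 3, 4, 4, 5)   # longer example from the comments

# Hypothetical helper: the same expression as in the data.table call,
# with .N replaced by length().
renumber <- function(score) seq(score[1], length.out = length(score), by = 1)

renumber(s3)
# [1] 1 2 3
renumber(v1)
# [1] 1 2 3 4 5 6 7
```

Note that the expression ignores every score after the first one: it rebuilds the whole group as the first score, the first score plus 1, and so on.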

Or with dplyr:

library(dplyr)
data %>% 
  group_by(StudentID) %>% 
  mutate(flag = case_when(any(diff(Score) < 0) ~ 1,  
                anyDuplicated(Score) > 0 ~ 2,  TRUE ~ 0))

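【Comments】:

Thank you so much! Is there a way to do this without specifying the values 2:3, and without specifying Score == 2? I ask because this is a simplified version of a problem with a much larger set of numbers.
I see it, and I fixed it. My apologies!
@bvowe I am checking the general case. Suppose you have v1 <- c(1, 2, 2, 3, 4, 4, 5) for one ID group. What would the output be?
@bvowe Do you need c(1, 2, 3, 4, 5, 6, 7) here?
Thanks again for your time, you are truly a lifesaver.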