【Question title】: count rows from every previous value
【Posted】: 2020-11-29 18:26:06
【Question description】:

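This question is the same as the one here, but this time I want to divide each value by the count of the previous value rather than by its own count. So for the first value (1500) we get NA, because no value comes before it. Then we divide 1100 by 4, because the previous value (1500) has a count of 4. Then we divide 200 by 3, because the previous value (1100) has a count of 3. Finally, we divide 1100 by 2, because 200 has a count of 2. I tried using shift/lag but couldn't get it to work!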
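To make the target concrete, the arithmetic described above can be sketched for the four non-NA runs on their own. The vector names (`run_value`, `run_count`, `prev_count`, `expected`) are only for illustration and do not appear in the code below.

```r
# Non-NA runs in order of appearance, with the number of rows in each run
run_value <- c(1500, 1100, 200, 1100)
run_count <- c(4, 3, 2, 3)

# Count of the previous run; the first run has no predecessor, hence NA
prev_count <- c(NA, head(run_count, -1))

# Each value divided by the previous run's count:
# NA, 275, roughly 66.67, 550
expected <- run_value / prev_count
print(expected)
```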

Here is the code that divides each value by its own count.

library(dplyr)
library(tidyverse)


df <- tibble(mydate = as.Date(c("2019-05-11 23:01:00", "2019-05-11 23:02:00", "2019-05-11 23:03:00", "2019-05-11 23:04:00",
                                "2019-05-12 23:05:00", "2019-05-12 23:06:00", "2019-05-12 23:07:00", "2019-05-12 23:08:00",
                                "2019-05-13 23:09:00", "2019-05-13 23:10:00", "2019-05-13 23:11:00", "2019-05-13 23:12:00",
                                "2019-05-14 23:13:00", "2019-05-14 23:14:00", "2019-05-14 23:15:00", "2019-05-14 23:16:00",
                                "2019-05-15 23:17:00", "2019-05-15 23:18:00", "2019-05-15 23:19:00", "2019-05-15 23:20:00")),
               myval = c(0, NA, 1500, 1500,
                         1500, 1500, NA, 0,
                         0, 0, 1100, 1100,
                         1100, 0, 200, 200,
                         1100, 1100, 1100, 0
               ))

# just replace values [0,1] with NA
df$myval[df$myval >= 0 & df$myval <= 1] <- NA


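# group consecutive identical values into runs, then divide each value
# by the number of rows in its own run
df <- df %>%
  group_by(grp = data.table::rleid(myval)) %>%
  mutate(counts = n(),
         result = myval / counts)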


#   mydate     myval   grp counts result
#   <date>     <dbl> <int>  <int>  <dbl>
# 1 2019-05-11    NA     1      2    NA 
# 2 2019-05-11    NA     1      2    NA 
# 3 2019-05-11  1500     2      4   375 
# 4 2019-05-11  1500     2      4   375 
# 5 2019-05-12  1500     2      4   375 
# 6 2019-05-12  1500     2      4   375 
# 7 2019-05-12    NA     3      4    NA 
# 8 2019-05-12    NA     3      4    NA 
# 9 2019-05-13    NA     3      4    NA 
#10 2019-05-13    NA     3      4    NA 
#11 2019-05-13  1100     4      3   367.
#12 2019-05-13  1100     4      3   367.
#13 2019-05-14  1100     4      3   367.
#14 2019-05-14    NA     5      1    NA 
#15 2019-05-14   200     6      2   100 
#16 2019-05-14   200     6      2   100 
#17 2019-05-15  1100     7      3   367.
#18 2019-05-15  1100     7      3   367.
#19 2019-05-15  1100     7      3   367.
#20 2019-05-15    NA     8      1    NA 

I want to keep the data frame above, including the date column, but with the correct result.

【Question discussion】:

    Tags: r dplyr


    【Solution 1】:

    Here is one approach:

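    library(dplyr)
    # Assumption: df is the tibble right after the NA replacement,
    # before the group_by() step (the output below has no counts column)
    # Create a group number for each run of identical values
    df1 <- df %>% mutate(grp = data.table::rleid(myval))
    
    df1 %>%
      # Keep only non-NA values
      filter(!is.na(myval)) %>%
      # Count the rows in each grp
      count(grp, name = 'count') %>%
      # Give each group the count of the previous non-NA group
      mutate(count = lag(count)) %>%
      # Join back to the original data (NA groups get count = NA)
      right_join(df1, by = 'grp') %>%
      # Divide by the previous group's count to get the final result
      mutate(result = myval/count) %>%
      arrange(grp)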
    

    which returns

    # A tibble: 20 x 5
    #     grp count mydate     myval result
    #   <int> <int> <date>     <dbl>  <dbl>
    # 1     1    NA 2019-05-11    NA   NA  
    # 2     1    NA 2019-05-11    NA   NA  
    # 3     2    NA 2019-05-11  1500   NA  
    # 4     2    NA 2019-05-11  1500   NA  
    # 5     2    NA 2019-05-12  1500   NA  
    # 6     2    NA 2019-05-12  1500   NA  
    # 7     3    NA 2019-05-12    NA   NA  
    # 8     3    NA 2019-05-12    NA   NA  
    # 9     3    NA 2019-05-13    NA   NA  
    #10     3    NA 2019-05-13    NA   NA  
    #11     4     4 2019-05-13  1100  275  
    #12     4     4 2019-05-13  1100  275  
    #13     4     4 2019-05-14  1100  275  
    #14     5    NA 2019-05-14    NA   NA  
    #15     6     3 2019-05-14   200   66.7
    #16     6     3 2019-05-14   200   66.7
    #17     7     2 2019-05-15  1100  550  
    #18     7     2 2019-05-15  1100  550  
    #19     7     2 2019-05-15  1100  550  
    #20     8    NA 2019-05-15    NA   NA  
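For comparison, the same idea can be sketched in base R without the join. This is only a rough sketch under stated assumptions: it starts from `myval` as it looks after the NA replacement, and the helper names (`same_as_prev`, `run_len`, `value_runs`, `prev_count`) are made up for illustration. Base `rle()` treats every NA as its own run, so the run ids are built by hand here to match what `data.table::rleid()` produces.

```r
# myval as in the question, after values in [0, 1] were set to NA
myval <- c(NA, NA, 1500, 1500, 1500, 1500, NA, NA, NA, NA,
           1100, 1100, 1100, NA, 200, 200, 1100, 1100, 1100, NA)

n <- length(myval)
is_na <- is.na(myval)

# A row continues the previous run if both rows are NA,
# or both are non-NA and hold the same value
same_as_prev <- c(FALSE,
                  (is_na[-1] & is_na[-n]) |
                    (!is_na[-1] & !is_na[-n] & myval[-1] == myval[-n]))

# Run id per row (1, 1, 2, 2, 2, 2, 3, ...)
grp <- cumsum(!same_as_prev)

# Number of rows in each run, and whether each run is an NA run
run_len <- tabulate(grp)
run_is_na <- is_na[!same_as_prev]

# For each non-NA run, look up the length of the previous non-NA run
value_runs <- which(!run_is_na)
prev_count <- rep(NA_integer_, length(run_len))
prev_count[value_runs] <- c(NA, head(run_len[value_runs], -1))

# Divide each row's value by the previous run's count
result <- myval / prev_count[grp]
out <- data.frame(myval, grp, count = prev_count[grp], result)
print(out)
```

The `result` column should line up with the table above: NA for the 1500 rows, 275 for the first 1100 run, about 66.7 for 200, and 550 for the last 1100 run. The date column can be added back with `cbind()` or by assigning `result` into the original data frame.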
    
    

    【Discussion】:
