【问题标题】:How to calculate if a day is between two dates?如何计算一天是否在两个日期之间?
【发布时间】:2021-11-26 09:23:39
【问题描述】:

我有每个人的进入和退出日期的数据。我的数据如下所示:

  ID      entry       exit
1  1 01.01.2010 31/01/2016
2  2 01/02/2016 01/01/2021
3  3 01/10/2010 30/09/2019
4  4 01.09.2013 31/12/2015
5  5 01/01/2016 30/09/2020

我想创建额外的列,我想在其中确定个人是否在给定日期(进入和退出日期之间)处于活动状态。

数据应如下所示:

 ID      entry    exit           active_31_12_2011    active_31_12_2016
1  1 01/01/2010 31/01/2016               yes                no
2  2 01/02/2016 01/01/2021                no               yes
3  3 01/10/2010 30/09/2019               yes               yes
4  4 01/09/2013 31/12/2015                no                no
5  5 01/01/2016 30/09/2020                no               yes

有人知道我该怎么做吗?

【问题讨论】:

    标签: r dataframe date


    【解决方案1】:

    使用data.table::between

    df %>%
      mutate(entry = as.Date(entry, format = "%d/%m/%Y"),
             exit = as.Date(exit, format = "%d/%m/%Y")) %>%
      rowwise %>%
      mutate(active_31_12_2011 = between(as.Date("2011/12/31"),lower = entry, upper = exit),
             active_31_12_2016 = between(as.Date("2016/12/31"),lower = entry, upper = exit))
    
         ID entry      exit       active_31_12_2011 active_31_12_2016
      <int> <date>     <date>     <lgl>             <lgl>            
    1     1 2010-01-01 2016-01-31 TRUE              FALSE            
    2     2 2016-02-01 2021-01-01 FALSE             TRUE             
    3     3 2010-10-01 2019-09-30 TRUE              TRUE             
    4     4 2013-09-01 2015-12-31 FALSE             FALSE            
    5     5 2016-01-01 2020-09-30 FALSE             TRUE  
    

    要获得是/否输出,

    df %>%
      mutate(entry = as.Date(entry, format = "%d/%m/%Y"),
             exit = as.Date(exit, format = "%d/%m/%Y")) %>%
      rowwise %>%
      mutate(active_31_12_2011 = ifelse(between(as.Date("2011/12/31"),lower = entry, upper = exit), "Yes", "No"),
             active_31_12_2016 = ifelse(between(as.Date("2016/12/31"),lower = entry, upper = exit), "Yes", "No"))
    
         ID entry      exit       active_31_12_2011 active_31_12_2016
      <int> <date>     <date>     <chr>             <chr>            
    1     1 2010-01-01 2016-01-31 Yes               No               
    2     2 2016-02-01 2021-01-01 No                Yes              
    3     3 2010-10-01 2019-09-30 Yes               Yes              
    4     4 2013-09-01 2015-12-31 No                No               
    5     5 2016-01-01 2020-09-30 No                Yes   
    

    【讨论】:

    • 谢谢,成功了!
    【解决方案2】:

    这是使用lubridate的方法:

    library(tidyverse)
    library(lubridate)
    #> 
    #> Attaching package: 'lubridate'
    #> The following objects are masked from 'package:base':
    #> 
    #>     date, intersect, setdiff, union
    
    data <- tribble(
      ~ID, ~entry, ~exit,
      1, "01/01/2010", "31/01/2016",
      2, "01/02/2016", "01/01/2021",
      3, "01/10/2010", "30/09/2019",
      4, "01/09/2013", "31/12/2015",
      5, "01/01/2016", "30/09/2020"
    )
    
    data %>%
      mutate_at(c("entry", "exit"), ~ .x %>% parse_date(format = "%d/%m/%Y")) %>%
      mutate(
        interval = interval(entry, exit),
        active_31_12_2011 = ymd("2011-12-31") %within% interval,
        active_31_12_2016 = ymd("2016-12-31") %within% interval
      )
    #> # A tibble: 5 x 6
    #>      ID entry      exit       interval                       active_31_12_2011
    #>   <dbl> <date>     <date>     <Interval>                     <lgl>            
    #> 1     1 2010-01-01 2016-01-31 2010-01-01 UTC--2016-01-31 UTC TRUE             
    #> 2     2 2016-02-01 2021-01-01 2016-02-01 UTC--2021-01-01 UTC FALSE            
    #> 3     3 2010-10-01 2019-09-30 2010-10-01 UTC--2019-09-30 UTC TRUE             
    #> 4     4 2013-09-01 2015-12-31 2013-09-01 UTC--2015-12-31 UTC FALSE            
    #> 5     5 2016-01-01 2020-09-30 2016-01-01 UTC--2020-09-30 UTC FALSE            
    #> # … with 1 more variable: active_31_12_2016 <lgl>
    

    reprex package (v2.0.1) 于 2021-10-06 创建

    【讨论】:

      【解决方案3】:

      当日期为 dates 或格式为 2010-01-01 时,您可以使用 ==&gt;&lt;&gt;=&lt;=

      x$active_31_12_2011 <- c("no", "yes")[1 + (x$entry <= "2011-12-31" & "2011-12-31" <= x$exit)]
      x$active_31_12_2016 <- c("no", "yes")[1 + (x$entry <= "2016-12-31" & "2016-12-31" <= x$exit)]
      
      x
      #  ID      entry       exit active_31_12_2011 active_31_12_2016
      #1  1 2010-01-01 2016-01-31               yes                no
      #2  2 2016-02-01 2021-01-01                no               yes
      #3  3 2010-10-01 2019-09-30               yes               yes
      #4  4 2013-09-01 2015-12-31                no                no
      #5  5 2016-01-01 2020-09-30                no               yes
      

      数据:

      x <- read.table(header=TRUE, text="ID      entry       exit
      1  1 01/01/2010 31/01/2016
      2  2 01/02/2016 01/01/2021
      3  3 01/10/2010 30/09/2019
      4  4 01/09/2013 31/12/2015
      5  5 01/01/2016 30/09/2020")
      x$entry <- as.Date(x$entry, "%d/%m/%Y")
      x$exit <- as.Date(x$exit, "%d/%m/%Y")
      

      【讨论】:

        【解决方案4】:

        您可以自愿将日期强制转换为整数并使用data.table::between

        dat_n <- as.data.frame(lapply(dat, as.integer))
        
        library(data.table)
        act <- t(apply(dat_n[-1], 1, \(z) between(as.numeric(x), z[1], z[2]))) |>
          `colnames<-`(paste0('active_', x))
        res <- cbind(dat, act)
        res
        #   ID      entry       exit active_2011-12-31 active_2016-12-31
        # 1  1 2010-01-01 2016-01-31              TRUE             FALSE
        # 2  2 2016-02-01 2021-01-01             FALSE              TRUE
        # 3  3 2010-10-01 2019-09-30              TRUE              TRUE
        # 4  4 2013-09-01 2015-12-31             FALSE             FALSE
        # 5  5 2016-01-01 2020-09-30             FALSE              TRUE
        

        数据

        dat <- structure(list(ID = 1:5, entry = structure(c(14610, 16832, 14883, 
        15949, 16801), class = "Date"), exit = structure(c(16831, 18628, 
        18169, 16800, 18535), class = "Date")), row.names = c("1", "2", 
        "3", "4", "5"), class = "data.frame")
        
        x <- c(as.Date('2011-12-31'), as.Date('2016-12-31')) 
        

        【讨论】:

          猜你喜欢
          • 2022-08-15
          • 1970-01-01
          • 2012-02-26
          • 2010-10-07
          • 2014-07-08
          • 1970-01-01
          • 2015-02-26
          相关资源
          最近更新 更多