【Question title】: Create N duplicates of row based on variable + one extra column
【Posted】: 2021-08-25 13:06:38
【Question description】:

I have the following data frame:

df <- structure(list(`Phial serial` = c("NC09082157761", "NC10082157882B", 
"NC10082157882C", "NC10082157882D", "NC10082157882A", "NC11082157883B", 
"NC11082157883A", "NC11082157883C", "NC11082157883D", "NC11082157883E", 
"NC11082157883G", "NC11082157883F", "NC13082157855A", "NC16082157886A", 
"NC17082157947B", "NC17082157947C", "NC17082157947A", "NC18082157948B", 
"NC18082157948C", "NC18082157948D", "NC18082157948A", "NC18082157948E", 
"NC18082157948F", "NC18082157948G", "NC18082157948H", "NC19082157949A", 
"NC20082157950A", "NC20082157950B", "NC20082157950C"), `Creation date` = structure(c(1628467200, 
1628553600, 1628553600, 1628553600, 1628553600, 1628640000, 1628640000, 
1628640000, 1628640000, 1628640000, 1628640000, 1628640000, 1628812800, 
1629072000, 1629158400, 1629158400, 1629158400, 1629244800, 1629244800, 
1629244800, 1629244800, 1629244800, 1629244800, 1629244800, 1629244800, 
1629331200, 1629417600, 1629417600, 1629417600), class = c("POSIXct", 
"POSIXt"), tzone = "UTC"), `Creation time` = c(730, 845, 845, 
845, 845, 730, 730, 730, 730, 730, 730, 730, 845, 730, 845, 845, 
845, 845, 845, 845, 845, 845, 845, 845, 845, 715, 730, 730, 730
), Isotope = c("TL201", "TL201", "TL201", "TL201", "TL201", "TL201", 
"TL201", "TL201", "TL201", "TL201", "TL201", "TL201", "TL201", 
"TL201", "TL201", "TL201", "TL201", "TL201", "TL201", "TL201", 
"TL201", "TL201", "TL201", "TL201", "TL201", "TL201", "TL201", 
"TL201", "TL201"), Chemical = c("CL", "CL", "CL", "CL", "CL", 
"CL", "CL", "CL", "CL", "CL", "CL", "CL", "CL", "CL", "CL", "CL", 
"CL", "CL", "CL", "CL", "CL", "CL", "CL", "CL", "CL", "CL", "CL", 
"CL", "CL"), `Init activity` = c(147, 145, 144, 146, 144, 147, 
147, 147, 147, 141, 141, 141, 229, 147, 144, 145, 143, 144, 144, 
144, 144, 144, 144, 144, 144, 231, 231, 231, 231), `Init volume` = c(2, 
2.3, 2.3, 2.3, 2.3, 3, 3, 3, 3, 3, 3, 3, 2.3, 2.3, 2.3, 2.3, 
2.3, 2.3, 2.3, 2.3, 2.3, 2.3, 2.3, 2.3, 2.3, 3, 3, 3, 3), `Dispose date` = structure(c(1629072000, 
1629072000, 1629072000, 1629072000, 1629072000, NA, 1629244800, 
1629244800, 1629244800, 1629244800, 1629244800, 1629244800, 1629244800, 
1629244800, 1629244800, 1629244800, 1629417600, NA, NA, NA, 1629417600, 
1629417600, 1629417600, 1629417600, 1629244800, NA, NA, NA, NA
), class = c("POSIXct", "POSIXt"), tzone = "UTC"), `Dispose time` = c(1624, 
1622, 1622, 1623, 1622, NA, 1535, 1535, 1536, 1536, 1536, 1536, 
1534, 1204, 1533, 1533, 1440, NA, NA, NA, 1441, 1442, 1443, 1443, 
1532, NA, NA, NA, NA)), row.names = c(NA, -29L), class = c("tbl_df", 
"tbl", "data.frame"))

This records when radioactive material is present in a building. Each `Phial serial` is delivered on df$`Creation date` and removed on df$`Dispose date`. I create a new field, `Days in Stock`:

# Deal with phials that have not been disposed of yet (i.e. NAs); end_date is not defined in the post

df$`Dispose date`[is.na(df$`Dispose date`)] <- end_date

# Same for the times

df$`Dispose time`[is.na(df$`Dispose time`)] <- 1700
df$`Days in Stock` <- as.Date(df$`Dispose date`) - as.Date(df$`Creation date`) + 1

Now I want to create "quasi-duplicates" of each row in df based on the `Days in Stock` field, which is easy:

df[rep(row.names(df), df$`Days in Stock`), 1:9]

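However, I want to create an extra column, `Date`, in the duplicated data.frame. Across the duplicated rows for each phial, `Date` should increment from `Creation date` to `Dispose date`. I am not sure how to do this with the replication step above.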


For example, row 1 of df

Phial serial  Creation date  Creation time  Isotope  Chemical  Init activity  Init volume  Dispose date  Dispose time
NC09082157761  2021-08-09  730  TL201  CL  147  2.0  2021-08-16  1624

should be converted into

Date  Phial serial  Creation date  Creation time  Isotope  Chemical  Init activity  Init volume  Dispose date  Dispose time
2021-08-09  NC09082157761  2021-08-09  730  TL201  CL  147  2.0  2021-08-16  1624
2021-08-10  NC09082157761  2021-08-09  730  TL201  CL  147  2.0  2021-08-16  1624
2021-08-11  NC09082157761  2021-08-09  730  TL201  CL  147  2.0  2021-08-16  1624
2021-08-12  NC09082157761  2021-08-09  730  TL201  CL  147  2.0  2021-08-16  1624
2021-08-13  NC09082157761  2021-08-09  730  TL201  CL  147  2.0  2021-08-16  1624
2021-08-14  NC09082157761  2021-08-09  730  TL201  CL  147  2.0  2021-08-16  1624
2021-08-15  NC09082157761  2021-08-09  730  TL201  CL  147  2.0  2021-08-16  1624
2021-08-16  NC09082157761  2021-08-09  730  TL201  CL  147  2.0  2021-08-16  1624

【Comments】:

  • @RonakShah I have edited the post.

Tags: r dataframe


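【Solution 1】:

You can try this tidyverse answer.

I have replaced the NA values in `Dispose date` with `Creation date` and created, for each row, a sequence of dates from `Creation date` to `Dispose date`. The sequences are stored in a list column, which can be expanded into separate rows using `unnest`.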

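library(tidyverse)

df %>%
  # Fill missing dispose dates first so the steps below never see NA
  mutate(`Dispose date` = coalesce(`Dispose date`, `Creation date`), 
         days_in_stock = as.Date(`Dispose date`) - as.Date(`Creation date`) + 1, 
         # One sequence of daily dates per row, stored as a list column
         Date = map2(as.Date(`Creation date`), as.Date(`Dispose date`), seq, by = '1 day')) %>%
  # Expand the list column into one row per date
  unnest(Date)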
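
For readers who would rather stay with the replication step from the question, here is a minimal base R sketch of the same idea. It is not part of the original answer. The data frame `toy` and its columns `serial`, `creation` and `dispose` are made-up stand-ins for the asker's `df`, and it assumes every dispose date is already filled in and is not earlier than the creation date.

```r
# Toy data standing in for the asker's df (values are illustrative only)
toy <- data.frame(
  serial   = c("A", "B"),
  creation = as.Date(c("2021-08-09", "2021-08-10")),
  dispose  = as.Date(c("2021-08-11", "2021-08-10"))
)

# Inclusive number of days each phial is in stock
n_days <- as.integer(toy$dispose - toy$creation) + 1L

# Repeat each row n_days times, as in the question's replication step
expanded <- toy[rep(seq_len(nrow(toy)), n_days), ]

# sequence(n_days) counts 1..n within each group of repeated rows;
# subtracting 1 turns that into a day offset from the creation date
expanded$Date <- expanded$creation + (sequence(n_days) - 1L)
rownames(expanded) <- NULL

print(expanded)
```

Because `rep()` and `sequence()` are driven by the same `n_days` vector, the offsets line up with the repeated rows without any grouping step.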

【Comments】:

  • Brilliant. Thank you very much.