【发布时间】:2021-12-14 02:30:42
【问题描述】:
我有两个基于ID 的数据集split(),产生两个列表:july 和july2。在其中一个列表中,ID D 只有两个组件,而另一个则有 3 个组件。目标是创建一个函数来查找这些违规行为,基于列表之一没有特定数字(在本例中为 3)这一事实,并从两个列表中删除 ID。有没有有效的方法来做到这一点?
例如,每个列表都有 4 个IDs (A,B,C,D)。对于每个 ID,我在列表中为 7 月份的特定 10 天间隔创建一个数据框(例如,[[1]] 是 ID A 的 7 月的前 10 天,[[2]] 是ID A 和[[3]] 是 7 月的第二个 10 天,ID 是 7 月的第三个 10 天,对于 B、C 和 D 再次开始。) july2 列表,第三个 10 天间隔被模拟为不存在,我想要一种方法将 ID 从 july 和 july2 列表中删除,因为它缺少最后一个间隔。
library(dplyr)
library(lubridate)
ID <- rep(c("A","B","C", "D"), 5000)
date <- rep_len(seq(dmy("01-01-2010"), dmy("31-01-2011"), by = "days"), 500)
x <- runif(length(date), min = 60000, max = 80000)
y <- runif(length(date), min = 800000, max = 900000)
ID2 <- rep(c("A", "B", "C", "D"), 5000)
date2 <- rep_len(seq(dmy("01-01-2010"), dmy("21-01-2011"), by = "days"), 500)
x2 <- runif(length(date2), min = 60000, max = 80000)
y2 <- runif(length(date2), min = 800000, max = 900000)
df <- data.frame(date = date,
x = x,
y =y,
ID)
df2 <- data.frame(date = date2,
x = x2,
y =y2,
ID2)
df$jDate <- julian(as.Date(df$date), origin = as.Date("1970-01-01"))
df$Month <- month(df$date)
df2$jDate <- julian(as.Date(df2$date), origin = as.Date("1970-01-01"))
df2$Month <- month(df2$date)
july <- df %>%
# Creates a new column assigning the first day in the 10-day interval in which
# the date falls under (e.g., 01-03-2021 would be in the first 10-day interval
# so the `floor_date` assigned to it would be 01-01-2021)
mutate(new = floor_date(date, "10 days")) %>%
# For any months that has 31 days, the 31st day would normally be assigned its
# own interval. The code below takes the 31st day and joins it with the
# previous interval.
group_by(ID) %>%
mutate(new = if_else(day(new) == 31, new - days(10), new)) %>%
group_by(new, .add = TRUE) %>%
# Filter the data by the season based on the `season_categ` column
filter(Month == "7") %>%
group_split()
july2 <- df2 %>%
# Creates a new column assigning the first day in the 10-day interval in which
# the date falls under (e.g., 01-03-2021 would be in the first 10-day interval
# so the `floor_date` assigned to it would be 01-01-2021)
mutate(new = floor_date(date, "10 days")) %>%
# For any months that has 31 days, the 31st day would normally be assigned its
# own interval. The code below takes the 31st day and joins it with the
# previous interval.
group_by(ID2) %>%
mutate(new = if_else(day(new) == 31, new - days(10), new)) %>%
group_by(new, .add = TRUE) %>%
# Filter the data by the season based on the `season_categ` column
filter(Month == "7") %>%
group_split()
july2 <- july2[-12]
names(july) <- sapply(july, function(x) paste(x$ID[1]))
names(july2) <- sapply(july2, function(x) paste(x$ID2[1]))
【问题讨论】:
-
这个问题与另一个问题有点不同,因为两个列表都包含相同的
IDs,我想根据特定标准浏览一个列表并删除@ 987654346@ 来自两个列表。 -
区别在于
july2对于IDD只有两个列表元素,而july对于ID有三个列表元素。