[Posted]: 2017-06-04 08:01:34
[Question]:
Is there a vectorized alternative to the for loop below? The real data is a large dataset of admissions to a healthcare facility.
Edited:
library(timeDate)  # provides timeFirstDayInMonth() and timeLastDayInMonth()
dateSeq <- as.Date(c("2015-01-01", "2015-02-01"))
admissionDate <- as.Date(c("2015-01-03", "2015-01-06", "2015-01-10", "2015-01-05", "2015-01-07", "2015-02-03", "2015-02-06"))
Dfactor <- c("elective", "acute", "elective", "acute", "acute", "elective", "acute")
Dfactor <- factor(Dfactor)
df <- data.frame(admissionDate, Dfactor)
# loop through large dataset collecting tabulated data from a factorised vector for each month (admissions date) based on 'dateSeq'
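Dfactorsums <- numeric(length(dateSeq))
for (i in seq_along(dateSeq)) {
  # Keep only the admissions that fall inside the month of dateSeq[i]
  monthStart <- as.Date(timeFirstDayInMonth(dateSeq[i]))
  monthEnd <- as.Date(timeLastDayInMonth(dateSeq[i]))
  monthSub <- df[df$admissionDate >= monthStart & df$admissionDate <= monthEnd, ]
  x <- table(monthSub$Dfactor)
  # x[1] is the count for the first factor level, "acute"
  Dfactorsums[i] <- as.numeric(x[1])
}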
print(Dfactorsums)
# Outcome = [1] 3 1
# Question: is there a 'vectorized' solution that avoids the for loop?
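[Comments]:
- Please show a small reproducible example and the expected output for that example. What is df?
- It looks like you want to count, for each month, how often the second value of Dfactor occurs. Is that right?
- Right, exactly. I will provide a more thorough example later.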
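One possible way to drop the loop, offered as a minimal base-R sketch and not as the definitive answer: label each admission with its year-month and cross-tabulate once. The names monthKey and counts are introduced here for illustration only. The sketch assumes each entry of dateSeq stands for one calendar month and that the level of interest is "acute", which is what x[1] selects in the loop.

```r
# Same sample data as in the question.
dateSeq <- as.Date(c("2015-01-01", "2015-02-01"))
admissionDate <- as.Date(c("2015-01-03", "2015-01-06", "2015-01-10", "2015-01-05",
                           "2015-01-07", "2015-02-03", "2015-02-06"))
Dfactor <- factor(c("elective", "acute", "elective", "acute", "acute", "elective", "acute"))
df <- data.frame(admissionDate, Dfactor)

# Label every admission with its year-month. The months in dateSeq are used as
# the factor levels, so a month without admissions still gets a row of zeros
# and admissions outside dateSeq become NA and are dropped by table().
monthKey <- factor(format(df$admissionDate, "%Y-%m"),
                   levels = format(dateSeq, "%Y-%m"))

# One cross-tabulation replaces the loop: rows are months, columns are levels.
counts <- table(monthKey, df$Dfactor)

# The "acute" column corresponds to x[1] in the loop ("acute" is the first level).
Dfactorsums <- as.numeric(counts[, "acute"])
print(Dfactorsums)
# [1] 3 1
```

Because counts holds every level for every month, the "elective" totals are available from the same table without a second pass over the data.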
Tags: r