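[Posted]: 2018-12-07 13:27:07
[Question]:
I have a table of cases and a table of controls. I want to create a set of matched controls using exact matching on age and sex. I also want to require that each control has at least one year of data before the case's date of death (dod).
The data look like this: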
# Sample sizes: 100,000 controls and 1,000 cases
nControls <- 1e5
nCases <- 1e3
# Observation window
start_date <- as.Date('2011-04-01')
end_date <- as.Date('2016-04-01')
# Five-year age bands: '0-4', '5-9', ..., '75-79'
ages <- paste0(seq(0, 75, 5), '-', seq(4, 79, 5))
nAges <- length(ages)
# Controls: each has a follow-up period running from start to end
controls <- data.frame(
  id = seq_len(nControls),
  start = sample(seq(start_date, end_date, by = 'year'), size = nControls, replace = TRUE),
  dur = sample(1:5, nControls, replace = TRUE) * 365.25,
  age = sample(ages, nControls, replace = TRUE, prob = 1:nAges / sum(1:nAges)),
  sex = sample(c('m', 'f'), nControls, replace = TRUE, prob = c(0.7, 0.3)))
controls$end <- controls$start + controls$dur
# Cases: each has a date of death (dod) inside the observation window
cases <- data.frame(
  id = seq_len(nCases),
  dod = sample(seq(start_date, end_date, by = 'day'), size = nCases, replace = TRUE),
  age = sample(ages, nCases, replace = TRUE),
  sex = sample(c('m', 'f'), nCases, replace = TRUE))
Matching on age and sex alone is easy, either by hand or with the MatchIt package:
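library(MatchIt)
# Flag controls (0) and cases (1), then stack the matching variables
controls$treat <- 0
cases$treat <- 1
mt <- rbind(controls[, c('treat', 'age', 'sex')], cases[, c('treat', 'age', 'sex')])
# Two controls per case, matched exactly on age and sex
m.out <- matchit(treat ~ age + sex, data = mt, exact = c('age', 'sex'), method = 'nearest', ratio = 2)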
But I can't work out how to include the condition that cases$dod must fall before controls$end and at least one year after controls$start.
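To make the date condition concrete, here is a minimal base R sketch of it written as a filter over case-control pairs that already agree on age and sex. It does not use MatchIt, and it is only one possible reading of the requirement: the toy tables and their values are invented for illustration, the names `pairs`, `eligible`, `ratio` and `matched` are hypothetical, and one year is assumed to mean 365 days.

```r
# Toy tables with the same columns as above (values are made up).
cases <- data.frame(
  id  = 1:2,
  dod = as.Date(c('2014-06-01', '2012-06-01')),
  age = c('40-44', '40-44'),
  sex = c('m', 'f'))
controls <- data.frame(
  id    = 1:4,
  start = as.Date(c('2012-04-01', '2014-04-01', '2011-04-01', '2011-04-01')),
  end   = as.Date(c('2015-04-01', '2016-04-01', '2013-04-01', '2011-12-31')),
  age   = c('40-44', '40-44', '40-44', '40-44'),
  sex   = c('m', 'm', 'f', 'f'))

# Every case-control pair that agrees exactly on age and sex.
# The shared 'id' column becomes id.case and id.control.
pairs <- merge(cases, controls, by = c('age', 'sex'),
               suffixes = c('.case', '.control'))

# Keep pairs where dod is at least one year (assumed 365 days) after the
# control's start and no later than the control's end.
eligible <- pairs[pairs$dod >= pairs$start + 365 & pairs$dod <= pairs$end, ]

# Draw up to `ratio` eligible controls per case. This sketch allows the
# same control to be reused for different cases.
ratio <- 2
matched <- do.call(rbind, lapply(split(eligible, eligible$id.case), function(d) {
  d[sample(nrow(d), min(ratio, nrow(d))), , drop = FALSE]
}))
```

On the full simulated data the merge produces one row per case for every control in the same age and sex stratum, so it may be worth processing one stratum at a time to keep memory use down. Matching without reusing controls would need an extra step that removes each chosen control from the pool.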
[Discussion]:
Tags: r bioinformatics