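【Posted】: 2018-03-30 00:08:02
【Question】:
I am using the depmixS4 package to fit an HMM to my data. However, my dataset is not continuous (I have Saturday and Sunday data back to back, but the number of observations varies from week to week). For example, my dataset might look like this: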
1 Saturday Evening 16.2 235.84
2 Saturday Evening 23.4 235.29
3 Saturday Evening 29.4 232.79
4 Sunday Evening 24.2 233.89
5 Sunday Evening 24.2 233.66
6 Sunday Evening 24.2 233.38
7 Sunday Evening 24.2 232.99
8 Sunday Evening 25.4 233.21
9 Sunday Evening 26.8 232.37
10 Saturday Night 25.6 231.55
11 Saturday Night 24.4 231.19
12 Saturday Night 24.4 231.63
13 Saturday Night 24.4 231.71
14 Sunday Night 25.2 231.23
15 Sunday Night 25.2 231.23
16 Saturday Night 25.2 231.23
17 Saturday Night 25.2 231.23
18 Sunday Night 25.2 231.23
df = structure(list(V2 = c("Saturday", "Saturday", "Saturday", "Sunday",
"Sunday", "Sunday", "Sunday", "Sunday", "Sunday", "Saturday",
"Saturday", "Saturday", "Saturday", "Sunday", "Sunday", "Saturday",
"Saturday", "Sunday"), V3 = c("Evening", "Evening", "Evening",
"Evening", "Evening", "Evening", "Evening", "Evening", "Evening",
"Night", "Night", "Night", "Night", "Night", "Night", "Night",
"Night", "Night"), V4 = c(16.2, 23.4, 29.4, 24.2, 24.2, 24.2,
24.2, 25.4, 26.8, 25.6, 24.4, 24.4, 24.4, 25.2, 25.2, 25.2, 25.2,
25.2), V5 = c(235.84, 235.29, 232.79, 233.89, 233.66, 233.38,
232.99, 233.21, 232.37, 231.55, 231.19, 231.63, 231.71, 231.23,
231.23, 231.23, 231.23, 231.23)), .Names = c("V2", "V3", "V4",
"V5"), row.names = c(NA, -18L), class = "data.frame")
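In this example, set 1 has 9 observations, set 2 has 6, and set 3 has 3. I already have a list containing the number of observations in each set, in order: [9,6,3]. I want to use this list to subset the data, pass each subset to the depmix function, fit the model, and use a for loop to store the log-likelihood of each fitted model in a list.
For example: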
set.seed(1)
mod[i] <- depmix(list(V4~1, V5~1), data = dataset[i], nstates=10, family=list(gaussian(), gaussian()))
fm[i] <- fit(mod[i])
append(resultList, fm[i])
#Where [i] is the iteration of the loop, and dataset[i] corresponds to the i'th subset of length N corresponding to the i'th element in the list (in the example, the list is [9,6,3])
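I realize this is asking two questions: one about subsetting a data frame using a list, and another about running a function and inserting the results into a list.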
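A minimal base R sketch of those two pieces, for illustration only. It assumes the counts are held in a numeric vector (called counts here) and that the rows of df are already in set order. summarise_set is a hypothetical stand-in for the model-fitting step, not part of depmixS4.

```r
# Numeric columns from the example data above (V2/V3 omitted for brevity)
df <- data.frame(
  V4 = c(16.2, 23.4, 29.4, 24.2, 24.2, 24.2, 24.2, 25.4, 26.8,
         25.6, 24.4, 24.4, 24.4, 25.2, 25.2, 25.2, 25.2, 25.2),
  V5 = c(235.84, 235.29, 232.79, 233.89, 233.66, 233.38, 232.99, 233.21,
         232.37, 231.55, 231.19, 231.63, 231.71, 231.23, 231.23, 231.23,
         231.23, 231.23)
)
counts <- c(9, 6, 3)  # observations per set, in row order

# Last and first row index of each set, derived from the counts
ends   <- cumsum(counts)
starts <- ends - counts + 1

# Hypothetical stand-in for "fit a model and return one number"
summarise_set <- function(d) mean(d$V4)

# Preallocate the list and assign by position. Calling append() without
# assigning its return value leaves resultList unchanged.
resultList <- vector("list", length(counts))
for (i in seq_along(counts)) {
  subset_i <- df[starts[i]:ends[i], ]
  resultList[[i]] <- summarise_set(subset_i)
}
```

Assigning with resultList[[i]] <- ... keeps each result at the position of its set, so the output order matches the order of counts.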
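【Comments】:
- For your first question, do you mean something like split(df, rep(1:3, times=c(9,6,3)))?
- @r2evans Yes! That works. How can I use these splits in a for loop to run the second block of code i times?
- lapply or sapply(..., simplify=FALSE) is my go-to method, but other approaches exist depending on what else you need. You may find this helpful: stackoverflow.com/a/24376207/3358272
- @r2evans ... why does everyone forget by?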
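The split() and lapply suggestions from the comments could be combined roughly as follows. This is a sketch under assumptions, not a tested solution from the thread: fit_loglik is a hypothetical helper, nstates = 2 is chosen here because the 10 states in the question are far more than sets of 9, 6, and 3 observations can support, and the tryCatch returns NA for any set where the fit fails (set 3 has constant values, so a failure there would not be surprising).

```r
library(depmixS4)

# Numeric columns from the question's example data
df <- data.frame(
  V4 = c(16.2, 23.4, 29.4, 24.2, 24.2, 24.2, 24.2, 25.4, 26.8,
         25.6, 24.4, 24.4, 24.4, 25.2, 25.2, 25.2, 25.2, 25.2),
  V5 = c(235.84, 235.29, 232.79, 233.89, 233.66, 233.38, 232.99, 233.21,
         232.37, 231.55, 231.19, 231.63, 231.71, 231.23, 231.23, 231.23,
         231.23, 231.23)
)
counts <- c(9, 6, 3)  # observations per set, in row order

# One data frame per set, as suggested in the first comment
sets <- split(df, rep(seq_along(counts), times = counts))

# Hypothetical helper: fit a Gaussian HMM to one set and return its
# log-likelihood, or NA if the fit throws an error
fit_loglik <- function(d, nstates = 2) {
  tryCatch({
    mod <- depmix(list(V4 ~ 1, V5 ~ 1), data = d, nstates = nstates,
                  family = list(gaussian(), gaussian()))
    fm <- fit(mod, verbose = FALSE)
    as.numeric(logLik(fm))
  }, error = function(e) NA_real_)
}

set.seed(1)
logliks <- lapply(sets, fit_loglik)  # named list: "1", "2", "3"
```

The by approach mentioned in the last comment would look similar: by(df, rep(seq_along(counts), times = counts), fit_loglik) applies the same helper to each group without an explicit split().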