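【Posted】: 2019-01-25 00:11:48
【Question】:
First, apologies for the poor, illogical, clumsy code that follows. I have very little experience with for loops and functions.
Essentially, I want to apply a function to a data frame. For each row i, the function produces a value that depends on the values in two columns of the data frame. I then want to store that value in a new column, aligned with the row that produced it.
The aim is to create predicted abundances of an animal species using model values that have already been generated.
I have written a rather poor function that follows the known values from the fitted model.
Here is a sample of the data: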
structure(list(X = 2:6, x = c(23.69772329, 23.33799932, 24.50995071,
22.37691419, 31.29742091), y = c(-18.75309389, -18.28537894,
-19.39926585, -19.23678464, -5.251863724), EVAP_Value = c(502L,
541L, 750L, 476L, 571L), HFI_Value = c(1, 1, 3.059409052, 2.250018061,
7), TERMAC_Value = c(605L, 605L, 118L, 605L, 236L), TERMAC_ShortName =
structure(c(4L,
4L, 1L, 4L, 2L), .Label = c("DAWS2", "EASM", "Marsh", "PV"), class =
"factor"),
GLOBCOV_Value = c(30L, 30L, 30L, 140L, 130L), Glob_ShortName =
structure(c(5L,
5L, 5L, 1L, 4L), .Label = c("Grass", "OpBdFrst", "OpNdFrst",
"Shrub", "VegCrop"), class = "factor"), Unknown_Value = c(527L,
546L, 488L, 430L, 1020L), Location = structure(c(1L, 1L,
1L, 1L, 2L), .Label = c("BWA", "TZA"), class = "factor"),
NDVI_mean = c(0.26736562, 0.28850313, 0.328852412, 0.271927773,
0.364711006), Random_Category = structure(c(2L, 2L, 2L, 2L,
1L), .Label = c("Random_Maasai", "Random_Southern"), class = "factor"),
num = c(1L, 1L, 1L, 1L, 1L), ID = structure(c(1L, 1L, 1L,
1L, 1L), .Label = "Random", class = "factor")), row.names = 2:6, class =
"data.frame")
For reference, it looks like this:
X x y EVAP_Value HFI_Value TERMAC_Value
1 1 37.97434 -8.833364 1390 6.000000 601
2 2 23.69772 -18.753094 502 1.000000 605
3 3 23.33800 -18.285379 541 1.000000 605
4 4 24.50995 -19.399266 750 3.059409 118
5 5 22.37691 -19.236785 476 2.250018 605
6 6 31.29742 -5.251864 571 7.000000 236
TERMAC_ShortName GLOBCOV_Value Glob_ShortName Unknown_Value
1 <NA> 90 OpNdFrst 1038
2 PV 30 VegCrop 527
3 PV 30 VegCrop 546
4 DAWS2 30 VegCrop 488
5 PV 140 Grass 430
6 EASM 130 Shrub 1020
Location NDVI_mean Random_Category num ID
1 TZA 0.5356669 Random_Maasai 1 Random
2 BWA 0.2673656 Random_Southern 1 Random
3 BWA 0.2885031 Random_Southern 1 Random
4 BWA 0.3288524 Random_Southern 1 Random
5 BWA 0.2719278 Random_Southern 1 Random
6 TZA 0.3647110 Random_Maasai 1 Random
The two columns of interest are TERMAC_ShortName and Glob_ShortName. My attempt so far is:
predict.bayes.animal <- function(data) {
  if (data$TERMAC_ShortName[i] == "PV") {
    bayes_value[i] <- i - 0.772
  }
  if (data$TERMAC_ShortName[i] == "DAWS2") {
    bayes_value[i] <- i - 1.24
  }
  if (data$TERMAC_ShortName[i] == "EASM") {
    bayes_value[i] <- i - 0.362
  }
  if (data$Glob_ShortName[i] == "VegCrop") {
    bayes_value[i] <- i - 0.3497
  }
  if (data$Glob_ShortName[i] == "Grass") {
    bayes_value[i] <- i - 0.5978
  }
  if (data$Glob_ShortName[i] == "Shrub") {
    bayes_value[i] <- i - 0.2285
  }
  if (data$TERMAC_ShortName[i] == "PV" | data$Glob_ShortName[i] == "VegCrop") {
    bayes_value[i] <- i - 0.56
  }
  if (data$TERMAC_ShortName[i] == "DAWS2" | data$Glob_ShortName[i] == "VegCrop") {
    bayes_value[i] <- i + 0.43
  }
  if (data$TERMAC_ShortName[i] == "PV" | data$Glob_ShortName[i] == "Grass") {
    bayes_value[i] <- i - 0.49
  }
  if (data$TERMAC_ShortName[i] == "EASM" | data$Glob_ShortName[i] == "Shrub") {
    bayes_value[i] <- i - 0.045
  }
  bayes_value
}

data["bayes_value"] <- NA
for (i in 1:nrow(data)) {
  n <- predict.bayes.animal(data)
  data$bayes_value[i] <- n
}
The expected result is:
X x y EVAP_Value HFI_Value TERMAC_Value
1 1 23.69772 -18.753094 502 1.000000 605
2 2 23.33800 -18.285379 541 1.000000 605
3 3 24.50995 -19.399266 750 3.059409 118
4 4 22.37691 -19.236785 476 2.250018 605
5 5 31.29742 -5.251864 571 7.000000 236
TERMAC_ShortName GLOBCOV_Value Glob_ShortName Unknown_Value
1 PV 30 VegCrop 527
2 PV 30 VegCrop 546
3 DAWS2 30 VegCrop 488
4 PV 140 Grass 430
5 EASM 130 Shrub 1020
Location NDVI_mean Random_Category num ID bayes_value
1 BWA 0.2673656 Random_Southern 1 Random -1.68
2 BWA 0.2885031 Random_Southern 1 Random -1.68
3 BWA 0.3288524 Random_Southern 1 Random -1.20
4 BWA 0.2719278 Random_Southern 1 Random -1.86
5 TZA 0.3647110 Random_Maasai 1 Random -0.64
The actual result so far is "Error in predict.bayes.animal(data) : object 'bayes_value' not found".
Thanks in advance for your help.
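A likely reading of the error, offered as a sketch rather than a definitive diagnosis: inside the function, `bayes_value[i] <- ...` tries to modify an object named `bayes_value`, but no such object exists (the column `data$bayes_value` is part of `data`, not a standalone variable). The loop index `i` is found only because R looks it up in the global environment. The names `broken`, `fixed` and `out` below are hypothetical and exist only to reproduce the pattern in isolation.

```r
# Broken pattern: `out` is never created, so `out[1] <- 10` has nothing to modify.
broken <- function() {
  out[1] <- 10
  out
}

# try() captures the error instead of stopping the script.
result <- try(broken(), silent = TRUE)
inherits(result, "try-error")  # TRUE: the call failed with "object 'out' not found"

# Working pattern: create the vector inside the function before filling it.
fixed <- function(n) {
  out <- numeric(n)            # pre-allocate n zeros
  for (i in seq_len(n)) {
    out[i] <- i * 10
  }
  out
}

fixed(3)  # 10 20 30
```

Note also that the question's function adds the row index `i` to each coefficient (`i - 0.772`), which is probably not intended if the coefficients themselves are what should be stored.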
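【Comments】:
- Does the data["bayes_value"] line need an extra comma? For example, data[, "bayes_value"].
- I just checked and unfortunately the error stays the same :(. I think it may be caused by "bayes_value" being assigned incorrectly inside the function... but I don't know how to assign it.
- What about the assign function?
- Still getting the same error. I tried assign in the for loop and in the function separately, and then in both, but neither attempt worked. Again this may be down to placement; assign might be the answer if "bayes_value"/assign("bayes_value", 1:i) were placed somewhere else.
- What exactly are you trying to do? Why not just return the bayes column and then, when calling the function, do data$bayes <- predict.bayes.animal(data)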
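The last comment's suggestion (return the whole column and assign it once) could look roughly like the sketch below. It rests on an assumption that the question does not state outright: that each row's value is the sum of its TERMAC coefficient, its Glob coefficient, and the coefficient for that specific pair, so the pair conditions are treated as "and" rather than the `|` used in the question. The coefficients are copied from the question's function; the lookup-vector names (`termac_effect`, `glob_effect`, `pair_effect`) and the cut-down sample data frame are hypothetical.

```r
# Hypothetical cut-down sample: only the two columns the function reads.
data <- data.frame(
  TERMAC_ShortName = factor(c("PV", "PV", "DAWS2", "PV", "EASM")),
  Glob_ShortName   = factor(c("VegCrop", "VegCrop", "VegCrop", "Grass", "Shrub"))
)

# Coefficients copied from the question, stored as named lookup vectors.
termac_effect <- c(PV = -0.772, DAWS2 = -1.24, EASM = -0.362)
glob_effect   <- c(VegCrop = -0.3497, Grass = -0.5978, Shrub = -0.2285)
# Pair-specific terms, keyed as "TERMAC:Glob" (assumed to apply only when both match).
pair_effect   <- c("PV:VegCrop" = -0.56, "DAWS2:VegCrop" = 0.43,
                   "PV:Grass" = -0.49, "EASM:Shrub" = -0.045)

predict.bayes.animal <- function(data) {
  termac <- as.character(data$TERMAC_ShortName)
  glob   <- as.character(data$Glob_ShortName)

  # Indexing a named vector by name looks up every row at once.
  t_val <- unname(termac_effect[termac])
  g_val <- unname(glob_effect[glob])
  p_val <- unname(pair_effect[paste(termac, glob, sep = ":")])

  # Assumption: levels or pairs without a listed coefficient contribute 0.
  t_val[is.na(t_val)] <- 0
  g_val[is.na(g_val)] <- 0
  p_val[is.na(p_val)] <- 0

  t_val + g_val + p_val  # one value per row, returned as a vector
}

# No loop needed: the function returns the whole column.
data$bayes_value <- predict.bayes.animal(data)
data
```

Under this additive assumption the five sums are -1.6817, -1.6817, -1.1597, -1.8598 and -0.6355. Four of them agree with the expected values in the question after rounding; the DAWS2/VegCrop row comes out near -1.16 rather than the -1.20 listed, so either that expected value or the assumed rule may need checking.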