R - ensemble with neural network?

[Question title]: R - ensemble with neural network?
[Posted]: 2015-06-30 04:29:08
[Question]:

Here is a small sample of my data.frame:

    naiveBayesPrediction knnPred5 knnPred10 dectreePrediction logressionPrediction correctClass
1                non-bob        2         2           non-bob    0.687969711847463            1
2                non-bob        2         2           non-bob     0.85851872253358            1
3                non-bob        1         1           non-bob    0.500470892627383            1
4                non-bob        1         1           non-bob     0.77762739066215            1
5                non-bob        1         2           non-bob    0.556431439357365            1
6                non-bob        1         2           non-bob    0.604868385598237            1
7                non-bob        2         2           non-bob    0.554624186182919            1

I have converted every column to a factor:

   'data.frame':    505 obs. of  6 variables:
     $ naiveBayesPrediction: Factor w/ 2 levels "bob","non-bob": 2 2 2 2 2 2 2 2 2 2 ...
     $ knnPred5            : Factor w/ 2 levels "1","2": 2 2 1 1 1 1 2 1 2 1 ...
     $ knnPred10           : Factor w/ 2 levels "1","2": 2 2 1 1 2 2 2 1 2 2 ...
     $ dectreePrediction   : Factor w/ 1 level "non-bob": 1 1 1 1 1 1 1 1 1 1 ...
     $ logressionPrediction: Factor w/ 505 levels "0.205412826873861",..: 251 415 48 354 92 145 90 123 28 491 ...
     $ correctClass        : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...

Then I tried to ensemble them with a neural network:

ensembleModel <- neuralnet(correctClass ~ naiveBayesPrediction + knnPred5 + knnPred10 + dectreePrediction + logressionPrediction, data=allClassifiers[ensembleTrainSample,])

Error in neurons[[i]] %*% weights[[i]] : requires numeric/complex matrix/vector arguments

Then I tried putting the data into a matrix:

m <- model.matrix( correctClass ~ naiveBayesPrediction + knnPred5 + knnPred10 + dectreePrediction + logressionPrediction, data = allClassifiers )

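Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
  contrasts can be applied only to factors with 2 or more levels

I think this must be related to the "dectreePrediction" feature having only one level. The decision tree only ever predicted one of the 2 possible outcomes (bob or non-bob), so I don't know where to go from here.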

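[Comments]:

Is it possible that you accidentally set every value of allClassifiers$dectreePrediction to the same thing (re: your previous question)? Also, I don't think it makes sense to make $logressionPrediction a factor unless you bin it first.
@alexforrence Thanks for the reply. I don't understand the question. I have updated my decision tree code in case that helps, and I can show anything else that would be useful.

Tags: r neural-network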

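[Solution 1]:

The neuralnet function requires the "variables" to be numeric or complex values, because it performs matrix multiplication, which needs numeric or complex arguments. The returned error says so directly: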

Error in neurons[[i]] %*% weights[[i]] : 
  requires numeric/complex matrix/vector arguments

The small example below reproduces the same error.

mat <- matrix(sample(c(1,0), 9, replace=TRUE), 3)
fmat <- mat
mode(fmat) <- "character"

# no error
mat %*% mat

# error
fmat %*% fmat
Error in fmat %*% fmat : requires numeric/complex matrix/vector arguments

As a quick demonstration with the actual function, I will use the infert dataset, which the package uses in its own examples.

library(neuralnet)
data(infert)

# error
net.infert <- neuralnet(case~as.factor(parity)+induced+spontaneous, infert)
Error in neurons[[i]] %*% weights[[i]] : 
  requires numeric/complex matrix/vector arguments

# no error
net.infert <- neuralnet(case~parity+induced+spontaneous, infert)
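
If parity really should be treated as categorical rather than numeric, one option is to expand it into 0/1 dummy columns before calling neuralnet. The following is only a sketch using base R's model.matrix; the names parity_dummies and infert_num are made up for illustration and are not part of the package.

```r
data(infert)

# Build one 0/1 column per non-reference level of parity.
# The first column of model.matrix() is the intercept, so it is dropped.
parity_dummies <- model.matrix(~ as.factor(parity), data = infert)[, -1, drop = FALSE]

# Give the dummy columns syntactic names that can be used in a formula.
colnames(parity_dummies) <- paste0("parity", levels(as.factor(infert$parity))[-1])

# Combine the numeric columns with the dummies; every column is now numeric.
infert_num <- cbind(infert[, c("case", "induced", "spontaneous")], parity_dummies)
str(infert_num)
```

The resulting columns can then be listed on the right-hand side of the neuralnet formula in place of as.factor(parity).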

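You can leave correctClass as a factor, since it will be converted to a numeric dummy variable anyway, but it is probably better to convert it to the corresponding binary representation yourself.

My suggestions for you are:

Convert the factors to their corresponding binary representation (i.e. 0 and 1).
Keep logressionPrediction as numeric.
Omit variables that have only 1 value. Including them is completely redundant, because nothing can be learned from them.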
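
The three suggestions above could be sketched roughly as follows. The data frame here is a tiny made-up stand-in for allClassifiers (the values are invented), and the helper names n_unique, is_fac and prepared are hypothetical.

```r
# Toy stand-in for allClassifiers; the values are invented for illustration.
allClassifiers <- data.frame(
  naiveBayesPrediction = factor(c("non-bob", "bob", "non-bob", "bob")),
  knnPred5             = factor(c("2", "1", "1", "2")),
  knnPred10            = factor(c("2", "2", "1", "1")),
  dectreePrediction    = factor(rep("non-bob", 4)),
  logressionPrediction = factor(c("0.69", "0.21", "0.50", "0.78")),
  correctClass         = factor(c("1", "2", "1", "2"))
)

# A factor that holds numbers must go through as.character() first;
# as.numeric() alone would return the internal level codes.
allClassifiers$logressionPrediction <-
  as.numeric(as.character(allClassifiers$logressionPrediction))

# Drop columns that take only a single value (here: dectreePrediction).
n_unique <- vapply(allClassifiers, function(col) length(unique(col)), integer(1))
prepared <- allClassifiers[, n_unique > 1, drop = FALSE]

# Recode the remaining two-level factors as 0/1 (first level becomes 0).
is_fac <- vapply(prepared, is.factor, logical(1))
prepared[is_fac] <- lapply(prepared[is_fac], function(col) as.integer(col) - 1L)

str(prepared)
```

With every column numeric, a call such as neuralnet(correctClass ~ naiveBayesPrediction + knnPred5 + knnPred10 + logressionPrediction, data = prepared) should no longer hit the matrix multiplication error, although this has not been run against the original data.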

[Comments]: