【Posted】: 2016-10-26 01:23:12
【Question】:
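In this program, I scan a number of brain samples acquired as a time series of 40 x 64 x 64 images, one every 2.5 seconds. Each image therefore contains roughly 164,000 "voxels" (3D pixels; 40 * 64 * 64 = 163,840), and each voxel is a "feature" of the image sample.
Because the number of features n is so high, I want to use Principal Component Analysis (PCA) for dimensionality reduction, and then follow up with Recursive Feature Elimination (RFE).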
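For illustration, the PCA-then-RFE idea could be sketched as a scikit-learn Pipeline. This is only a minimal sketch under assumptions: the data is synthetic, the component and feature counts are hypothetical, and a linear SVC is used as the ranking estimator, which is a choice made here and not something stated in the question.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.feature_selection import RFE
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Synthetic stand-in for flattened volumes: 40 samples, 500 "voxels" each.
rng = np.random.RandomState(0)
X = rng.randn(40, 500)
y = np.repeat([0, 1], 20)
X[y == 1, :5] += 2.0  # give class 1 a detectable shift

pipeline = Pipeline([
    # Project the voxels onto 10 principal components (hypothetical count).
    ("pca", PCA(n_components=10)),
    # Recursively drop components until 5 remain (hypothetical count).
    ("rfe", RFE(SVC(kernel="linear"), n_features_to_select=5, step=1)),
])
pipeline.fit(X, y)
pred = pipeline.predict(X)
```

Note that in this arrangement RFE ranks principal components, not the original voxels, so the selected features are no longer individual voxel locations.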
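There are 9 classes to predict, so this is a multiclass classification problem. Below, I convert the 9-class problem into pairwise binary classification problems and store the fitted models in the list models.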
models = []
model_count = 0
for i in range(0,DS.nClasses):
    for j in range(i+1,DS.nClasses):
        binary_subset = sample_classes[i] + sample_classes[j]
        print 'length of combined = %d' % len(binary_subset)
        X,y = zip(*binary_subset)
        print 'y = ',y
        estimator = SVR(kernel="linear")
        rfe = RFE(estimator , step=0.05)
        rfe = rfe.fit(X, y)
        # save the model
        models.append(rfe)
        model_count = model_count + 1
        print '%d model fitting complete!' % model_count
Now I iterate over these models and make predictions.
predictions = []
for X,y in test_samples:
    Votes = np.zeros(DS.nClasses)
    for mod in models:
        #X = mod.transform(X)
        label = mod.predict(X.reshape(1,-1)) #Something goes wrong here
        print 'label is type',type(label),' and value ',label
        Votes[int(label)] = Votes[int(label)] + 1
    prediction = np.argmax(Votes)
    predictions.append(prediction)
    print 'Votes Array = ',Votes
    print "We predicted %d , actual is %d" % (prediction,y)
The label should be a number from 0 to 8, representing the 9 possible outcomes. I print the label value, and this is what I get:
label is type <type 'numpy.ndarray'> and value [ 0.87011103]
label is type <type 'numpy.ndarray'> and value [ 2.09093105]
label is type <type 'numpy.ndarray'> and value [ 1.96046739]
label is type <type 'numpy.ndarray'> and value [ 2.73343935]
label is type <type 'numpy.ndarray'> and value [ 3.60415663]
label is type <type 'numpy.ndarray'> and value [ 6.10577602]
label is type <type 'numpy.ndarray'> and value [ 6.49922691]
label is type <type 'numpy.ndarray'> and value [ 8.35338294]
label is type <type 'numpy.ndarray'> and value [ 1.29765466]
label is type <type 'numpy.ndarray'> and value [ 1.60883217]
label is type <type 'numpy.ndarray'> and value [ 2.03839272]
label is type <type 'numpy.ndarray'> and value [ 2.03794106]
label is type <type 'numpy.ndarray'> and value [ 2.58830013]
label is type <type 'numpy.ndarray'> and value [ 3.28811133]
label is type <type 'numpy.ndarray'> and value [ 4.79660621]
label is type <type 'numpy.ndarray'> and value [ 2.57755697]
label is type <type 'numpy.ndarray'> and value [ 2.72263461]
label is type <type 'numpy.ndarray'> and value [ 2.58129428]
label is type <type 'numpy.ndarray'> and value [ 3.96296151]
label is type <type 'numpy.ndarray'> and value [ 4.80280219]
label is type <type 'numpy.ndarray'> and value [ 7.01768046]
label is type <type 'numpy.ndarray'> and value [ 3.3720926]
label is type <type 'numpy.ndarray'> and value [ 3.67517869]
label is type <type 'numpy.ndarray'> and value [ 4.52089242]
label is type <type 'numpy.ndarray'> and value [ 4.83746684]
label is type <type 'numpy.ndarray'> and value [ 6.76557315]
label is type <type 'numpy.ndarray'> and value [ 4.606097]
label is type <type 'numpy.ndarray'> and value [ 6.00243346]
label is type <type 'numpy.ndarray'> and value [ 6.59194317]
label is type <type 'numpy.ndarray'> and value [ 7.63559593]
label is type <type 'numpy.ndarray'> and value [ 5.8116106]
label is type <type 'numpy.ndarray'> and value [ 6.37096926]
label is type <type 'numpy.ndarray'> and value [ 7.57033285]
label is type <type 'numpy.ndarray'> and value [ 6.29465433]
label is type <type 'numpy.ndarray'> and value [ 7.91623641]
label is type <type 'numpy.ndarray'> and value [ 7.79524801]
Votes Array = [ 1. 3. 8. 5. 5. 1. 7. 5. 1.]
We predicted 2 , actual is 8
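To make the vote counting concrete, here is a small sketch that replays the first four printed values through the same int() conversion used in the voting loop. The class pair attached to each value is inferred from the order of the nested training loops (the first models cover pairs (0, 1), (0, 2), (0, 3), ...); the variable names are made up for this illustration.

```python
# First four printed outputs, each paired with the class pair (i, j) that
# the corresponding model was trained on (inferred from the loop order).
outputs = [
    ((0, 1), 0.87011103),
    ((0, 2), 2.09093105),
    ((0, 3), 1.96046739),
    ((0, 4), 2.73343935),
]

votes = []
in_pair = []
for pair, value in outputs:
    vote = int(value)              # int() truncates toward zero
    votes.append(vote)
    in_pair.append(vote in pair)   # did the vote go to one of the two classes?

print(votes)
print(in_pair)
```

The third and fourth models were trained only on classes (0, 3) and (0, 4), yet after truncation their votes go to classes 1 and 2, which neither model ever saw.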
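I don't understand why the label values are floating-point numbers. They should be integers from 0 to 8.
I loaded the data correctly. Something goes wrong when predict() is executed, but I still can't figure out what.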
【Discussion】:
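One likely explanation: SVR is scikit-learn's support vector regressor, so RFE.predict passes through a continuous value, whereas SVC is the classifier counterpart and returns one of the labels it was trained on. The sketch below contrasts the two on synthetic data; the class labels 3 and 7 and the array shapes are made up for the illustration and are not taken from the question's dataset.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.svm import SVC, SVR

# Synthetic stand-in for one binary subset: classes 3 and 7, 20 samples each.
rng = np.random.RandomState(0)
X = rng.randn(40, 50)
y = np.repeat([3, 7], 20)
X[y == 7, :5] += 3.0  # make the two classes separable

# Same RFE settings as in the question, once with a regressor, once with a classifier.
reg = RFE(SVR(kernel="linear"), step=0.05).fit(X, y)
clf = RFE(SVC(kernel="linear"), step=0.05).fit(X, y)

sample = X[0].reshape(1, -1)
reg_label = reg.predict(sample)  # continuous estimate, float dtype
clf_label = clf.predict(sample)  # one of the training labels: 3 or 7

print(reg_label, clf_label)
```

With a classifier in place, the vote can be taken from clf_label[0] directly. It may also be worth knowing that SVC handles multiclass targets with a one-vs-one scheme internally, so the manual pairwise loop might not be needed at all.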
Tags: python machine-learning scikit-learn