【Posted】: 2018-09-19 05:13:27
【Question】:
I ran a principal component analysis with Weka, and I am sure the result is wrong.
My instances are as follows:
40.4,24.7,7.2,6.1,8.3,8.7,2.442,20
25,12.7,11.2,11,12.9,20.2,3.542,9.1
13.2,3.3,3.9,4.3,4.4,5.5,0.578,3.6
22.3,6.7,5.6,3.7,6,7.4,0.176,7.3
34.3,11.8,7.1,7.1,8,8.9,1.726,27.5
35.6,12.5,16.4,16.7,22.8,29.3,3.017,26.6
22,7.8,9.9,10.2,12.6,17.6,0.847,10.6
48.4,13.4,10.9,9.9,10.9,13.9,1.772,17.8
40.6,19.1,19.8,19,29.7,39.6,2.449,35.8
24.8,8,9.8,8.9,11.9,16.2,0.789,13.7
12.5,9.7,4.2,4.2,4.6,6.5,0.874,3.9
1.8,0.6,0.7,0.7,0.8,1.1,0.056,1
32.3,13.9,9.4,8.3,9.8,13.3,2.126,17.1
38.5,9.1,11.3,9.5,12.2,16.4,1.327,11.6
26.2,10.1,5.6,15.6,7.7,30.1,0.126,25.9
My Java code is as follows:
PrincipalComponents pca = new PrincipalComponents();
pca.buildEvaluator(instances);
pca.setVarianceCovered(0.9);
instances=pca.transformedData(instances);
System.out.println(instances);
The result is as follows:
-0.76617,2.661828,-0.543741,0
-0.970913,0.436367,1.69961,0
2.881824,-0.434979,0.32666,0
2.202041,-0.118079,-0.265614,0
-0.055269,0.917633,-0.825503,0
-3.389144,-0.661234,0.756936,0
0.326235,-0.94073,0.256852,0
-1.020299,0.939242,-0.408135,0
-5.193605,-0.979272,-0.020702,0
0.337214,-0.689053,-0.018816,0
2.413215,0.213961,0.314493,0
4.426397,-0.617956,0.288353,0
-0.373545,0.837791,0.108058,0
-0.347075,-0.059153,0.119701,0
-0.470905,-1.506368,-1.788153,0
But I am confident that the correct result is the following:
0.76617,2.661828,0.543741,0
0.970913,0.436367,-1.69961,0
-2.881824,-0.434979,-0.32666,0
-2.202041,-0.118079,0.265614,0
0.055269,0.917633,0.825503,0
3.389144,-0.661234,-0.756936,0
-0.326235,-0.94073,-0.256852,0
1.020299,0.939242,0.408135,0
5.193605,-0.979272,0.020702,0
-0.337214,-0.689053,0.018816,0
-2.413215,0.213961,-0.314493,0
-4.426397,-0.617956,-0.288353,0
0.373545,0.837791,-0.108058,0
0.347075,-0.059153,-0.119701,0
0.470905,-1.506368,1.788153,0
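The signs (positive or negative) of the first and third columns, i.e. the first and third principal components, are reversed.
I searched Stack Overflow for clues but could not find my mistake. Can anyone tell me whether the problem is in my code or in Weka's code?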
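For context, a sign flip of this kind is a general property of PCA rather than necessarily a coding error: an eigenvector is only determined up to its sign. A short derivation, where $C$ (the covariance or correlation matrix), $v$ (a unit eigenvector), $\lambda$ (its eigenvalue) and $x$ (a centered instance) are symbols introduced here for illustration and do not come from the post:

```latex
\begin{aligned}
C v = \lambda v \;&\Longrightarrow\; C(-v) = \lambda(-v), \qquad \lVert -v \rVert = \lVert v \rVert = 1,\\
\operatorname{Var}\!\left(x^{\top}(-v)\right) &= \operatorname{Var}\!\left(x^{\top} v\right) = \lambda .
\end{aligned}
```

Both $v$ and $-v$ are therefore valid principal directions that capture the same variance, so two implementations can return scores that differ only in the sign of whole columns, as in the two listings above.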
【Discussion】:
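- Thank you very much for nekomatic's answer. The result is not wrong, but the sign of each principal component matters to me because I want to add the principal components together to produce a ranking. So what is the key point when using principal component analysis for a composite ranking, and how can I determine the signs of the principal components and eigenvectors?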
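One common way to make the signs reproducible is to pick a convention and apply it after the decomposition, for example flipping each component so that its loading with the largest absolute value is positive. The following is a minimal sketch of that idea in plain Java, not Weka's own behaviour. The class `PcaSignConvention`, the method `alignSigns` and the loading values are hypothetical; only the two score rows are taken from the first two columns of the Weka output above.

```java
import java.util.Arrays;

public class PcaSignConvention {

    /**
     * Flips component k (column k of both matrices) whenever the loading with the
     * largest absolute value in that column is negative.
     * loadings[j][k]: weight of original attribute j in component k.
     * scores[i][k]:   value of component k for instance i.
     * Both arrays are modified in place.
     */
    public static void alignSigns(double[][] loadings, double[][] scores) {
        int numComponents = loadings[0].length;
        for (int k = 0; k < numComponents; k++) {
            // Find the attribute that dominates this component.
            int dominant = 0;
            for (int j = 1; j < loadings.length; j++) {
                if (Math.abs(loadings[j][k]) > Math.abs(loadings[dominant][k])) {
                    dominant = j;
                }
            }
            // Flip the eigenvector and the matching score column together.
            if (loadings[dominant][k] < 0) {
                for (double[] row : loadings) {
                    row[k] = -row[k];
                }
                for (double[] row : scores) {
                    row[k] = -row[k];
                }
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical loadings for 3 attributes and 2 components (NOT from the post).
        double[][] loadings = { {-0.6, 0.1}, {-0.7, 0.9}, {-0.4, -0.4} };
        // First two rows and first two columns of the Weka output shown in the question.
        double[][] scores = { {-0.76617, 2.661828}, {-0.970913, 0.436367} };
        alignSigns(loadings, scores);
        System.out.println(Arrays.deepToString(scores));
    }
}
```

With a fixed convention like this, the ranking no longer depends on which orientation the eigen-solver happens to return. Whether the resulting orientation is meaningful for a composite score (for example, "larger is better") is a separate modelling decision that still has to be checked against the loadings.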
Tags: components weka principal