【Question title】: PLS regression coefficients in MATLAB and C# (Accord.NET)
【Posted】: 2017-04-02 19:26:58
【Question】:

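I am trying to perform a partial least squares (PLS) regression analysis in C#. The PLS routine in MATLAB uses the SIMPLS algorithm, which returns beta (the matrix of regression coefficients).

  • I don't understand why the coefficient matrices differ between the two cases. Is there a mistake in the way I pass the inputs to the C# version?

  • The inputs are the same in both cases and are taken from the paper referenced here.

Minimal working example

MATLAB: following the small example from Hervé Abdi (Hervé Abdi, Partial Least Squares Regression). Reference: PDF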

clear all;
clc;
inputs = [7, 7, 13, 7; 4, 3, 14, 7; 10, 5, 12, 5; 16, 7, 11, 3; 13, 3, 10, 3];
outputs = [14, 7, 8; 10, 7, 6; 8, 5, 5; 2, 4,7; 6, 2, 4];
[XL,yl,XS,YS,beta,PCTVAR] = plsregress(inputs,outputs, 1);
disp 'beta'
beta
disp 'beta size'
size(beta)
yfit = [ones(size(inputs,1),1) inputs]*beta;
residuals = outputs - yfit;

% stem(residuals)
% xlabel('Observation');
% ylabel('Residual');

beta =

   1.0484e+01   6.1899e+00   6.2841e+00
  -6.3488e-01  -3.0405e-01  -7.2608e-02
   2.1949e-02   1.0512e-02   2.5102e-03
   1.9226e-01   9.2078e-02   2.1988e-02
   2.8948e-01   1.3864e-01   3.3107e-02
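
The `yfit` line in the MATLAB script multiplies `beta` by the inputs with a column of ones prepended, which implies that the first row of `beta` holds the intercepts and the remaining rows hold the slopes. As a rough illustration only, the same arithmetic can be sketched in plain Python using the rounded values printed above (the helper name `predict` is hypothetical and not part of MATLAB or Accord.NET):

```python
# Rounded beta as printed by MATLAB above: row 0 = intercepts, rows 1..4 = slopes.
beta = [
    [1.0484e+01, 6.1899e+00, 6.2841e+00],
    [-6.3488e-01, -3.0405e-01, -7.2608e-02],
    [2.1949e-02, 1.0512e-02, 2.5102e-03],
    [1.9226e-01, 9.2078e-02, 2.1988e-02],
    [2.8948e-01, 1.3864e-01, 3.3107e-02],
]
inputs = [[7, 7, 13, 7], [4, 3, 14, 7], [10, 5, 12, 5], [16, 7, 11, 3], [13, 3, 10, 3]]
outputs = [[14, 7, 8], [10, 7, 6], [8, 5, 5], [2, 4, 7], [6, 2, 4]]


def predict(x_row, beta):
    """Equivalent of [1, x_row] * beta for a single observation."""
    augmented = [1.0] + list(x_row)  # prepend the constant term
    n_out = len(beta[0])
    return [sum(a * beta[i][j] for i, a in enumerate(augmented)) for j in range(n_out)]


# Fitted values and residuals, mirroring yfit and residuals in the MATLAB script.
yfit = [predict(row, beta) for row in inputs]
residuals = [[y - f for y, f in zip(y_row, f_row)]
             for y_row, f_row in zip(outputs, yfit)]
```

Because `beta` is rounded to five significant digits here, the residuals only approximate those MATLAB would report.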

Accord.NET:

double[][] inputs = new double[][]
    {
        //      Wine | Price | Sugar | Alcohol | Acidity
        new double[] {   7,     7,      13,        7 },
        new double[] {   4,     3,      14,        7 },
        new double[] {  10,     5,      12,        5 },
        new double[] {  16,     7,      11,        3 },
        new double[] {  13,     3,      10,        3 },
    };

double[][] outputs = new double[][]
    {
        //             Wine | Hedonic | Goes with meat | Goes with dessert
        new double[] {           14,          7,                 8 },
        new double[] {           10,          7,                 6 },
        new double[] {            8,          5,                 5 },
        new double[] {            2,          4,                 7 },
        new double[] {            6,          2,                 4 },
    };

var pls = new PartialLeastSquaresAnalysis()
        {
            Method = AnalysisMethod.Center,
            Algorithm = PartialLeastSquaresAlgorithm.NIPALS
        };

var regression = pls.Learn(inputs, outputs);

double[][] coeffs = regression.Weights;
>>
-1.69811320754717 -0.0566037735849056   0.0707547169811322
1.27358490566038   0.29245283018868     0.571933962264151
-4                 1                    0.5
1.17924528301887   0.122641509433962    0.159198113207547

【Comments】:

    Tags: c# matlab regression accord.net pls


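    【Solution 1】:

    I think there are at least three differences in how the MATLAB and Accord.NET versions of PLS are being called.

    1. As you mentioned, MATLAB is using SIMPLS. However, Accord.NET is being told to use NIPALS.

    2. The MATLAB version is called as plsregress(inputs, outputs, 1), which means the regression is computed using only 1 latent component, but your Accord.NET code is not instructed to do the same.

    3. Accord.NET returns a MultivariateLinearRegression object that holds the weight matrix and the intercept vector separately, whereas MATLAB returns the intercepts as the first row of the coefficient matrix beta.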
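
The layout difference in point 3 can be illustrated with a small Python sketch. It assumes the weights are stored as an inputs-by-outputs matrix (4 x 3 here) and the intercepts as a separate vector of length 3; the numbers are copied from the result matrix shown later in this answer, and `to_matlab_layout` is a hypothetical helper, not an Accord.NET or MATLAB function:

```python
# Intercepts (one per output) and weights (4 inputs x 3 outputs), kept separately.
intercepts = [10.4844779770616, 6.18986077674717, 6.28413863347486]
weights = [
    [-0.634878923091644, -0.304054829845448, -0.0726082626993539],
    [0.0219492754418065, 0.0105118991463605, 0.00251024045589416],
    [0.192261724966225, 0.0920775662006966, 0.0219881135215502],
    [0.289484835410222, 0.13863944631343, 0.033107085796122],
]


def to_matlab_layout(weights, intercepts):
    """Stack the intercepts on top of the weights: (p x m) -> ((p + 1) x m)."""
    return [list(intercepts)] + [list(row) for row in weights]


coeffs = to_matlab_layout(weights, intercepts)
for row in coeffs:
    print(row)
```

The result has one more row than the weight matrix, which is why a 4 x 3 matrix of weights alone cannot be compared directly with MATLAB's 5 x 3 `beta`.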

    Once all of these are taken into account, it is possible to generate exactly the same results as the MATLAB version:

    double[][] inputs = new double[][]
    {
        //      Wine | Price | Sugar | Alcohol | Acidity
        new double[] {   7,     7,      13,        7 },
        new double[] {   4,     3,      14,        7 },
        new double[] {  10,     5,      12,        5 },
        new double[] {  16,     7,      11,        3 },
        new double[] {  13,     3,      10,        3 },
    };
    
    double[][] outputs = new double[][]
    {
        //             Wine | Hedonic | Goes with meat | Goes with dessert
        new double[] {           14,          7,                 8 },
        new double[] {           10,          7,                 6 },
        new double[] {            8,          5,                 5 },
        new double[] {            2,          4,                 7 },
        new double[] {            6,          2,                 4 },
    };
    
    // Create the Partial Least Squares Analysis
    var pls = new PartialLeastSquaresAnalysis()
    {
        Method = AnalysisMethod.Center,
        Algorithm = PartialLeastSquaresAlgorithm.SIMPLS, // First change: use SIMPLS
    };
    
    // Learn the analysis
    pls.Learn(inputs, outputs);
    
    // Second change: Use just 1 latent factor/component
    var regression = pls.CreateRegression(factors: 1);
    
    // Third change: present results as in MATLAB
    double[][] w = regression.Weights.Transpose();
    double[] b = regression.Intercepts;
    
    // Add the intercepts as the first column of the matrix of
    // weights and transpose it as in the way MATLAB presents it
    double[][] coeffs = (w.InsertColumn(b, index: 0)).Transpose();
    
    // Show results in MATLAB format
    string str = coeffs.ToOctave();
    

    With these changes, the coefficient matrix should become

    [ 10.4844779770616    6.18986077674717    6.28413863347486    ;
      -0.634878923091644 -0.304054829845448  -0.0726082626993539  ;
       0.0219492754418065 0.0105118991463605  0.00251024045589416 ;
       0.192261724966225  0.0920775662006966  0.0219881135215502  ; 
       0.289484835410222  0.13863944631343    0.033107085796122   ]
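
As an independent cross-check of these numbers, here is a minimal pure-Python sketch of a one-component PLS fit on centered data, written in the SIMPLS style (dominant left singular vector of X0'Y0, found by power iteration). It is an illustration under those assumptions, not the code used by MATLAB's plsregress or by Accord.NET, and the function names are hypothetical:

```python
def column_means(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def center(rows, means):
    return [[v - m for v, m in zip(r, means)] for r in rows]


def pls_one_component(X, Y, iterations=200):
    """Return (intercepts, weights) of a PLS regression with 1 latent component."""
    x_mean, y_mean = column_means(X), column_means(Y)
    X0, Y0 = center(X, x_mean), center(Y, y_mean)
    n, p, m = len(X), len(X[0]), len(Y[0])
    # Cross-product matrix S = X0' * Y0 (p x m).
    S = [[sum(X0[k][i] * Y0[k][j] for k in range(n)) for j in range(m)]
         for i in range(p)]
    # Power iteration on S * S' for the dominant left singular vector r.
    r = [1.0] * p
    for _ in range(iterations):
        s_t_r = [sum(S[i][j] * r[i] for i in range(p)) for j in range(m)]
        r = [sum(S[i][j] * s_t_r[j] for j in range(m)) for i in range(p)]
        norm = sum(v * v for v in r) ** 0.5
        r = [v / norm for v in r]
    # Scores t = X0 * r and Y loadings q = Y0' * t / (t' * t).
    t = [sum(X0[k][i] * r[i] for i in range(p)) for k in range(n)]
    tt = sum(v * v for v in t)
    q = [sum(Y0[k][j] * t[k] for k in range(n)) / tt for j in range(m)]
    # Rank-one coefficient matrix and intercepts for the uncentered data.
    weights = [[r[i] * q[j] for j in range(m)] for i in range(p)]
    intercepts = [y_mean[j] - sum(x_mean[i] * weights[i][j] for i in range(p))
                  for j in range(m)]
    return intercepts, weights


inputs = [[7, 7, 13, 7], [4, 3, 14, 7], [10, 5, 12, 5], [16, 7, 11, 3], [13, 3, 10, 3]]
outputs = [[14, 7, 8], [10, 7, 6], [8, 5, 5], [2, 4, 7], [6, 2, 4]]
intercepts, weights = pls_one_component(inputs, outputs)
print(intercepts)
for row in weights:
    print(row)
```

The intercepts correspond to the first row of the matrix above and the weights to the remaining four rows. The sign of `r` does not affect the result, since the coefficient matrix depends on it only through the product of `r` with itself.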
    

    【Comments】:
