[Question title]: How to reconstruct the original matrix from SVD components with Spark
[Posted]: 2017-04-10 19:08:03
[Question]:

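I want to reconstruct (an approximation of) the original matrix that was decomposed with SVD. Is there a way to do this without converting the V factor from a local Matrix to a DenseMatrix?

Here is the decomposition, based on the documentation (note that the comments come from the documentation example):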

import org.apache.spark.mllib.linalg.Matrix
import org.apache.spark.mllib.linalg.SingularValueDecomposition
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.linalg.distributed.RowMatrix

val data = Array(
  Vectors.dense(1.0, 0.0, 7.0, 0.0, 0.0),
  Vectors.dense(2.0, 0.0, 3.0, 4.0, 5.0),
  Vectors.dense(4.0, 0.0, 0.0, 6.0, 7.0))

val dataRDD = sc.parallelize(data, 2)

val mat: RowMatrix = new RowMatrix(dataRDD)

// Compute the top 5 singular values and corresponding singular vectors.
val svd: SingularValueDecomposition[RowMatrix, Matrix] = mat.computeSVD(5, computeU = true)
val U: RowMatrix = svd.U  // The U factor is a RowMatrix.
val s: Vector = svd.s  // The singular values are stored in a local dense vector.
val V: Matrix = svd.V  // The V factor is a local dense matrix.

To reconstruct the original matrix, I have to compute U * diag(s) * transpose(V).
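
Written out with dimensions, the product looks as follows. The symbols m, n and k are introduced here only for illustration: A is the m x n input matrix and k is the number of singular values that were kept.

```latex
A \approx U \,\operatorname{diag}(s)\, V^{\top},
\qquad
U \in \mathbb{R}^{m \times k},\quad
\operatorname{diag}(s) \in \mathbb{R}^{k \times k},\quad
V \in \mathbb{R}^{n \times k}
```

For the example data m = 3 and n = 5, and k is at most the 5 requested in computeSVD.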

The first step is to convert the singular value vector s into a diagonal matrix S:

import org.apache.spark.mllib.linalg.Matrices
val S = Matrices.diag(s)

However, the next step, computing U * diag(s) * transpose(V), does not compile:

val dataApprox = U.multiply(S.multiply(V.transpose))

I get the following error:

error: type mismatch;
 found   : org.apache.spark.mllib.linalg.Matrix
 required: org.apache.spark.mllib.linalg.DenseMatrix

It works if I convert the Matrix V to a DenseMatrix Vdense:

import org.apache.spark.mllib.linalg.DenseMatrix
val Vdense = new DenseMatrix(V.numRows, V.numCols,  V.toArray)
val dataApprox = U.multiply(S.multiply(Vdense.transpose))

Is there a way to get dataApprox, the approximation of the original matrix, from the output of the SVD without this conversion?

[Question comments]:

    Tags: scala apache-spark svd


    [Answer 1]:

    The following code worked for me:

    import scala.collection.mutable.ListBuffer
    import org.apache.spark.mllib.linalg.DenseMatrix

    // numTopSingularValues = number of features (singular values) used for the SVD.
    // It was not defined in the original snippet; s.size is assumed here.
    val numTopSingularValues = s.size
    val latentFeatureArray = s.toArray

    // Build the column-major values of the diagonal matrix for s in a ListBuffer:
    // every diagonal entry except the last is followed by numTopSingularValues zeros.
    val denseMatListBuffer = ListBuffer.empty[Double]
    val zeroListBuffer = ListBuffer.fill(numTopSingularValues)(0.0)
    var addDiagElemIndex = 0
    while (addDiagElemIndex < numTopSingularValues - 1) {
      denseMatListBuffer += latentFeatureArray(addDiagElemIndex)
      denseMatListBuffer.appendAll(zeroListBuffer)
      addDiagElemIndex += 1
    }
    denseMatListBuffer += latentFeatureArray(numTopSingularValues - 1)

    val sDenseMatrix = new DenseMatrix(numTopSingularValues, numTopSingularValues, denseMatListBuffer.toArray)

    // V * S is typed as DenseMatrix; its transpose equals S * transpose(V) because S is diagonal.
    val vMultiplyS = V.multiply(sDenseMatrix)

    val postMulWithUDenseMat = vMultiplyS.transpose

    val dataApprox = U.multiply(postMulWithUDenseMat)
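
The same idea can probably be written more compactly. The sketch below is an assumption-laden variant, not the answerer's code: it relies on DenseMatrix.diag from org.apache.spark.mllib.linalg, which returns a DenseMatrix (Matrices.diag is typed as Matrix, which is what triggers the type mismatch in the question), and on SparkContext.getOrCreate to reuse the spark-shell context. The master URL and app name are placeholders, and the snippet has not been checked against a specific Spark version.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.{DenseMatrix, Matrix, Vector, Vectors}
import org.apache.spark.mllib.linalg.distributed.RowMatrix

// Reuse the spark-shell context if one exists; otherwise start a local one.
// Master URL and app name are illustrative assumptions.
val sc = SparkContext.getOrCreate(
  new SparkConf().setMaster("local[2]").setAppName("svd-reconstruct"))

// Same example data as in the question.
val rows = Seq(
  Vectors.dense(1.0, 0.0, 7.0, 0.0, 0.0),
  Vectors.dense(2.0, 0.0, 3.0, 4.0, 5.0),
  Vectors.dense(4.0, 0.0, 0.0, 6.0, 7.0))
val mat = new RowMatrix(sc.parallelize(rows, 2))

val svd = mat.computeSVD(5, computeU = true)
val U: RowMatrix = svd.U
val s: Vector = svd.s
val V: Matrix = svd.V

// DenseMatrix.diag is typed as DenseMatrix, so it can be passed to
// Matrix.multiply, which only accepts a DenseMatrix argument.
val S: DenseMatrix = DenseMatrix.diag(s)

// (V * S)^T equals S * V^T because S is diagonal, hence symmetric.
val sVt: DenseMatrix = V.multiply(S).transpose

// RowMatrix.multiply accepts any local Matrix.
val dataApprox: RowMatrix = U.multiply(sVt)
dataApprox.rows.collect().foreach(println)
```

Multiplying V by S first keeps every intermediate result typed as DenseMatrix, so neither a cast nor a copy of V is needed. When fewer singular values are kept than the rank of the input, the printed rows are only an approximation of the original data.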
    

    [Comments]:
