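[Posted]: 2020-08-06 19:27:57
[Question]:
I am comparing matrix multiplication with and without locales, and I am trying to use sparse matrices with the LinearAlgebra module. I plan to use blockdist and break the work up manually with loops, but I wanted to see whether I could get a speedup with something simpler for now. If there is an easy way to use blockdist that I am overlooking, I would appreciate a pointer. In any case, I was able to get the code to work and see a speedup when I populated the sparse arrays with a single value, but populating them with random values does not seem to work: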
use LayoutCS;
use Time;
use LinearAlgebra, Norm;
use LinearAlgebra.Sparse;
use Random;
use IO;
writeln("Please type the filename with your matrix dimensions. One matrix on each line. The rows in the second need to match the columns in the first");
var filename: string;
filename = stdin.read(string);
// Open an input file with the specified filename in read mode.
var infile = open(filename, iomode.r);
var reader = infile.reader();
// Read the number of rows and columns in the array in from the file.
var r = reader.read(int), c = reader.read(int);
const parentDom = {1..r, 1..c};
var csrDom: sparse subdomain(parentDom) dmapped CS();
var A: [csrDom] real;
A = 2; //instead of this I would like to do something like fillRandom(A) but it seems to not work
var X: [1..r, 1..c] real;
fillRandom(X);
//read in the other matrix
var r1 = reader.read(int), c1 = reader.read(int);
const parentDom1 = {1..r1, 1..c1};
var csrDom1: sparse subdomain(parentDom1) dmapped CS();
var B: [csrDom1] real;
B = 3; //same thing as with matrix A
var Y: [1..r1, 1..c1] real;
fillRandom(Y);
// Close the file.
reader.close();
infile.close();
var t: Timer; //sets up timer
t.start();
var result: [1..r, 1..c1] real; //sets up matrix for results
forall i in 1..r do //goes through rows in 1st
  for j in 1..c1 do //goes through 2nd matrix columns
    for k in 1..c do { //goes through columns in 1st
      result[i, j] += X[i, k] * Y[k, j]; //adds the multiplications to the new slot in results
    }
t.stop();
writeln("multiplication took ", t.elapsed()," seconds");
t.clear();
t.start();
var res = A * B;
t.stop();
writeln("loc multiplication took ", t.elapsed()," seconds");
t.clear();
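Does fillRandom not work with sparse arrays, or am I doing something wrong? Do I need to assign each value in the array manually with a loop? It is also entirely possible that I am barking up the wrong tree and should instead focus on blockdist and on working with the per-locale portions that it creates.
Thank you in advance!
[Comments]: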
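For reference, here is a minimal sketch of the "assign each value manually with a loop" option mentioned above. It is not a confirmed fix for the code in the question. It relies on the fact that a sparse subdomain starts out with no indices, so indices have to be added to the domain before the array has any elements to assign. The names `n`, `density`, `mask`, and `vals` are hypothetical and chosen only for illustration, and the sketch assumes the same Chapel release as the code above (one where `dmapped CS()` is accepted).

```chapel
use LayoutCS;
use Random;

config const n = 8;            // hypothetical matrix size
config const density = 0.1;    // hypothetical fraction of stored entries

const parentDom = {1..n, 1..n};
var csrDom: sparse subdomain(parentDom) dmapped CS();
var A: [csrDom] real;

// Dense scratch arrays, filled the same way as X and Y in the question.
var mask, vals: [parentDom] real;
fillRandom(mask);
fillRandom(vals);

// Add an index to the sparse domain wherever the mask falls under the
// density threshold, then assign a random value at that position.
for (i, j) in parentDom {
  if mask[i, j] < density {
    csrDom += (i, j);
    A[i, j] = vals[i, j];
  }
}

writeln(A);   // prints only the stored (explicitly added) values
```

Adding indices one at a time is the simplest form to read but is unlikely to be the fastest way to build a large CSR matrix, so this is meant for small experiments only.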
-
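Regarding your broader question: can you clarify what you are comparing? Is it block-distributed CSR matrix-matrix multiplication versus local CSR matrix-matrix multiplication?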
-
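I am comparing running the matrix multiplication on a single locale against running it on a multi-locale system. I am looking for the best way to implement the multi-locale part of the problem, whether that is blockdist or something else.
Tags: matrix sparse-matrix chapel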