MPI all-to-all operation with one part of a huge matrix at a time

【Question title】: MPI all to all operation with one part of a huge matrix at a time
【Posted】: 2015-10-14 22:18:07
【Question】:

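I have a matrix distributed over 4 nodes. I would like each node to send its part of the matrix and to receive every other part from the other nodes, one at a time. The block matrices have different dimensions.

I tried to write some code, but it does not work as expected.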

  /* send my part of the matrix */
  for (int i = 0; i < numtasks; i++){
    if (i == taskid) continue;

    MPI_Isend(matrix_block, size, MPI_INT, i, 0,
              MPI_COMM_WORLD, &rNull);
  }

  /* receive everyone's part of the matrix */
  for (int i = 0; i < numtasks; i++){
    if (i == taskid) continue;

    MPI_Irecv(brec, lenghts_recv[i], MPI_INT, i, 0,
              MPI_COMM_WORLD, &request[i]);
  }

  for (int i = 0; i < numtasks - 1; i++){
    int index;
    MPI_Waitany(numtasks-1, request, &index, &status);
  }

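I assumed that each node would first send the block it owns and then receive the blocks sent to it by the other nodes, but evidently that is wrong.

Furthermore, a solution like MPI_Alltoall would not work in my case, because the matrix is supposed to be huge and will not fit on a single node.

Could you suggest a way to perform an all-to-all operation while working with only one part of the matrix at a time?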

【Comments】:

Tags: c++ mpi

【Solution 1】:

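You can use MPI_Bcast to have each of the four nodes send its part of the matrix to the other three nodes. This way, you split the all-to-all operation into several one-to-all operations that can be interleaved with computation.

So basically, you could do something like this: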

 for (int i = 0; i < numtasks; i++){
     /* Process i sends the data in matrix_block to all other processes. This is a
        collective operation, i.e., after the operation, every process will have
        received the data into matrix_block. Note that on every process except i
        this overwrites matrix_block, so a process's own part has to be kept in a
        separate buffer. */
     MPI_Bcast(matrix_block, size, MPI_INT, i, MPI_COMM_WORLD);

     /* TODO: do all necessary computation on this part of the matrix */
 }

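I'm not sure how your code works or what all the variables are, so I can't give you anything more specific. If you update your question with a minimal working example, I may be able to help further.

You can find an example of using MPI_Bcast in this excellent answer.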

【Comments】: