MPI_Isend reusing internal buffer: answers

【Question title】: MPI_Isend reusing internal buffer
【Posted】: 2014-09-02 15:14:07
【Question】:

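I have a finite element code that uses blocking receives and non-blocking sends. Each element has 3 incoming faces and 3 outgoing faces. The mesh is split among many processors, so the boundary conditions sometimes come from the element's own processor and sometimes from a neighboring processor. The relevant part of the code is: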

std::vector<task>::iterator it = All_Tasks.begin();
std::vector<task>::iterator it_end = All_Tasks.end();
int task = 0;
for (; it != it_end; it++, task++)
{
    for (int f = 0; f < 3; f++)
    {
        // Get the neighbors for each incoming face
        Neighbor neighbor = subdomain.CellSets[(*it).cellset_id_loc].neighbors[incoming[f]];

        // Get buffers from boundary conditions or neighbor processors
        if (neighbor.processor == rank)
        {
            subdomain.Set_buffer_from_bc(incoming[f]);
        }
        else
        {
            // Get the flag from the corresponding send
            target = GetTarget((*it).angle_id, (*it).group_id, (*it).cell_id);
            if (incoming[f] == x)
            {
                int size = cells_y*cells_z*groups*angles*4;
                MPI_Status status;
                MPI_Recv(&subdomain.X_buffer[0], size, MPI_DOUBLE, neighbor.processor, target, MPI_COMM_WORLD, &status);
            }
            if (incoming[f] == y)
            {
                int size = cells_x*cells_z*groups*angles * 4;
                MPI_Status status;
                MPI_Recv(&subdomain.Y_buffer[0], size, MPI_DOUBLE, neighbor.processor, target, MPI_COMM_WORLD, &status);
            }
            if (incoming[f] == z)
            {
                int size = cells_x*cells_y*groups*angles * 4;
                MPI_Status status;
                MPI_Recv(&subdomain.Z_buffer[0], size, MPI_DOUBLE, neighbor.processor, target, MPI_COMM_WORLD, &status);
            }
        }
    }

    ... computation ...

    for (int f = 0; f < 3; f++)
    {
        // Get the outgoing neighbors for each face
        Neighbor neighbor = subdomain.CellSets[(*it).cellset_id_loc].neighbors[outgoing[f]];

        if (neighbor.IsOnBoundary)
        {
            // store the buffer into the boundary information
        }
        else
        {
            target = GetTarget((*it).angle_id, (*it).group_id, neighbor.cell_id);
            if (outgoing[f] == x)
            {
                int size = cells_y*cells_z*groups*angles * 4;
                MPI_Request request;
                MPI_Isend(&subdomain.X_buffer[0], size, MPI_DOUBLE, neighbor.processor, target, MPI_COMM_WORLD, &request);
            }
            if (outgoing[f] == y)
            {
                int size = cells_x*cells_z*groups*angles * 4;
                MPI_Request request;
                MPI_Isend(&subdomain.Y_buffer[0], size, MPI_DOUBLE, neighbor.processor, target, MPI_COMM_WORLD, &request);
            }
            if (outgoing[f] == z)
            {
                int size = cells_x*cells_y*groups*angles * 4;
                MPI_Request request;
                MPI_Isend(&subdomain.Z_buffer[0], size, MPI_DOUBLE, neighbor.processor, target, MPI_COMM_WORLD, &request);

            }
        }

    }
}

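A processor can complete many tasks before it needs information from other processors. I need a non-blocking send so that the code can keep working, but I'm fairly sure the receives are overwriting the send buffer before it is actually sent.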

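I have tried timing this code, and the call to MPI_Recv takes 5-6 seconds even though the message it is trying to receive has already been sent. My theory is that the Isend is started, but nothing is actually transmitted until the Recv is called. The message itself is about 1 MB. I have looked at benchmarks, and a message of this size should take only a small fraction of a second to send.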

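My question is: in this code, is the buffer that was sent being overwritten, or only a local copy? Is there a way to "add" to a buffer when I send, rather than writing to the same memory location each time? I want each call to Isend to use a different buffer, so that the information is not overwritten while the messages wait to be received.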

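** EDIT ** A related question that might solve my problem: can MPI_Test or MPI_Wait give information about an MPI_Isend writing to a buffer, i.e. return true if the Isend has written to the buffer but that buffer has not yet been received?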

** EDIT 2 ** I have added more information about my problem.

【Question comments】:

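Some details that would help: the exact arguments to Recv and Isend, which MPI implementation you are running, and the type of system. It sounds like you suspect something related to buffering or to the MPI implementation's progress engine, the component that actually pushes the message onto the wire after you call Isend. You may also have built up a long chain of inter-process dependencies between ranks waiting in Recv. It is hard to say; more details would help.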

Tags: c++ buffer mpi send nonblocking

【Solution 1】:

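So it looks like I just need to bite the bullet and allocate enough memory in the send buffers to hold all of the messages, and then send only a portion of the buffer with each send.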
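
For background: a buffer passed to MPI_Isend must not be modified until the request has been completed with MPI_Wait or MPI_Test. Completion of a send request means the buffer may be reused; it does not by itself mean the matching receive has finished. One way to sketch the "one large buffer, send a portion at a time" idea is shown below. The class name `SendArena`, its methods, and the fixed slot size are assumptions made for illustration, not part of the original code, and the MPI calls are left to the caller so the sketch compiles without an MPI installation.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: one large allocation carved into fixed-size message
// slots, so every pending send owns a distinct region of memory.
class SendArena {
public:
    SendArena(std::size_t slot_doubles, std::size_t max_in_flight)
        : slot_doubles_(slot_doubles),
          storage_(slot_doubles * max_in_flight),
          used_(0) {
        assert(slot_doubles > 0 && max_in_flight > 0);
    }

    // Return the start of the next unused slot. The caller copies the face
    // data here and passes this pointer to MPI_Isend instead of a shared buffer.
    double* next_slot() {
        if ((used_ + 1) * slot_doubles_ > storage_.size())
            throw std::runtime_error("SendArena full: complete pending sends first");
        double* p = storage_.data() + used_ * slot_doubles_;
        ++used_;
        return p;
    }

    // Make all slots available again. Call only after every pending send
    // that used a slot has completed (for example after MPI_Waitall).
    void reset() { used_ = 0; }

    std::size_t slots_in_use() const { return used_; }

private:
    std::size_t slot_doubles_;     // doubles per message
    std::vector<double> storage_;  // never resized, so slot pointers stay valid
    std::size_t used_;             // number of slots handed out since reset()
};
```

In the send loop, this would replace `&subdomain.X_buffer[0]` with a pointer from `next_slot()` after copying the outgoing face data into it, and each `MPI_Request` would be kept (for instance in a `std::vector<MPI_Request>`) rather than discarded at the end of the block. Calling `MPI_Waitall` on the stored requests before `reset()` ensures no slot is reused while its send is still pending.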

【Comments】: