【Question Title】: C, MPI process communication issue
【Posted】: 2018-07-05 08:41:18
【Question】:

Given this struct:

typedef struct 
{
    double rx, ry, rz;
    double vx, vy, vz;
    double fx, fy, fz;
    double mass;
} Body;

I am trying to pass it between MPI processes. Since it is a custom struct, I created an MPI datatype for it:

int bodyParamas=10;
int blocklengths[10] = {1,1,1,1,1,1,1,1,1,1};
MPI_Datatype types[10] = {MPI_DOUBLE, MPI_DOUBLE, MPI_DOUBLE, MPI_DOUBLE, MPI_DOUBLE, MPI_DOUBLE, MPI_DOUBLE, MPI_DOUBLE, MPI_DOUBLE, MPI_DOUBLE};
MPI_Datatype mpi_body_type;
MPI_Aint     offsets[10];
offsets[0] = offsetof(Body, rx);
offsets[1] = offsetof(Body, ry);
offsets[2] = offsetof(Body, rz);
offsets[3] = offsetof(Body, vx);
offsets[4] = offsetof(Body, vy);
offsets[5] = offsetof(Body, vz);
offsets[6] = offsetof(Body, fx);
offsets[7] = offsetof(Body, fy);
offsets[8] = offsetof(Body, fz);
offsets[9] = offsetof(Body, mass);
MPI_Type_create_struct(bodyParamas, blocklengths, offsets, types, &mpi_body_type);
MPI_Type_commit(&mpi_body_type);

Then, in my for loop, I send the data and receive it in the other processes (ranks other than the root):

        if(my_id == root_process) {
            int starting_bodies_array_index = -1;
            for(an_id = 1; an_id < num_procs; an_id++) {
                start_body_index = an_id*num_of_bodies_per_process + 1;
                end_body_index = (an_id + 1)*num_of_bodies_per_process;

                num_of_bodies_to_send = end_body_index - start_body_index + 1;
                starting_bodies_array_index += num_of_bodies_to_send;

                ierr = MPI_Send( &starting_bodies_array_index, 1 , MPI_INT,
                      an_id, send_data_tag, MPI_COMM_WORLD);

                ierr = MPI_Send( &bodies[starting_bodies_array_index], num_of_bodies_to_send, mpi_body_type,
                      an_id, send_data_tag, MPI_COMM_WORLD);
            }
        }
        else {

            ierr = MPI_Recv(&num_of_bodies_to_recive, 1, MPI_INT, 
                   root_process, send_data_tag, MPI_COMM_WORLD, &status);

            ierr = MPI_Recv(&bodiesRecived, num_of_bodies_to_recive, mpi_body_type, 
                   root_process, send_data_tag, MPI_COMM_WORLD, &status);

            num_of_bodies_recived = num_of_bodies_to_recive;

        }

I don't know what is wrong with my code. I am fairly sure my custom MPI type is correct, since the error does not mention it. This is the error I see:

*** An error occurred in MPI_Recv
*** reported by process [1580531713,1]
*** on communicator MPI_COMM_WORLD
*** MPI_ERR_TRUNCATE: message truncated
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)

Does anyone see what is wrong?

【Question Comments】:

Tags: c multithreading mpi


【Answer 1】:

The root cause is that you MPI_Send() num_of_bodies_to_send elements, but you MPI_Recv() starting_bodies_array_index elements instead.

You should replace the first MPI_Send() with:

    ierr = MPI_Send( &num_of_bodies_to_send, 1 , MPI_INT,
                     an_id, send_data_tag, MPI_COMM_WORLD);
    

【Comments】:
