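[Posted]: 2017-03-24 11:07:14
[Question]:
I'm learning MPI_Send, but its behavior confuses me. I wrote a simple ping-pong program in which the rank-0 node sends a message to the rank-1 node, and the latter then sends a message back to the former.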
if (rank == 0) { /* Send Ping, Receive Pong */
    dest = 2;
    source = 2;
    rc = MPI_Send(pingmsg, strlen(pingmsg)+1, MPI_CHAR, dest, tag, MPI_COMM_WORLD);
    rc = MPI_Recv(buff, strlen(pongmsg)+1, MPI_CHAR, source, tag, MPI_COMM_WORLD, &Stat);
    printf("Rank0 Sent: %s & Received: %s\n", pingmsg, buff);
}
else if (rank == 2) { /* Receive Ping, Send Pong */
    dest = 0;
    source = 0;
    rc = MPI_Recv(buff, strlen(pingmsg)+1, MPI_CHAR, source, tag, MPI_COMM_WORLD, &Stat);
    printf("Rank1 received: %s & Sending: %s\n", buff, pongmsg);
    rc = MPI_Send(pongmsg, strlen(pongmsg)+1, MPI_CHAR, dest, tag, MPI_COMM_WORLD);
}
I ran this program in a 3-node environment. However, the system reports:
Fatal error in MPI_Send: Other MPI error, error stack:
MPI_Send(173)..............: MPI_Send(buf=0xbffffb90, count=10, MPI_CHAR, dest=2, tag=1, MPI_COMM_WORLD) failed
MPID_nem_tcp_connpoll(1811): Communication error with rank 2: Unknown error 4294967295
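I'd like to know why I can send a message from the rank-0 node to the rank-1 node, but get an error once I change the peer from rank 1 to rank 2, as in the code above. Thanks.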
[Comments]:
- Are you running it with something like mpiexec -np 3 your-program-name? What happens when you run mpiexec -np 3 hostname?