【Published】: 2013-12-06 14:33:43
【Problem description】:
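I am writing a piece of software that uses System V message queues, and I have run into a problem.
The main process creates 16 children (using fork), and each child writes a message for the next child. Each child then waits to receive its own message. (Child "0" sends a message to child "1", ..., child "15" sends a message to child "0".)
It works fine most of the time, but sometimes something strange happens: one process never receives its message, even though the corresponding child did send it! I would say it fails about once for every 10 successful runs.
I have been able to reduce it to a short program that shows the bug: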
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <termios.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

struct buf
{
    long mtype;     /* message type, must be > 0 */
    int data[32];   /* payload */
};

int main(int argc, char** argv)
{
    int son = 0;    /* fork() result: 0 in a child */
    int pid = 0;    /* index of this child in the ring (0..15), not a real PID */
    struct buf msgbuf;
    key_t key;

    key = ftok(argv[0], 'O');
    int qid = msgget(key, IPC_CREAT | 0666);
    if(qid < 0)
    {
        printf("Error\n");
        return -1;
    }

    //Creates 16 sons
    for(int i = 0; i < 16; i++)
    {
        pid = i;
        son = fork();
        if(son == 0)
            break;
    }

    if(son == 0)
    {
        /* Child: send to the next child, using message types 1..16 */
        msgbuf.mtype = ((pid + 1) % 16) + 1;
        for(int i = 0; i < 32; i++)
            msgbuf.data[i] = pid;
        printf("Writing %d\n", ((pid + 1) % 16) + 1);
        msgsnd(qid, &msgbuf, 32 * sizeof(int), IPC_NOWAIT);

        /* Then block until the message addressed to this child arrives */
        printf("Waiting for %d\n", pid + 1);
        msgrcv(qid, &msgbuf, 32 * sizeof(int), pid + 1, 0);
        printf("Got %d\n", (int)msgbuf.mtype);
    }

    /* Parent and all children end up here */
    sleep(3);
    printf("----- END -----\n");
    msgctl(qid, IPC_RMID, NULL);
    return 0;
}
So the expected behavior looks like this:
Writing 2
Writing 3
Waiting for 1
Waiting for 2
Got 2
Writing 4
Waiting for 3
Got 3
Writing 5
Waiting for 4
Got 4
Writing 6
Waiting for 5
Got 5
Writing 7
Waiting for 6
Got 6
Writing 8
Waiting for 7
Got 7
Writing 9
Waiting for 8
Got 8
Writing 10
Waiting for 9
Got 9
Writing 11
Waiting for 10
Got 10
Writing 12
Waiting for 11
Got 11
Writing 13
Waiting for 12
Got 12
Writing 14
Waiting for 13
Got 13
Writing 15
Waiting for 14
Got 14
Writing 16
Waiting for 15
Got 15
Writing 1
Waiting for 16
Got 16
Got 1
----- END -----
----- END -----
----- END -----
----- END -----
----- END -----
----- END -----
----- END -----
----- END -----
----- END -----
----- END -----
----- END -----
----- END -----
----- END -----
----- END -----
----- END -----
----- END -----
----- END -----
But sometimes I get something like this:
Writing 2
Writing 3
Waiting for 1
Waiting for 2
Got 2
Writing 4
Waiting for 3
Got 3
Writing 5
Waiting for 4
Got 4
Writing 6
Waiting for 5
Got 5
Writing 7
Waiting for 6
Got 6
Writing 9
Waiting for 8
Writing 8
Waiting for 7
Got 7
Got 8
Writing 10
Waiting for 9
Got 9
Writing 11
Waiting for 10
Got 10
Writing 12
Waiting for 11
Got 11
Writing 13
Writing 14
Waiting for 12
Waiting for 13
Got 12
Writing 15
Waiting for 14
Got 14
Writing 16
Waiting for 15
Got 15
Writing 1
Waiting for 16
Got 16
Got 1
----- END -----
----- END -----
----- END -----
----- END -----
----- END -----
Got 14
----- END -----
----- END -----
----- END -----
----- END -----
----- END -----
----- END -----
----- END -----
----- END -----
----- END -----
----- END -----
----- END -----
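As you can see, message "13" is never received (there is no "Got 13" after "Waiting for 13"). After 3 seconds the code removes the queue, which produces the spurious "Got 14": presumably the blocked msgrcv fails at that point, and the child prints the mtype 14 still left in its buffer from its own send.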
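To tell a real delivery apart from a failed call, the return values of msgsnd and msgrcv can be checked. The following is only a minimal sketch: the helper names `send_checked` and `recv_checked` are hypothetical and not part of the program above. It relies on the documented behavior that both calls return -1 and set errno on failure. In practice the receive buffer is left as it was when msgrcv fails, so an unchecked `printf` of `msgbuf.mtype` shows whatever value was stored there earlier.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

struct buf {
    long mtype;     /* message type, must be > 0 */
    int data[32];   /* payload */
};

/* Hypothetical helper: send one message and report any failure.
 * Returns 0 on success, or the errno value set by msgsnd. */
static int send_checked(int qid, const struct buf *msg, int flags)
{
    if (msgsnd(qid, msg, sizeof msg->data, flags) == -1) {
        int err = errno;
        fprintf(stderr, "msgsnd(type %ld) failed: %s\n",
                msg->mtype, strerror(err));
        return err;
    }
    return 0;
}

/* Hypothetical helper: receive one message of the given type.
 * Returns 0 on success, or the errno value set by msgrcv
 * (for example ENOMSG with IPC_NOWAIT on an empty queue,
 * or EIDRM/EINVAL when the queue has been removed). */
static int recv_checked(int qid, struct buf *msg, long type, int flags)
{
    if (msgrcv(qid, msg, sizeof msg->data, type, flags) == -1) {
        int err = errno;
        fprintf(stderr, "msgrcv(type %ld) failed: %s\n",
                type, strerror(err));
        return err;
    }
    return 0;
}
```

In the child, `if (recv_checked(qid, &msgbuf, pid + 1, 0) == 0) printf("Got %d\n", (int)msgbuf.mtype);` would print "Got" only for a message that was actually delivered.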
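In my real code, I use semaphores to make sure the program exits only after every child has received its message. There, the result is a deadlock: the message is never received, so the semaphore is never "unlocked". So the problem is not caused by the sleep being too short or anything like that, and it is not caused by removing the queue afterwards either.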
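For illustration, here is a sketch of the same ring in which the queue is removed only by the parent, and only after every child has finished. It is not my real code: it uses `wait()` as a stand-in for the semaphores, and the names `ring_child` and `run_ring`, the private queue (`IPC_PRIVATE`), and the 5-second `alarm()` safety net are all assumptions made for this example.

```c
#include <stddef.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/wait.h>

struct buf {
    long mtype;     /* message type, must be > 0 */
    int data[32];   /* payload: the sender's index */
};

/* Child body: send to the next child in the ring, then block until
 * the message addressed to this child arrives. Returns 0 only if both
 * calls succeed and the payload came from the previous child. */
static int ring_child(int qid, int id, int n)
{
    struct buf msg;

    msg.mtype = ((id + 1) % n) + 1;     /* types are 1..n, never 0 */
    for (int i = 0; i < 32; i++)
        msg.data[i] = id;
    if (msgsnd(qid, &msg, sizeof msg.data, 0) == -1)
        return 1;
    if (msgrcv(qid, &msg, sizeof msg.data, id + 1, 0) == -1)
        return 1;
    return msg.data[0] == (id + n - 1) % n ? 0 : 1;
}

/* Hypothetical driver: forks n children and returns how many completed
 * the exchange, or -1 if the queue could not be created. */
static int run_ring(int n)
{
    int qid = msgget(IPC_PRIVATE, IPC_CREAT | 0600);
    if (qid < 0)
        return -1;

    int forked = 0;
    for (int id = 0; id < n; id++) {
        pid_t pid = fork();
        if (pid == 0) {
            alarm(5);   /* safety net: a stuck child is killed by SIGALRM */
            _exit(ring_child(qid, id, n));
        }
        if (pid > 0)
            forked++;
    }

    /* Collect every child before touching the queue. */
    int ok = 0;
    for (int i = 0; i < forked; i++) {
        int status;
        if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    }
    msgctl(qid, IPC_RMID, NULL);    /* removed once, by the parent only */
    return ok;
}
```

Calling `run_ring(16)` mirrors the 16-child setup. With this structure a child that never gets its message shows up as a missing success in the returned count rather than as a stray "Got" line.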
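But keep in mind that most of the time it works fine! I don't understand why, every now and then, a child never receives its message.
Can you help me?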
【Discussion】:
Tags: c fork message-queue