Cleaning up child processes asynchronously: answers

【Question title】: Cleaning up children processes asynchronously
【Posted】: 2014-12-12 03:26:21
【Question】:

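This is an example from Advanced Linux Programming, section 3.4.4. The program fork()s and exec()s a child process. Rather than having the parent wait for the child to terminate, I want the parent to clean up the child asynchronously (otherwise the child becomes a zombie process). This can be done with the SIGCHLD signal: by installing a signal handler, the cleanup can be done when the child exits. The code is as follows: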

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/wait.h>
#include <signal.h>
#include <string.h>

int spawn(char *program, char **arg_list){
    pid_t child_pid;

     child_pid = fork();
     if(child_pid == 0){    // it is the child process
        execvp(program, arg_list);
        fprintf(stderr, "A error occured in execvp\n");
        return 0;
     }
     else{
        return child_pid;
     }
}

int child_exit_status;

void clean_up_child_process (int signal_number){
    int status;
    wait(&status);
    child_exit_status = status;     // restore the exit status in a global variable
    printf("Cleaning child process is taken care of by SIGCHLD.\n");
};

int main()
{
    /* Handle SIGCHLD by calling clean_up_process; */
    struct sigaction sigchld_action;
    memset(&sigchld_action, 0, sizeof(sigchld_action));
    sigchld_action.sa_handler = &clean_up_child_process;
    sigaction(SIGCHLD, &sigchld_action, NULL);

    int child_status;
    char *arg_list[] = {    //deprecated conversion from string constant to char*
        "ls", 
        "-la",
        ".",
        NULL
    };

    spawn("ls", arg_list);

    return 0;
}

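However, when I run the program in a terminal, the parent process never ends. It also does not seem to execute clean_up_child_process (it never prints "Cleaning child process is taken care of by SIGCHLD."). What is wrong with this snippet?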

【Question comments】:

Tags: c++ process signals

【Solution 1】:

For GNU/Linux users

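I have read this book. It describes the mechanism like this:

Quoted from section 3.4.4, page 59 of the book:

A more elegant solution is to notify the parent process when a child terminates.

But it only says that you can use sigaction to handle this situation.

Here is a complete example of how to handle processes this way.

First, why would we want to use this mechanism? Because we do not want to synchronize all the processes with each other.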

Real-world example
Suppose you have 10 .mp4 files and you want to convert them to .mp3 files. A beginner would do this:

ffmpeg -i 01.mp4 01.mp3

and repeat the command 10 times. A slightly more experienced user would do this:

ls *.mp4 | xargs -I xxx ffmpeg -i xxx xxx.mp3

This time the command pipes the names of all 10 mp4 files, one per line, to xargs, and they are converted to mp3 one after another.

But an advanced user would do this:

ls *.mp4 | xargs -I xxx -P 0 ffmpeg -i xxx xxx.mp3

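This means that if I have 10 files, 10 processes are created and run at the same time (-P 0 tells xargs to run as many processes at a time as possible). And there is a BIG difference. With the first two commands only one ffmpeg process runs at a time: it is created, it terminates, and only then does the next one start. With the -P 0 option, 10 processes are created at once, so 10 ffmpeg commands really are running simultaneously.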

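Now the purpose of cleaning up children asynchronously becomes clearer. We want to run several new processes, but the order in which they finish, and perhaps their exit status, does not matter to us. This way we can run them as fast as possible and reduce the total time.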

First, see man sigaction for all the details you need.

Second, look up the signal number:

T ❱ kill -l | grep SIGCHLD
16) SIGSTKFLT   17) SIGCHLD     18) SIGCONT     19) SIGSTOP     20) SIGTSTP

Example code

Goal: use SIGCHLD to clean up child processes

#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

// modified inside the signal handler, so it must be volatile
volatile sig_atomic_t signal_counter;

void signal_handler( int signal_number )
{
    ++signal_counter;
    int wait_status;
    pid_t return_pid = wait( &wait_status );
    if( return_pid == -1 )
    {
        perror( "wait()" );
        // wait_status is not valid if wait() failed
        return;
    }
    if( WIFEXITED( wait_status ) )
    {
        printf ( "job [ %d ] | pid: %d | exit status: %d\n",signal_counter, return_pid, WEXITSTATUS( wait_status ) );
    }
    else
    {
        printf( "exit abnormally\n" );
    }

    fprintf( stderr, "the signal %d was received\n", signal_number );
}

int main()
{
    // now instead of signal function we want to use sigaction
    struct sigaction siac;

    // zero it
    memset( &siac, 0, sizeof( struct sigaction ) );

    siac.sa_handler = signal_handler;
    sigaction( SIGCHLD, &siac, NULL );

    pid_t child_pid;

    ssize_t read_bytes = 0;
    size_t  length = 0;
    char*   line = NULL;

    char* sleep_argument[ 5 ] = { "3", "4", "5", "7", "9" };

    int counter = 0;

    while( counter <= 5 )
    {
        if( counter == 5 )
        {
            while( counter-- )
            {
                pause();
            }

            break;
        }

        child_pid = fork();

        // on failure fork() returns -1
        if( child_pid == -1 )
        {
            perror( "fork()" );
            exit( 1 );
        }

        // for child process fork() returns 0
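        if( child_pid == 0 ){
            execlp( "sleep", "sleep", sleep_argument[ counter ], NULL );
            // execlp() only returns on failure; do not fall back into the loop
            perror( "execlp()" );
            _exit( 127 );
        }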

        ++counter;
    }

    fprintf( stderr, "signal counter %d\n", signal_counter );

    // the main return value
    return 0;
}

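Here is what the example code does:

It creates 5 child processes.
It then enters the inner loop and pauses until a signal arrives. See man pause.
When a child terminates, the parent wakes up and calls the signal_handler function.
This continues until the last child, sleep 9, has finished.

Output (17 is the number of SIGCHLD):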

ALP ❱ ./a.out 
job [ 1 ] | pid: 14864 | exit status: 0
the signal 17 was received
job [ 2 ] | pid: 14865 | exit status: 0
the signal 17 was received
job [ 3 ] | pid: 14866 | exit status: 0
the signal 17 was received
job [ 4 ] | pid: 14867 | exit status: 0
the signal 17 was received
job [ 5 ] | pid: 14868 | exit status: 0
the signal 17 was received
signal counter 5

While the example is running, try this in another terminal:

ALP ❱ ps -o time,pid,ppid,cmd --forest -g $(pgrep -x bash)
    TIME   PID  PPID CMD
00:00:00  5204  2738 /bin/bash
00:00:00  2742  2738 /bin/bash
00:00:00  4696  2742  \_ redshift
00:00:00 14863  2742  \_ ./a.out
00:00:00 14864 14863      \_ sleep 3
00:00:00 14865 14863      \_ sleep 4
00:00:00 14866 14863      \_ sleep 5
00:00:00 14867 14863      \_ sleep 7
00:00:00 14868 14863      \_ sleep 9

As you can see, the a.out process has 5 children, all running at the same time. Whenever one of them terminates, the kernel sends SIGCHLD to their parent, a.out.

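Note

If we do not use pause or some other mechanism that lets the parent wait for its children, the parent exits and abandons the processes it created, and upstart (on Ubuntu) or init becomes their parent. You can try it by removing pause().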
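The re-parenting can be observed from inside the orphan itself, because getppid() starts returning a different PID once the original parent is gone. A minimal sketch, assuming a hypothetical helper orphan_is_reparented that uses an intermediate process so the caller survives to read the result through a pipe:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Returns 1 if an orphaned grandchild saw its parent PID change,
 * 0 if it did not within about 2 seconds, -1 on error. */
int orphan_is_reparented(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t middle = fork();
    if (middle == -1)
        return -1;
    if (middle == 0) {
        /* Middle process: create a grandchild, then exit without waiting. */
        pid_t original_parent = getpid();
        pid_t grandchild = fork();
        if (grandchild == 0) {
            close(fds[0]);
            char changed = 0;
            struct timespec tick = { 0, 10 * 1000 * 1000 };   /* 10 ms */
            /* Poll until getppid() no longer names the middle process. */
            for (int i = 0; i < 200 && !changed; ++i) {
                if (getppid() != original_parent)
                    changed = 1;
                else
                    nanosleep(&tick, NULL);
            }
            if (write(fds[1], &changed, 1) != 1)
                _exit(1);
            _exit(0);
        }
        _exit(grandchild == -1 ? 1 : 0);
    }

    close(fds[1]);
    waitpid(middle, NULL, 0);                 /* reap the middle process */
    char changed = 0;
    ssize_t n = read(fds[0], &changed, 1);    /* blocks until the report arrives */
    close(fds[0]);
    return n == 1 ? changed : -1;
}
```

Which process adopts the orphan depends on the system (init, upstart, systemd, or a sub-reaper), which is why the sketch only checks that the parent PID changed rather than comparing it with 1.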

【Comments】:

【Solution 2】:

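The parent process returns from main() immediately after fork() hands it the child's PID, so it never gets the chance to wait for the child to terminate.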

【Comments】:

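Yes! That actually solved my problem. I also noticed that fork() and exec() are more expensive than I thought, because I had to put something lengthy after 'spawn()' to make the main program terminate after its child.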

【Solution 3】:

I am on a Mac, so my answer may not be entirely relevant, but here it is anyway. I compiled without any options, so the executable is named a.out.

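I had the same experience in the console (the process did not seem to terminate), but I noticed that this is only a terminal glitch: if you simply press Enter, your command prompt comes back, and a process listing run from another terminal window shows neither a.out nor the ls it launched.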

Also, if I run ./a.out >/dev/null, it finishes immediately.

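So the point of the above is that everything actually terminates; the terminal only appears to freeze, most likely because the shell prints its prompt before the orphaned ls writes its output.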

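Next, why does it never print "Cleaning child process is taken care of by SIGCHLD."? Simply because the parent process terminates before the child does. A SIGCHLD signal cannot be delivered to a process that has already terminated, so the handler is never called.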

The book says the parent process goes on to do other things, and if it really does, everything works fine, for example if you add sleep(1) after spawn().

【Comments】: