Run jobs in parallel: answers

【Question title】: Run jobs in parallel
【Posted】: 2023-03-29 04:21:01
【Question description】:

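I want to run a large number of intensive processes in parallel, using for loops to iterate over different parameters. Many answers to similar questions mention that xargs can run processes in parallel, but none of them seem to say whether, or how, this can be done when the arguments change for each command.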

As an example (pseudocode):

for paramA in 1 2 3
  for paramB in 1 2 3
    ./intensiveCommand $paramA $paramB
  end
end

I would like to parallelize intensiveCommand.

Or is there an easier way than using xargs?

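【Question comments】:

You can add an ampersand to the end of the command, as in "./intensiveCommand $paramA $paramB &". That will run the jobs in parallel.
Yes, but that does not take load into account, right? It will just keep adding processes, so the machine will start swapping and everything will slow down.
In that case, you can parse the load average from the output of uptime and start new processes conditionally.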
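
The last comment's idea can be sketched roughly as follows. This is only an illustration, not code from the thread: the function names parse_load, load_below and wait_for_load are hypothetical, and the threshold of 4 and the 5-second polling interval are assumptions. It also assumes that uptime prints "load average: ..." (Linux) or "load averages: ..." (macOS).

```bash
#!/bin/bash
# Extract the 1-minute load average from uptime-style text on stdin.
# Handles "load average: 0.52, 0.58, 0.59" and "load averages: 1.94 2.05 2.10".
parse_load() {
    sed -E 's/.*load averages?: *([0-9]+[.,][0-9]+).*/\1/' | tr ',' '.'
}

# Succeed (exit status 0) when load $1 is strictly below threshold $2.
# awk does the comparison because the shell's [ ] only handles integers.
load_below() {
    awk -v load="$1" -v max="$2" 'BEGIN { exit !(load < max) }'
}

# Block until the 1-minute load drops below $1, polling every 5 seconds.
wait_for_load() {
    until load_below "$(uptime | parse_load)" "$1"; do
        sleep 5
    done
}

# Intended use (not executed here): throttle the loop from the question.
# for paramA in 1 2 3; do
#     for paramB in 1 2 3; do
#         wait_for_load 4
#         ./intensiveCommand "$paramA" "$paramB" &
#     done
# done
# wait
```

Because the load average is a moving average, it reacts slowly to newly started jobs, so a loop like this can still overshoot unless it pauses between launches.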

Tags: bash parallel-processing xargs

【Solution 1】:

You can use GNU parallel. It has a --load option that avoids overloading the computer.

parallel --load 100% ./intensiveCommand ::: 1 2 3 ::: 1 2 3
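
For comparison, since the question asks about xargs: a minimal sketch of how the same 3x3 parameter grid could be fed to xargs, assuming an xargs that supports -P (GNU and BSD versions do). The function names param_grid and run_grid are hypothetical, and echo stands in for ./intensiveCommand. Unlike --load, this only caps the number of simultaneous processes and does not look at system load.

```bash
#!/bin/bash
# Emit one "paramA paramB" pair per line: the same grid as the loops in the question.
param_grid() {
    local a b
    for a in 1 2 3; do
        for b in 1 2 3; do
            printf '%s %s\n' "$a" "$b"
        done
    done
}

# Run the given command once per pair, with at most $1 processes at a time.
# -n 2 passes two arguments per invocation; -P sets the number of parallel processes.
run_grid() {
    local max_jobs=$1
    shift
    param_grid | xargs -n 2 -P "$max_jobs" "$@"
}

# Demonstration: echo stands in for ./intensiveCommand (output order may vary).
run_grid 4 echo "would run with:"
```

Replacing the echo arguments with ./intensiveCommand would start the real jobs, four at a time.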

【Discussion】:

【Solution 2】:

number_of_cores=4   # number of processor cores, in my case: 4

for paramA in 1 2 3
do
    for paramB in 1 2 3
    do
        #========== automatic load regulator ==================
        sleep 1
        while [  $( pgrep -c "intensiveCommand" ) -ge "$number_of_cores" ]
        do
            kill -SIGSTOP $$
        done
        #======================================vvvvvvvvvvvvvvvv            

        ( ./intensiveCommand "$paramA" "$paramB" ; kill -SIGCONT $$ ) &

    done
done

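When as many intensiveCommand instances are running as there are cores, the script puts itself on hold (kill -SIGSTOP $$). Each intensiveCommand that finishes lets the script continue (see kill -SIGCONT $$). The script then checks again and starts more intensiveCommand instances until the maximum is reached once more, at which point it suspends itself again.

The sleep covers the delay between launching an intensiveCommand and its appearance in the process table.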
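
The throttling idea described here (start a new job only while fewer than N are running) can also be sketched without signals or pgrep, using bash's wait -n builtin (bash 4.3 or later). This is a different mechanism from the SIGSTOP/SIGCONT approach above, not a restatement of it. The names run_throttled and intensive_command are hypothetical, and the short sleep is a stand-in for real work.

```bash
#!/bin/bash
# Hypothetical stand-in for ./intensiveCommand so the sketch runs anywhere.
intensive_command() {
    sleep 0.1
    echo "finished $1 $2"
}

# Run the 3x3 parameter grid with at most $1 jobs in flight.
run_throttled() {
    local max_jobs=$1 running=0 paramA paramB
    for paramA in 1 2 3; do
        for paramB in 1 2 3; do
            if [ "$running" -ge "$max_jobs" ]; then
                wait -n                    # block until any one background job exits
                running=$((running - 1))
            fi
            intensive_command "$paramA" "$paramB" &
            running=$((running + 1))
        done
    done
    wait                                   # let the remaining jobs finish
}

run_throttled 4
```

Because the shell tracks its own children here, no sleep is needed to wait for a job to show up in the process table.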

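【Discussion】:

Is there any explanation of how this works? pgrep -c "intensiveCommand" counts the instances of "intensiveCommand" in the list of currently running processes ... -ge means "greater than or equal" ... and the ( ... ) & line starts some kind of new process in the background? Beyond that, I have no idea how it works.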

【Solution 3】:

Very tight scheduling with "1 slot per core", rock solid and simple.

#!/bin/bash

#use the filedescriptor as a kind of queue to fill the processing slots.

exec 3< <(

    for PARAM_A in 1 2 3
    do
        for PARAM_B in 1 2 3
        do
             echo $PARAM_A $PARAM_B
        done
    done
)

#4 separate processing slots running in parallel
while read -u 3 PARA PARB; do ./intensiveCommand "$PARA" "$PARB" ; done &
while read -u 3 PARA PARB; do ./intensiveCommand "$PARA" "$PARB" ; done &
while read -u 3 PARA PARB; do ./intensiveCommand "$PARA" "$PARB" ; done &
while read -u 3 PARA PARB; do ./intensiveCommand "$PARA" "$PARB" ; done &

#only exit when 100% sure that all processes ended
while pgrep "intensiveCommand" &>"/dev/null" ; do wait ; done

【Discussion】:

【Solution 4】:

I wrote this and it works well. Read the comments at the top to see how it works.

#!/bin/bash
################################################################################
# File: core
# Author: Mark Setchell
# 
# Primitive, but effective tool for managing parallel execution of jobs in the
# shell. Based on, and requiring REDIS.
#
# Usage:
#
# core -i 8 # Initialise to 8 cores, or specify 0 to use all available cores
# for i in {0..63}
# do
#   # Wait for a core, do a process, release core
#   (core -p; process; core -v)&
# done
# wait
################################################################################
function usage {
    echo "Usage: core -i ncores # Initialise with ncores. Use 0 for all cores."
    echo "       core -p        # Wait (forever) for free core."
    echo "       core -v        # Release core."
    exit 1
}

function init {
    # Delete list of cores in REDIS
    echo DEL cores | redis-cli > /dev/null 2>&1
    for i in `seq 1 $NCORES`
    do
       # Add another core to list of cores in REDIS
       echo LPUSH cores 1 | redis-cli > /dev/null 2>&1
    done
    exit 0
}

function WaitForCore {
    # Wait forever for a core to be available
    echo BLPOP cores 0 | redis-cli > /dev/null 2>&1
    exit 0
}

function ReleaseCore {
    # Release or give back a core
    echo LPUSH cores 1 | redis-cli > /dev/null 2>&1
    exit 0
}

################################################################################
# Main
################################################################################
while getopts "i:pv" optname
do
    case "$optname" in
      "i")
        if [ "$OPTARG" -lt 1 ]; then
           NCORES=`sysctl -n hw.logicalcpu`    # May differ if not on OSX, maybe "nproc" on Linux
        else
           NCORES=$OPTARG
        fi
        init $NCORES
        ;;
      "p")
        WaitForCore
        ;;
      "v")
        ReleaseCore
        ;;
      "?")
        echo "Unknown option $OPTARG"
        ;;
    esac
done
usage

For example, the following takes 10 seconds (not 80) to run 16 waits of 5 seconds each, because the 16 jobs run 8 at a time in two rounds:

core -i 8 
for i in {0..15}
do
   # Wait for a core, do a process, release core
   (core -p ; sleep 5 ; core -v)&
done
wait

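【Discussion】:

Yes, it is really easy to install and run. The key feature here is "BLPOP", which blocks waiting for an item in a list. This lets me control exactly how many processes can run at once. If I want 8, I put 8 items in the list at the start, take one each time I start a job, and return it to the list when the job finishes.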
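
The token-pool pattern described in this comment can be sketched without Redis by using a named pipe as the list of tokens. This swaps in a different mechanism and is not the author's script. The function name run_with_tokens, the use of file descriptor 9 and the short sleep standing in for real work are all assumptions. Opening a FIFO for both reading and writing works on Linux but is not guaranteed by POSIX.

```bash
#!/bin/bash
# Token pool on a named pipe: a Redis-free sketch of the BLPOP/LPUSH pattern.
# $1 = number of tokens (maximum concurrent jobs), $2 = number of jobs to run.
run_with_tokens() {
    local ntokens=$1 njobs=$2
    local dir fifo i token
    dir=$(mktemp -d) || return 1
    fifo=$dir/tokens
    mkfifo "$fifo" || return 1
    exec 9<>"$fifo"          # open read/write so the open call itself never blocks
    rm -rf "$dir"            # the open descriptor keeps the pipe alive

    # Initialise: put ntokens tokens into the pool (like "core -i N").
    for ((i = 0; i < ntokens; i++)); do
        printf 'x' >&9
    done

    for ((i = 0; i < njobs; i++)); do
        # Take a token, blocking while none are free (like "core -p").
        read -r -n 1 -u 9 token
        (
            sleep 0.1                   # stand-in for the real work
            echo "job $i finished"
            printf 'x' >&9              # return the token (like "core -v")
        ) &
    done
    wait
    exec 9>&-                # close the pool
}

run_with_tokens 2 5
```

Here the parent takes the token before launching each job, so at most two jobs exist at a time. The script in the answer instead takes the token inside each background subshell, which starts all subshells immediately and lets them queue on BLPOP.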