【Question title】: Python3 calling Python2 multiprocessing in Linux acts differently than in Windows
【Posted】: 2023-03-17 00:55:06
【Question】:

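I have simple example code running in Python 3.8 that opens a subprocess which executes in Python 2.7 and uses multiprocessing.

On Windows 10 the code behaves as intended: the Python 2 pool runs and prints to stdout, and main.py reads each line almost as soon as the pool writes it.

Unfortunately, I see a different result on Linux (Ubuntu 20.04.1 LTS). There, I don't get anything back until the entire pool has finished.

How can I make the code work the same way on Linux?

Below are the simple example code and the output I get.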

main.py

import subprocess
import datetime
import tempfile
import os

def get_time():
    return datetime.datetime.now()

class ProcReader():
    def __init__(self, python_file, temp=None, wait=False):
        self.proc = subprocess.Popen(['python2', python_file], stdout=subprocess.PIPE)

    def __iter__(self):
        return self

    def __next__(self):
        while True:
            line = self.proc.stdout.readline()
            if not line:
                raise StopIteration
            return line

if __name__ == "__main__":
    r1 = ProcReader("p2.py")

    for l1 in r1:
        print("Main reading at: {} for {}".format(get_time(), l1))

p2.py

import time
import multiprocessing as mp
from multiprocessing import freeze_support
import datetime

def get_time():
    return datetime.datetime.now()

def f1(name):
    for x in range(2):
        time.sleep(1)
        print("{} Job#: {} from f1".format(get_time(), name))

def f2(name):
    for x in range(2):
        time.sleep(2)
        print("{} Job#: {} from f2".format(get_time(), name))

if __name__ == '__main__':
    freeze_support()

    pool = mp.Pool(2)
    tasks = ["1", "2", "3", "4", "5", "6", "7"]
    for i, task in enumerate(tasks):
        if i%2:
            pool.apply_async(f2, args=(task,))
        else:
            pool.apply_async(f1, args=(task,))

    pool.close()
    pool.join()

Windows output:

Main reading at: 2020-09-24 15:28:19.044626 for b'2020-09-24 15:28:19.044000 Job#: 1 from f1\n'
Main reading at: 2020-09-24 15:28:20.045454 for b'2020-09-24 15:28:20.045000 Job#: 1 from f1\n'
Main reading at: 2020-09-24 15:28:20.046711 for b'2020-09-24 15:28:20.046000 Job#: 2 from f2\n'
Main reading at: 2020-09-24 15:28:21.045510 for b'2020-09-24 15:28:21.045000 Job#: 3 from f1\n'
Main reading at: 2020-09-24 15:28:22.046334 for b'2020-09-24 15:28:22.046000 Job#: 3 from f1\n'
Main reading at: 2020-09-24 15:28:22.047368 for b'2020-09-24 15:28:22.047000 Job#: 2 from f2\n'
Main reading at: 2020-09-24 15:28:23.047519 for b'2020-09-24 15:28:23.047000 Job#: 5 from f1\n'
Main reading at: 2020-09-24 15:28:24.046356 for b'2020-09-24 15:28:24.046000 Job#: 4 from f2\n'
Main reading at: 2020-09-24 15:28:24.048356 for b'2020-09-24 15:28:24.048000 Job#: 5 from f1\n'
Main reading at: 2020-09-24 15:28:26.047307 for b'2020-09-24 15:28:26.047000 Job#: 4 from f2\n'
Main reading at: 2020-09-24 15:28:26.049168 for b'2020-09-24 15:28:26.049000 Job#: 6 from f2\n'
Main reading at: 2020-09-24 15:28:27.047897 for b'2020-09-24 15:28:27.047000 Job#: 7 from f1\n'
Main reading at: 2020-09-24 15:28:28.048337 for b'2020-09-24 15:28:28.048000 Job#: 7 from f1\n'
Main reading at: 2020-09-24 15:28:28.049367 for b'2020-09-24 15:28:28.049000 Job#: 6 from f2\n'

Linux output:

Main reading at: 2020-09-24 19:28:45.972346 for b'2020-09-24 19:28:36.932473 Job#: 1 from f1\n'
Main reading at: 2020-09-24 19:28:45.972559 for b'2020-09-24 19:28:37.933594 Job#: 1 from f1\n'
Main reading at: 2020-09-24 19:28:45.972585 for b'2020-09-24 19:28:38.935255 Job#: 3 from f1\n'
Main reading at: 2020-09-24 19:28:45.972597 for b'2020-09-24 19:28:39.936297 Job#: 3 from f1\n'
Main reading at: 2020-09-24 19:28:45.972685 for b'2020-09-24 19:28:40.937666 Job#: 5 from f1\n'
Main reading at: 2020-09-24 19:28:45.972711 for b'2020-09-24 19:28:41.938629 Job#: 5 from f1\n'
Main reading at: 2020-09-24 19:28:45.972724 for b'2020-09-24 19:28:43.941109 Job#: 6 from f2\n'
Main reading at: 2020-09-24 19:28:45.972735 for b'2020-09-24 19:28:45.943310 Job#: 6 from f2\n'
Main reading at: 2020-09-24 19:28:45.973115 for b'2020-09-24 19:28:37.933317 Job#: 2 from f2\n'
Main reading at: 2020-09-24 19:28:45.973139 for b'2020-09-24 19:28:39.935938 Job#: 2 from f2\n'
Main reading at: 2020-09-24 19:28:45.973149 for b'2020-09-24 19:28:41.938587 Job#: 4 from f2\n'
Main reading at: 2020-09-24 19:28:45.973157 for b'2020-09-24 19:28:43.941109 Job#: 4 from f2\n'
Main reading at: 2020-09-24 19:28:45.973165 for b'2020-09-24 19:28:44.942306 Job#: 7 from f1\n'
Main reading at: 2020-09-24 19:28:45.973173 for b'2020-09-24 19:28:45.943503 Job#: 7 from f1\n'

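Please ignore the absolute times, since the two machines' clocks differ. As you can see, on Windows main.py receives each line right after the Python 2 pool writes it, whereas on Linux everything reaches main.py only after all the jobs have finished. I don't care much about the order in which the jobs complete; I just want main.py to get the stdout as soon as possible after the Python 2 pool writes it.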

【Comments】:

    Tags: python linux windows multiprocessing subprocess


    【Solution 1】:

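    On Linux, stdout is buffered, and the print() calls made by the multiprocessing workers are not flushed as they happen, because the processes are not attached to a terminal (here stdout is a pipe to main.py). The output therefore stays in the buffer instead of reaching main.py line by line.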
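
The buffering effect can be reproduced in isolation. The following is a minimal sketch, not part of the original answer: it uses sys.executable as a stand-in for python2, a hypothetical inline child script (CHILD), and a hypothetical helper read_with_timestamps. It also uses the interpreter's -u switch, which is an alternative to the approaches in this answer and is assumed to be acceptable for the child process.

```python
import subprocess
import sys
import time

# Hypothetical child script: prints two lines half a second apart, never flushing.
CHILD = (
    "import time\n"
    "for i in range(2):\n"
    "    print('line %d' % i)\n"
    "    time.sleep(0.5)\n"
)


def read_with_timestamps(extra_args):
    """Run CHILD and return the seconds elapsed when each line reached the parent."""
    start = time.time()
    proc = subprocess.Popen(
        [sys.executable] + extra_args + ["-c", CHILD],
        stdout=subprocess.PIPE,
    )
    arrivals = []
    # readline() returns b"" at end of file, which stops the iteration.
    for _line in iter(proc.stdout.readline, b""):
        arrivals.append(time.time() - start)
    proc.wait()
    return arrivals


if __name__ == "__main__":
    # Without -u the lines typically arrive together when the child exits.
    print("default:   ", read_with_timestamps([]))
    # With -u each line arrives roughly when it was printed.
    print("unbuffered:", read_with_timestamps(["-u"]))
```

Applied to the question's main.py, the equivalent change would be passing ['python2', '-u', python_file] to subprocess.Popen, assuming the child's stdout does not need to stay buffered for other reasons.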

    Monkey-patching sys.stdout is useful here:

    import os
    import sys

    # Python 2 only: reopen stdout with a buffer size of 0 (unbuffered).
    unbuffered = os.fdopen(sys.stdout.fileno(), 'w', 0)
    sys.stdout = unbuffered
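
The os.fdopen(..., 'w', 0) call above relies on Python 2 behaviour; Python 3 rejects a zero buffer size for text-mode files. A variant that should work on both versions is sketched below, assuming a wrapper object is acceptable in place of a real file. The names FlushingStream and install_flushing_stdout are hypothetical and do not come from the original answer.

```python
import sys


class FlushingStream(object):
    """Wrap a file-like object and flush it after every write."""

    def __init__(self, stream):
        self._stream = stream

    def write(self, data):
        self._stream.write(data)
        self._stream.flush()

    def writelines(self, lines):
        self._stream.writelines(lines)
        self._stream.flush()

    def __getattr__(self, name):
        # Delegate everything else (fileno, encoding, ...) to the wrapped stream.
        return getattr(self._stream, name)


def install_flushing_stdout():
    """Replace sys.stdout with a flushing wrapper (idempotent)."""
    if not isinstance(sys.stdout, FlushingStream):
        sys.stdout = FlushingStream(sys.stdout)
```

In p2.py this would be called at module level, before the pool is created, so that the worker processes also end up with the wrapped stream.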
    

    Alternatively, you may have to call sys.stdout.flush() after every print().
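
The explicit-flush approach could look like the sketch below. It mirrors f1 from the question's p2.py; the helper name log and the delay parameter are assumptions added for illustration, and sys.stdout.write is used instead of print() only so that the same code runs unchanged on Python 2.7 and Python 3.

```python
import datetime
import sys
import time


def log(message):
    """Write one timestamped line and flush so the reader on the pipe sees it at once."""
    sys.stdout.write("{} {}\n".format(datetime.datetime.now(), message))
    sys.stdout.flush()


def f1(name, delay=1):
    # Same shape as f1 in p2.py, with the print() replaced by a flushing helper.
    for _ in range(2):
        time.sleep(delay)
        log("Job#: {} from f1".format(name))
```

f2 would change the same way, with a delay of 2 seconds.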

    【Discussion】:
