Read subprocess stdout line by line: answers

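【Question title】: read subprocess stdout line by line
【Posted】: 2018-06-19 09:15:33
【Question description】:

My Python script uses subprocess to call a very noisy Linux utility. I want to store all of the output in a log file and show some of it to the user. I thought the following would work, but the output does not show up in my application until the utility has produced a significant amount of output.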

#fake_utility.py, just generates lots of output over time
import time
i = 0
while True:
   print hex(i)*512
   i += 1
   time.sleep(0.5)

#filters output
import subprocess
proc = subprocess.Popen(['python','fake_utility.py'],stdout=subprocess.PIPE)
for line in proc.stdout:
   #the real code does filtering here
   print "test:", line.rstrip()

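The behavior I really want is for the filter script to print each line as it is received from the subprocess. Sort of like what tee does, but in Python code.

What am I missing? Is this even possible?

Update:

If a sys.stdout.flush() is added to fake_utility.py, the code has the desired behavior in Python 3.1. I'm using Python 2.6. You would think that using proc.stdout.xreadlines() would work the same as py3k, but it doesn't.

Update 2:

Here is the minimal working code.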

#fake_utility.py, just generates lots of output over time
import sys, time
for i in range(10):
   print i
   sys.stdout.flush()
   time.sleep(0.5)

#display output line by line
import subprocess
proc = subprocess.Popen(['python','fake_utility.py'],stdout=subprocess.PIPE)
#works in python 3.0+
#for line in proc.stdout:
for line in iter(proc.stdout.readline,''):
   print line.rstrip()

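【Comments on the question】:

You could use print line, instead of print line.rstrip() (note: the comma at the end).
Related: Python: read streaming input from subprocess.communicate()
Update 2 states that it works with Python 3.0+, but it uses the old print statement, so it does not work with Python 3.0+.
None of the answers listed here worked for me, but *.com/questions/5411780/… did!
Interesting that code which supposedly only works on Python 3.0+ uses the 2.7 syntax for print.

Tags: python subprocess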

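【Solution 1】:

It has been a long time since I last worked with Python, but I think the problem is with the statement for line in proc.stdout, which reads the entire input before iterating over it. The solution is to use readline() instead: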

#filters output
import subprocess
proc = subprocess.Popen(['python','fake_utility.py'],stdout=subprocess.PIPE)
while True:
  line = proc.stdout.readline()
  if not line:
    break
  #the real code does filtering here
  print "test:", line.rstrip()

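Of course you still have to deal with the subprocess's own buffering.

Note: according to the documentation, the solution with an iterator should be equivalent to using readline(), except for the read-ahead buffer, but (or exactly because of this) the proposed change did produce different results for me (Python 2.5 on Windows XP).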
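For readers on Python 3, the same readline() loop can be sketched as below. This is only an illustration under stated assumptions: CHILD_CODE is a hypothetical stand-in for fake_utility.py, and iter_lines is a helper name invented here. Because the pipe is opened in binary mode, the EOF sentinel is b'' rather than ''.

```python
import subprocess
import sys

# Hypothetical stand-in for fake_utility.py: prints three lines, flushing each one.
CHILD_CODE = "import sys\nfor i in range(3):\n    print(i)\n    sys.stdout.flush()\n"


def iter_lines(cmd):
    """Yield the child's stdout lines one at a time, as readline() returns them."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    try:
        # readline() returns b'' only at EOF, so b'' is the sentinel for iter().
        for raw in iter(proc.stdout.readline, b""):
            yield raw.decode("utf-8").rstrip()
    finally:
        proc.stdout.close()
        proc.wait()


if __name__ == "__main__":
    for line in iter_lines([sys.executable, "-c", CHILD_CODE]):
        # The real code would filter here.
        print("test:", line)
```

The generator form keeps the filtering logic in the caller while lines are still delivered as they arrive.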

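【Discussion】:

For file.readline() vs. for line in file, see bugs.python.org/issue3907 (in short: it works on Python 3; use io.open() on Python 2.6+)
According to the "Programming Recommendations" in PEP 8 (python.org/dev/peps/pep-0008), a more Pythonic test for EOF would be 'if not line:'.
@naxa: for pipes: for line in iter(proc.stdout.readline, ''):.
@Jan-PhilipGehrcke: yes. 1. You can use for line in proc.stdout on Python 3 (there is no read-ahead bug). 2. '' != b'' on Python 3, so don't copy-paste code blindly; think about what it does and how it works.
I would recommend adding sys.stdout.flush() before the break, otherwise the output gets mixed up.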

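【Solution 2】:

Indeed, if you have sorted out the iterator, then buffering could now be your problem. You can tell the Python in the subprocess not to buffer its output.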

proc = subprocess.Popen(['python','fake_utility.py'],stdout=subprocess.PIPE)

becomes

proc = subprocess.Popen(['python','-u', 'fake_utility.py'],stdout=subprocess.PIPE)

I have needed this when calling Python from within Python.
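A rough way to see the effect of -u is to timestamp each line as it arrives. The sketch below assumes a hypothetical child (CHILD_CODE) that prints without ever flushing; timed_lines is a helper name made up for this illustration. Without -u the lines would normally all arrive together when the child exits, while with -u they should arrive roughly 0.2 seconds apart.

```python
import subprocess
import sys
import time

# Hypothetical child: prints three lines and never calls flush().
CHILD_CODE = "import time\nfor i in range(3):\n    print(i)\n    time.sleep(0.2)\n"


def timed_lines(extra_flags):
    """Return (seconds_since_start, line) pairs for the child's output."""
    # sys.executable is the interpreter running this script.
    cmd = [sys.executable] + extra_flags + ["-c", CHILD_CODE]
    start = time.monotonic()
    arrivals = []
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc:
        for raw in iter(proc.stdout.readline, b""):
            arrivals.append((time.monotonic() - start, raw.decode().rstrip()))
    return arrivals


if __name__ == "__main__":
    for label, flags in (("buffered", []), ("unbuffered (-u)", ["-u"])):
        print(label)
        for elapsed, line in timed_lines(flags):
            print("  {:.2f}s  {}".format(elapsed, line))
```

Setting the PYTHONUNBUFFERED environment variable for the child has the same effect as -u, which can help when the command line itself cannot be changed.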

【Discussion】:

【Solution 3】:

You want to pass these extra parameters to subprocess.Popen:

bufsize=1, universal_newlines=True

Then you can iterate as in your example. (Tested with Python 3.5)
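Put together, the two parameters might be used as in this minimal sketch. CHILD_CODE and stream_text_lines are names invented for the illustration, and the child is assumed to flush its own output: bufsize=1 only affects the parent's end of the pipe.

```python
import subprocess
import sys

# Hypothetical child that flushes after every line.
CHILD_CODE = "for i in range(3):\n    print('line', i, flush=True)\n"


def stream_text_lines(cmd):
    """Yield decoded lines from cmd's stdout as the child emits them."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        bufsize=1,                # line-buffered; only meaningful in text mode
        universal_newlines=True,  # decode to str and normalise line endings
    )
    with proc:
        for line in proc.stdout:
            yield line.rstrip("\n")


if __name__ == "__main__":
    for line in stream_text_lines([sys.executable, "-c", CHILD_CODE]):
        print("test:", line)
```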

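【Discussion】:

@nicoulaj It should work if you use the subprocess32 package.

【Solution 4】:

Late to the party, but I was surprised not to see what I think is the simplest solution here: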

import io
import subprocess

proc = subprocess.Popen(["prog", "arg"], stdout=subprocess.PIPE)
for line in io.TextIOWrapper(proc.stdout, encoding="utf-8"):  # or another encoding
    # do something with line; printing it is only a placeholder
    print(line, end="")

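(This requires Python 3.)

【Discussion】:

I'd like to use this answer, but I am getting: AttributeError: 'file' object has no attribute 'readable' (py2.7)
Works on Python 3
@sorin Neither of those things makes it "not valid". If you are writing a library that still needs to support Python 2, then don't use this code. But many people have the luxury of being able to use software released more recently than a decade ago. If you try to read from a closed file you will get that exception regardless of whether you use TextIOWrapper or not. You can simply handle the exception.
You may be late to the party, but your answer is up to date with the current version of Python, ty
@Ammad \n is a newline character. In Python it is conventional for the newline not to be removed when splitting by lines; you will see the same behavior if you iterate over a file's lines or use the readlines() method. You can get the line without it with line[:-1] (TextIOWrapper operates in "universal newlines" mode by default, so even if you are on Windows and the line ends with \r\n, you will only have \n at the end, so -1 works). You can also use line.rstrip() if you don't mind any other whitespace-like characters at the end of the line also being removed.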
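The newline behavior described in the last comment can be checked without a subprocess. The sketch below feeds made-up bytes through io.TextIOWrapper, using an in-memory io.BytesIO as a stand-in for proc.stdout.

```python
import io

# Stand-in for proc.stdout: two Windows-style lines and one with trailing spaces.
raw = io.BytesIO(b"first\r\nsecond\r\nthird  \n")

# With the default newline handling, \r\n is translated to \n on read.
lines = list(io.TextIOWrapper(raw, encoding="utf-8"))

without_newline = [line[:-1] for line in lines]      # drops exactly one character
without_whitespace = [line.rstrip() for line in lines]  # drops all trailing whitespace

print(lines)
print(without_newline)
print(without_whitespace)
```

One caveat with line[:-1]: if the final line of the output has no trailing newline, it removes a real character, so line.rstrip("\n") is the safer choice in that case.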

【Solution 5】:

The following modification of Rômulo's answer works for me on Python 2 and 3 (2.7.12 and 3.6.1):

import os
import subprocess

process = subprocess.Popen(command, stdout=subprocess.PIPE)
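while True:
  line = process.stdout.readline()
  # An empty result means EOF. Test truthiness rather than comparing with '':
  # on Python 3 the pipe yields bytes, and b'' != '' is always true.
  if not line:
    break
  os.write(1, line)  # file descriptor 1 is stdout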

【Discussion】:

【Solution 6】:

I tried this with Python 3 and it worked (source):

import subprocess
import threading
import time


def output_reader(proc):
    # Runs in a background thread and prints each line as soon as it arrives.
    for line in iter(proc.stdout.readline, b''):
        print('got line: {0}'.format(line.decode('utf-8')), end='')


def main():
    proc = subprocess.Popen(['python', 'fake_utility.py'],
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)

    t = threading.Thread(target=output_reader, args=(proc,))
    t.start()

    try:
        time.sleep(0.2)
        i = 0

        # The main thread keeps doing its own work; stop it with Ctrl-C.
        while True:
            print(hex(i)*512)
            i += 1
            time.sleep(0.5)
    finally:
        proc.terminate()
        try:
            proc.wait(timeout=0.2)
            print('== subprocess exited with rc =', proc.returncode)
        except subprocess.TimeoutExpired:
            print('subprocess did not terminate in time')
    t.join()


if __name__ == '__main__':
    main()

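【Discussion】:

【Solution 7】:

You can also read the lines without a loop. Works in Python 3.6. Note that readlines() blocks until the process closes its stdout, so the lines are not delivered as they are produced.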

import os
import subprocess

process = subprocess.Popen(command, stdout=subprocess.PIPE)
list_of_byte_strings = process.stdout.readlines()

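【Discussion】:

Or to convert to strings: list_of_strings = [x.decode('utf-8').rstrip('\n') for x in iter(process.stdout.readlines())]
@ndtreviv, you can pass text=True to Popen, or use its "encoding" kwarg, if you want the output as strings; there is no need to convert it yourself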
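The suggestion in the last comment could look like the sketch below. CHILD_CODE is a hypothetical command used only for illustration. As far as I know, the encoding argument is available from Python 3.6 and text=True from Python 3.7.

```python
import subprocess
import sys

# Hypothetical child that prints two short lines.
CHILD_CODE = "print('alpha'); print('beta')"

# Passing encoding= opens the pipe in text mode, so readlines() returns str.
with subprocess.Popen(
    [sys.executable, "-c", CHILD_CODE],
    stdout=subprocess.PIPE,
    encoding="utf-8",
) as process:
    list_of_strings = [line.rstrip("\n") for line in process.stdout.readlines()]

print(list_of_strings)
```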

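【Solution 8】:

A function that allows iterating over both stdout and stderr concurrently, in real time, line by line

In case you need to get the output streams of both stdout and stderr at the same time, you can use the following function.

The function uses Queues to merge both Popen pipes into a single iterator.

Here we create the function read_popen_pipes():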

from queue import Queue, Empty
from concurrent.futures import ThreadPoolExecutor


def enqueue_output(file, queue):
    for line in iter(file.readline, ''):
        queue.put(line)
    file.close()


def read_popen_pipes(p):

    with ThreadPoolExecutor(2) as pool:
        q_stdout, q_stderr = Queue(), Queue()

        pool.submit(enqueue_output, p.stdout, q_stdout)
        pool.submit(enqueue_output, p.stderr, q_stderr)

        while True:

            if p.poll() is not None and q_stdout.empty() and q_stderr.empty():
                break

            out_line = err_line = ''

            try:
                out_line = q_stdout.get_nowait()
            except Empty:
                pass
            try:
                err_line = q_stderr.get_nowait()
            except Empty:
                pass

            yield (out_line, err_line)

read_popen_pipes() in use:

import subprocess as sp


with sp.Popen(my_cmd, stdout=sp.PIPE, stderr=sp.PIPE, text=True) as p:

    for out_line, err_line in read_popen_pipes(p):

        # Do stuff with each line, e.g.:
        print(out_line, end='')
        print(err_line, end='')

    status_code = p.poll()  # exit status of the finished process

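【Discussion】:

【Solution 9】:

Python 3.5 added the run() function to the subprocess module; it returns a CompletedProcess object (call(), by contrast, is older and returns only the exit code). The capture_output and text arguments used below require Python 3.7 or later. With this you can use proc.stdout.splitlines():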

proc = subprocess.run(command, shell=True, capture_output=True, text=True, check=True)
for line in proc.stdout.splitlines():
    print("stdout:", line)

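See also How to Execute Shell Commands in Python Using the Subprocess Run Method

【Discussion】:

This solution is short and effective. One problem compared to the original question: it does not print each line "as it is received", which I take to mean printing the messages in real time, just as when running the process directly on the command line. Instead it prints the output only after the process has finished running.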
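For contrast with run(), here is a minimal sketch of the tee-like behavior the question asks for: every line goes to a log as it arrives, and only selected lines are shown. The names tee_lines, CHILD_CODE and the keep predicate are assumptions made for this illustration, and the child is assumed to flush its output.

```python
import io
import subprocess
import sys

# Hypothetical noisy child: alternates lines we want to show and lines we do not.
CHILD_CODE = (
    "for i in range(4):\n"
    "    print('keep' if i % 2 == 0 else 'drop', i, flush=True)\n"
)


def tee_lines(cmd, log_file, keep):
    """Write every output line to log_file; yield only lines where keep(line) is true."""
    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into the same stream
        bufsize=1,
        universal_newlines=True,
    ) as proc:
        for line in proc.stdout:
            log_file.write(line)
            if keep(line):
                yield line.rstrip("\n")


if __name__ == "__main__":
    log = io.StringIO()  # a real script would open a log file instead
    for shown in tee_lines([sys.executable, "-c", CHILD_CODE], log, lambda l: "keep" in l):
        print("test:", shown)
```

Unlike run(), the loop body executes while the child is still running, so the user sees filtered lines in real time and the log still receives everything.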