Using Popen to send a variable to stdin and to send stdout to a variable

【Question title】: Using POpen to send a variable to Stdin and to send Stdout to a variable
【Posted】: 2013-12-26 18:30:19
【Question】:

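In a shell script, we have the following command:

/script1.pl < input_file | /script2.pl > output_file

I would like to replicate this stream in Python using the subprocess module. input_file is a large file, and I can't read the whole file at once. So I want to pass each line, an input_string, into the pipeline and get back a string variable output_string, until the whole file has been streamed through.

The following is a first attempt: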

process = subprocess.Popen(["/script1.pl | /script2.pl"], stdin = subprocess.PIPE, stdout = subprocess.PIPE, shell = True)
process.stdin.write(input_string)
output_string = process.communicate()[0]

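However, using process.communicate()[0] closes the stream. I want to keep the stream open so that I can send more input through it later. I tried using process.stdout.readline() instead, but the program hangs.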

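【Comments】:

/script1.pl < input_string reads a file named input_string; it does not take the literal string input_string as input.
Ah, I see. I'd like to supply an actual string in my Python implementation, though. I'll iterate over the strings with a generator, and I want to pass each resulting string through the pipeline.
Your shell command is incompatible with "keeping the stream open". What do you want to put in output_string (the first byte, the first line, the first n bytes, the first bytes that arrive within 10 seconds)? By the way, output_string = process.communicate(input_string)[0] reproduces your shell command (if we use strings instead of files).
I apologize for the confusion. My shell command reads from a large file with many lines and writes to another file. I can't open and read the whole file in Python. Instead, I have to read it line by line and pass each line into the pipeline. I want to keep the pipeline open until all lines have passed through it.
Edited my question to clarify. Thanks.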
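
To illustrate the comment about process.communicate(input_string), here is a minimal Python 3 sketch. It is not the asker's code: the Perl scripts are replaced by a hypothetical stand-in child (a Python one-liner that upper-cases its stdin) so that the example runs anywhere. It shows why communicate() is a one-shot call: it sends all the input, closes the child's stdin, and reads until EOF.

```python
import sys
from subprocess import Popen, PIPE

# Hypothetical stand-in for "/script1.pl | /script2.pl" (an assumption made
# for illustration): a child process that upper-cases everything on stdin.
child_cmd = [
    sys.executable, "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read().upper())",
]

input_string = b"first line\nsecond line\n"

process = Popen(child_cmd, stdin=PIPE, stdout=PIPE)
# communicate() writes the input, closes the child's stdin, reads stdout
# until EOF and waits for the child to exit. The pipes cannot be reused
# afterwards, which is why it does not fit "keep the stream open".
output_string = process.communicate(input_string)[0]
print(output_string.decode())
```

This holds the whole input and the whole output in memory, so it only suits data that fits in RAM.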

Tags: python subprocess pipe

【Solution 1】:

To emulate the /script1.pl < input_file | /script2.pl > output_file shell command using the subprocess module in Python:

#!/usr/bin/env python
from subprocess import check_call

with open('input_file', 'rb') as input_file:
    with open('output_file', 'wb') as output_file:
        check_call("/script1.pl | /script2.pl", shell=True,
                   stdin=input_file, stdout=output_file)

You could write it without shell=True (though I don't see a reason to here), based on the "17.1.4.2. Replacing shell pipeline" example from the docs:

#!/usr/bin/env python
from subprocess import Popen, PIPE

with open('input_file', 'rb') as input_file:
    script1 = Popen("/script1.pl", stdin=input_file, stdout=PIPE)
with open("output_file", "wb") as output_file:
    script2 = Popen("/script2.pl", stdin=script1.stdout, stdout=output_file)
script1.stdout.close() # allow script1 to receive SIGPIPE if script2 exits
script2.wait()
script1.wait()

You could also use the plumbum module to get shell-like syntax in Python:

#!/usr/bin/env python
from plumbum import local

script1, script2 = local["/script1.pl"], local["/script2.pl"]
(script1 < "input_file" | script2 > "output_file")()

See also: How do I use subprocess.Popen to connect multiple processes by pipes?

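If you want to read/write line by line, then the answer depends on the concrete scripts that you want to run. In general, it is easy to deadlock while sending input and receiving output if you are not careful, e.g., due to buffering issues.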
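
As a rough illustration of the line-by-line case, the Python 3 sketch below exchanges one line at a time with a child process. It rests on two assumptions that may not hold for the real scripts: the child emits exactly one output line per input line, and it flushes its stdout after every line. The child here is a hypothetical stand-in written in Python, not /script1.pl or /script2.pl. If either assumption fails, readline() in the parent can block forever, which matches the hang described in the question.

```python
import sys
from subprocess import Popen, PIPE

# Hypothetical child (an assumption for illustration): upper-cases each input
# line and flushes after every line, so no reply is left in its buffer.
child_code = (
    "import sys\n"
    "for line in iter(sys.stdin.readline, ''):\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n"
)

process = Popen([sys.executable, "-c", child_code],
                stdin=PIPE, stdout=PIPE,
                universal_newlines=True, bufsize=1)  # text mode, line-buffered

replies = []
for input_string in ["alpha\n", "beta\n", "gamma\n"]:
    process.stdin.write(input_string)
    process.stdin.flush()                      # push the line to the child now
    output_string = process.stdout.readline()  # blocks until the child replies
    replies.append(output_string)

process.stdin.close()   # EOF tells the child that there is no more input
process.wait()
process.stdout.close()
print(replies)
```

For Perl children, output written to a pipe is block-buffered by default, so the scripts would likely need autoflush enabled (for example `$| = 1;`) for this pattern to work.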

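If the input does not depend on the output in your case, then a reliable cross-platform approach is to use a separate thread for each stream: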

#!/usr/bin/env python
from subprocess import Popen, PIPE
from threading import Thread

def pump_input(pipe):
    try:
        for i in xrange(1000000000): # generate large input
            print >>pipe, i
    finally:
        pipe.close()

p = Popen("/script1.pl | /script2.pl", shell=True, stdin=PIPE, stdout=PIPE,
          bufsize=1)
Thread(target=pump_input, args=[p.stdin]).start()
try: # read output line by line as soon as the child flushes its stdout buffer
    for line in iter(p.stdout.readline, b''):
        print line.strip()[::-1] # print reversed lines
finally:
    p.stdout.close()
    p.wait()
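
The listing above uses Python 2 syntax (xrange, print >>pipe). A Python 3 sketch of the same thread-per-stream idea might look like the following. The child is a hypothetical stand-in that copies stdin to stdout, used in place of "/script1.pl | /script2.pl" so that the example is self-contained, and the line count of 100000 is an arbitrary choice that is large enough to overflow a typical pipe buffer.

```python
import sys
from subprocess import Popen, PIPE
from threading import Thread

# Hypothetical stand-in for "/script1.pl | /script2.pl" (an assumption for
# illustration): copies its stdin to its stdout line by line.
child_code = (
    "import sys\n"
    "for line in sys.stdin.buffer:\n"
    "    sys.stdout.buffer.write(line)\n"
)

def pump_input(pipe, count):
    """Write `count` numbered lines to the child, then close its stdin."""
    try:
        for i in range(count):
            pipe.write(b"%d\n" % i)
    finally:
        pipe.close()  # EOF lets the child finish

process = Popen([sys.executable, "-c", child_code], stdin=PIPE, stdout=PIPE)
writer = Thread(target=pump_input, args=(process.stdin, 100000))
writer.start()

reversed_lines = []
try:
    # The main thread drains stdout while the writer thread feeds stdin,
    # so neither pipe fills up and blocks the other side.
    for line in process.stdout:
        reversed_lines.append(line.strip()[::-1])  # reversed lines, as above
finally:
    process.stdout.close()
    writer.join()
    process.wait()

print(len(reversed_lines), reversed_lines[:3])
```

Writing and reading in the same thread, without communicate(), would risk a deadlock here: the parent would block on a full stdin pipe while the child blocks on a full stdout pipe.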

【Discussion】: