Redirecting the print output to a .txt file in Python

【Question title】: Redirecting the print output to a .txt file in Python
【Posted】: 2016-03-23 08:05:25
【Question】:

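I am a Python beginner. I have tried many of the approaches from Stack Overflow answers to this problem, but none of them work for my script.
I have this small script that works, but I cannot save its huge output to a .txt file so that I can analyze the data. How do I redirect the print output to a txt file on my computer?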

from nltk.util import ngrams
import collections

with open("text.txt", "rU") as f:
    sixgrams = ngrams(f.read().decode('utf8').split(), 2)

result = collections.Counter(sixgrams)
print result
for item, count in sorted(result.iteritems()):
    if count >= 2:
        print " ".join(item).encode('utf8'), count

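【Comments on the question】:

If you are a complete beginner with Python, and especially since you seem to be doing NLP, I suggest you switch straight to Python 3!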

Tags: python parsing text

【Solution 1】:

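Just run it from the command line: python script.py > output.txt (pick a name other than text.txt, because the script reads text.txt as input and the shell would truncate it before the script starts)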
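
When shell redirection is not an option (for example when the script is started from an IDE), a similar effect can be obtained from inside the script. This is a minimal sketch, assuming Python 3.4 or later, where contextlib.redirect_stdout exists; the file name output.txt and the function report are made up for illustration and stand in for the script's own print calls.

```python
import contextlib
import io


def report():
    # Hypothetical stand-in for the print calls of the real script.
    print("on Thursday", 3)
    print("100 €", 2)


# Everything printed inside the inner with block goes to the file
# instead of the terminal; stdout is restored afterwards.
with io.open("output.txt", "w", encoding="utf-8") as f:
    with contextlib.redirect_stdout(f):
        report()
```

Unlike the shell redirect, this only captures output written through sys.stdout by Python code; output from child processes is not affected.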

【Comments】:

【Solution 2】:

The print statement in Python 2.x supports redirection (>> fileobj):

...
with open('output.txt', 'w') as f:
    print >>f, result
    for item, count in sorted(result.iteritems()):
        if count >= 2:
            print >>f, " ".join(item).encode('utf8'), count

In Python 3.x, the print function accepts an optional keyword argument file:

print("....", file=f)

If you use from __future__ import print_function in Python 2.6+, the same approach also works in Python 2.x.
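
Applied to the loop from the question, the file= form could look like the sketch below. It assumes Python 3 and, so that it runs without nltk, builds the bigrams with zip instead of ngrams; the sample sentence and the file name output.txt are made up for illustration.

```python
import collections
import io

words = "on Thursday morning and on Thursday morning".split()
# zip(words, words[1:]) yields adjacent word pairs, a stand-in for ngrams(words, 2).
result = collections.Counter(zip(words, words[1:]))

with io.open("output.txt", "w", encoding="utf-8") as f:
    for item, count in sorted(result.items()):
        if count >= 2:
            # file=f sends this one print call to the file instead of stdout.
            print(" ".join(item), count, file=f)
```

Because the file is opened in text mode with an explicit encoding, the .encode('utf8') call from the Python 2 version is no longer needed.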

【Comments】:

【Solution 3】:

With BufferedWriter you can do it like this:

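import io

# Python 2 code. pathOut is the output file path; result is the Counter
# built in the question.
outs = io.BufferedWriter(io.FileIO(pathOut, "wb"))
outs.write(str(result) + "\n")
for item, count in sorted(result.iteritems()):
    if count >= 2:
        outs.write(" ".join(item).encode('utf8') + " " + str(count) + "\n")

outs.flush()
outs.close()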
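
For comparison, here is a hedged Python 3 sketch of the same idea. It is not the answerer's code: the Counter contents and the file name out_buffered.txt are made up for illustration. The main difference is that BufferedWriter wraps a binary file, so every line has to be encoded to bytes before it is written.

```python
import collections
import io

# Hypothetical sample data standing in for the Counter from the question.
result = collections.Counter(
    {("on", "Thursday"): 3, ("100", "€"): 2, ("At", "eight"): 1}
)

# BufferedWriter wraps a raw binary file, so write() only accepts bytes.
with io.BufferedWriter(io.FileIO("out_buffered.txt", "wb")) as outs:
    for item, count in sorted(result.items()):
        if count >= 2:
            line = "{} {}\n".format(" ".join(item), count)
            outs.write(line.encode("utf-8"))
# Leaving the with block flushes and closes the writer.
```

Using the writer as a context manager replaces the explicit flush() and close() calls.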

【Comments】:

【Solution 4】:

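As Antti mentioned, you should prefer python3 and leave all that annoying python2 junk behind you. The following script works with both python2 and python3.

To read and write files, use the open function from the io module, which is python2/python3 compatible. Always use the with statement to open resources such as files. with wraps the execution of a block in a Python context manager. File objects implement the context manager protocol and are closed automatically when the with block is left.

Independent of Python, if you want to read a text file you need to know its encoding in order to read it correctly (if you are unsure, try utf-8 first). Also, the correct name for the UTF-8 encoding is utf-8, and mode U is deprecated.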

#!/usr/bin/env python
# -*- coding: utf-8; mode: python -*-

from nltk.util import ngrams
import collections
import io, sys

def main(inFile, outFile):

    with io.open(inFile, encoding="utf-8") as i:
        sixgrams = ngrams(i.read().split(), 2)

    result = collections.Counter(sixgrams)
    templ = "%-10s %s\n"

    with io.open(outFile, "w", encoding="utf-8") as o:

        o.write(templ %  (u"count",  u"words"))
        o.write(templ %  (u"-" * 10, u"-" * 30))

        # Sorting might be expensive. Before sort, filter items you don't want
        # to handle, btw. place *count* in front of the tuple.

        filtered = [ (c, w) for w, c in result.items() if c > 1]
        filtered.sort(reverse=True)

        for count, item in filtered:
            o.write(templ % (count, " ".join(item)))

if __name__ == '__main__':
    sys.exit(main("text.txt", "out_text.txt"))
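
The advice about knowing the file's encoding can be seen in isolation with a small round trip. This is only an illustrative sketch; the file name euro.txt is made up. The file is written as utf-8, then read back once with the matching encoding and once with a wrong guess (latin-1).

```python
import io

text = u"100 \u20ac"  # "100 €"

with io.open("euro.txt", "w", encoding="utf-8") as o:
    o.write(text)

# Same encoding on both sides: the text comes back unchanged.
with io.open("euro.txt", encoding="utf-8") as i:
    same = i.read()

# Wrong guess: the three UTF-8 bytes of the Euro sign are decoded
# as three separate latin-1 characters.
with io.open("euro.txt", encoding="latin-1") as i:
    garbled = i.read()

print(same == text)              # True
print(len(text), len(garbled))   # 5 7
```

Reading with the wrong encoding does not necessarily raise an error; latin-1 accepts every byte, so the damage only shows up later in the data.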

Input file text.txt:

At eight o'clock on Thursday morning and Arthur didn't feel very good 
he missed 100 € on Thursday morning. The Euro symbol of 100 € is here
to test the encoding of non ASCII characters, because encoding errors
do occur only on Thursday morning.

I get the following out_text.txt:

count      words
---------- ------------------------------
3          on Thursday
2          Thursday morning.
2          100 €

【Comments】: