【Question Title】: Writing to txt file in UTF-8 - Python
【Posted】: 2020-08-09 07:49:31
【Question】:

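My Django application takes a document from the user, generates some reports about it, and writes them to a txt file. The curious thing is that everything works fine on my Mac. On Windows, however, some letters are not read correctly and are turned into symbols such as é™ä±. Here is my code: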

views.py:

def result(request):
    last_uploaded = OriginalDocument.objects.latest('id')
    original = open(str(last_uploaded.document), 'r')
    original_words = original.read().lower().split()
    words_count = len(original_words)
    open_original = open(str(last_uploaded.document), "r")
    read_original = open_original.read()
    characters_count = len(read_original)
    report_fives = open("static/report_documents/" + str(last_uploaded.student_name) + 
    "-" + str(last_uploaded.document_title) + "-5.txt", 'w', encoding="utf-8")
    # Path to the documents with which original doc is comparing
    path = 'static/other_documents/doc*.txt'
    files = glob.glob(path)
    #endregion

    rows, found_count, fives_count, rounded_percentage_five, percentage_for_chart_five, fives_for_report, founded_docs_for_report = search_by_five(last_uploaded, 5, original_words, report_fives, files)


    context = {
        ...
    }

    return render(request, 'result.html', context)

report txt file:

['universitetindé™', 'té™hsili', 'alä±ram.', 'mé™n'] was found in static/other_documents\doc1.txt.
...

【Comments】:

    Tags: python django python-3.x django-views python-textprocessing


    【Solution 1】:

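    The problem is that you call open() on the input file without specifying an encoding. As the Python documentation notes, the default encoding is platform-dependent, which is probably why you see different results on Windows and macOS. The report file is already opened with encoding="utf-8", so the text is most likely being garbled when it is read, not when it is written.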
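As a rough illustration of how the platform default can produce the symbols in the report, the sketch below simulates reading UTF-8 bytes with the wrong codec. It assumes the Windows default was cp1252 and that the original words were "mən" and "alıram." (both inferred from the garbled output, not stated in the question); `misread_as_cp1252` is a hypothetical helper name.

```python
import locale

# The encoding open() falls back to when none is given (platform-dependent).
print(locale.getpreferredencoding(False))


def misread_as_cp1252(text: str) -> str:
    """Simulate UTF-8 bytes on disk being decoded as cp1252 (assumed Windows default)."""
    return text.encode("utf-8").decode("cp1252")


# The question's code lowercases the text after reading it, so do the same here.
garbled_word = misread_as_cp1252("mən").lower()
garbled_other = misread_as_cp1252("alıram.").lower()
print(garbled_word, garbled_other)
```

Under those assumptions the two results match the entries 'mé™n' and 'alä±ram.' shown in the report file.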

    Assuming the file itself really is encoded in UTF-8, just specify that when reading it:

    original = open(str(last_uploaded.document), 'r', encoding="utf-8")
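A slightly fuller sketch of the same idea: read the file once, with an explicit encoding and a context manager, and derive both counts from that single read. The helper name `count_words_and_chars` and the sample sentence are made up for illustration; the question's view opens the uploaded document path instead of a temporary file.

```python
import tempfile
from pathlib import Path


def count_words_and_chars(path, encoding="utf-8"):
    """Return (word_count, character_count) for a text file read with an explicit encoding."""
    # The with-block closes the file even if reading raises an exception.
    with open(path, "r", encoding=encoding) as f:
        text = f.read()
    return len(text.lower().split()), len(text)


# Hypothetical sample data, written as UTF-8 so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("Mən universitetində təhsil alıram.", encoding="utf-8")
    words, chars = count_words_and_chars(sample)
    print(words, chars)
```

If the uploaded file is not actually UTF-8, this read raises UnicodeDecodeError instead of silently producing garbled text, which makes the mismatch easier to spot.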
    

    【Comments】:

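    • Man, after adding this argument to that line, the results disappeared :(
    • The report doc file can't even be opened in MS Word because of an unknown-character error.
    • I found the problem! Thank you very much!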