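【Question title】: CSV Writer only writing first line in file
【Posted】: 2017-08-03 15:14:50
【Question】:

I want to store patent data from XML in a CSV file. My code runs correctly over every iteration of invention name, date, country, and patent number, but the problem appears when I try to write the results to a CSV file.

The XML data looks like this (one of many sections):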

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE us-patent-grant SYSTEM "us-patent-grant-v42-2006-08-23.dtd" [ ]>
<us-patent-grant lang="EN" dtd-version="v4.2 2006-08-23" file="USD0584026-20090106.XML" status="PRODUCTION" id="us-patent-grant" country="US" date-produced="20081222" date-publ="20090106">
<us-bibliographic-data-grant>
<publication-reference>
<document-id>
<country>US</country>
<doc-number>D0584026</doc-number>
<kind>S1</kind>
<date>20090106</date>
</document-id>
</publication-reference>

My code is:

for xml_string in separated_xml(infile): # Calls the output of the separated and read file to parse the data
    soup = BeautifulSoup(xml_string, "lxml")     # BeautifulSoup parses the data strings where the XML is converted to Unicode
    pub_ref = soup.findAll("publication-reference") # Beginning parsing at every instance of a publication
    lst = []  # Creating empty list to append into

    for info in pub_ref:  # Looping over all instances of publication

# The final loop finds every instance of invention name, patent number, date, and country to print and append into

        with open('./output.csv', 'wb') as f:
            writer = csv.writer(f, dialect = 'excel')

            for inv_name, pat_num, date_num, country in zip(soup.findAll("invention-title"), soup.findAll("doc-number"), soup.findAll("date"), soup.findAll("country")):
            #print(inv_name.text, pat_num.text, date_num.text, country.text)
            #lst.append((inv_name.text, pat_num.text, date_num.text, country.text))
                writer.writerow([inv_name.text, pat_num.text, date_num.text, country.text])

In the end, the output in my .csv file is this:

"Content addressable information encapsulation, representation, and transfer",07475432,20090106,US

I'm not sure where the problem is. I know I'm still new to Python, but can anyone spot the issue?

【Comments】:

  • open('./output.csv', 'ab+')

Tags: python csv export-to-csv


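【Solution 1】:

You are opening the file in overwrite mode ('wb') inside the loop. On every iteration you erase whatever was written before. The correct approach is to open the file outside the loop: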

...
with open('./output.csv', 'wb') as f:
    writer = csv.writer(f, dialect = 'excel')

    for info in pub_ref:  # Looping over all instances of publication

        # The final loop finds every instance of invention name, patent number, date, and country to print and append into
        for inv_name, pat_num, date_num, country in zip(soup.findAll("invention-title"), soup.findAll("doc-number"), soup.findAll("date"), soup.findAll("country")):
            ...
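
A minimal, self-contained sketch of the same idea, written for Python 3 (an assumption: there the csv module expects text mode with newline='' instead of 'wb'). In the question, the `with open(...)` block is nested inside both `for xml_string in separated_xml(infile)` and `for info in pub_ref`, so the file likely needs to be opened before the outermost loop; otherwise each new XML document truncates it again. The `records` list and the `write_rows` function are hypothetical stand-ins for the parsed documents and are not part of the original code.

```python
import csv
import os
import tempfile

# Hypothetical stand-in for the parsed patents: one inner list per XML document.
records = [
    [("Title A", "D0000001", "20090106", "US")],
    [("Title B", "D0000002", "20090106", "US")],
    [("Title C", "D0000003", "20090106", "US")],
]


def write_rows(path, documents):
    """Open the CSV once, before any loop, and write every row into it."""
    # Python 3: text mode plus newline='' replaces the Python 2 'wb' mode.
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, dialect="excel")
        for document in documents:   # outer loop: one XML document each
            for row in document:     # inner loop: rows found in that document
                writer.writerow(row)


out_path = os.path.join(tempfile.mkdtemp(), "output.csv")
write_rows(out_path, records)

# Read the file back to check that every document contributed a row.
with open(out_path, newline="") as f:
    written = list(csv.reader(f))
print(len(written))
```

Because the file is opened exactly once, it is truncated exactly once, and every later `writerow` call adds to the same open handle.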

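【Comments】:

  • Even after moving this outside the loop, I still only get one line.
【Solution 2】:

The problem is in this line: with open('./output.csv', 'wb') as f:

If you want to write all the rows to a single file, use mode a. Using wb overwrites the file each time it is opened, so you only get the last line.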

Read more about file modes here: https://docs.python.org/2/tutorial/inputoutput.html#reading-and-writing-files
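
The difference between the two modes can be seen in a small experiment. This is only a sketch under stated assumptions: it targets Python 3 (text modes 'w' and 'a' with newline='', where Python 2 would use 'wb' and 'ab'), and `rows`, `write_each_row`, and `read_rows` are hypothetical names invented for the illustration.

```python
import csv
import os
import tempfile

rows = [["first", 1], ["second", 2], ["third", 3]]  # hypothetical sample rows


def write_each_row(path, mode):
    """Reopen the file for every row, as the question's loop effectively does."""
    for row in rows:
        # Python 3 text modes; the Python 2 equivalents are 'wb' and 'ab'.
        with open(path, mode, newline="") as f:
            csv.writer(f).writerow(row)


def read_rows(path):
    """Return the CSV contents as a list of rows (all values come back as strings)."""
    with open(path, newline="") as f:
        return list(csv.reader(f))


tmp_dir = tempfile.mkdtemp()
w_path = os.path.join(tmp_dir, "truncated.csv")
a_path = os.path.join(tmp_dir, "appended.csv")

write_each_row(w_path, "w")  # each open() truncates: only the last row survives
write_each_row(a_path, "a")  # each open() appends: all rows are kept

w_rows = read_rows(w_path)
a_rows = read_rows(a_path)
print(w_rows)
print(a_rows)
```

One caveat of append mode: running the script again keeps adding rows to the existing file, so the file may need to be deleted between runs.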

【Comments】:
