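【Question title】: Consolidating two scripts into one new one (appending columns to csv)
【Posted】: 2013-10-29 22:03:04
【Question】:

I have two scripts that each create a new column in a csv: each one opens the csv and appends a column. Ideally I'd like to do this in a single step, rather than saving the csv as csv1, then opening csv1 and re-saving it as csv2.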

Script 1

with open("inputcsv1.csv", "r") as input_file:
    header = input_file.readline()[:-1] #this is to remove trailing '\n'
    header += ",Table exists?"
    output_lines = [header]

    for line in input_file:
         output_lines.append(line[:-1])
         if 'table' in line.split(",")[3]:
             output_lines[-1]+=",table exists"
         else:
             output_lines[-1]+=",No table found"

with open("outputcsv1.csv", "w") as output_file:
    output_file.write("\n".join(output_lines))   

Script 2

with open("outputcsv1.csv", "r") as input_file:
    header = input_file.readline()[:-1] #this is to remove trailing '\n'
    header += ",Are you sure Table exists?"
    output_lines = [header]

    for line in input_file:
         output_lines.append(line[:-1])
         if 'table' in line.split(",")[3]:
             output_lines[-1]+=",table definitely exists"
         else:
             output_lines[-1]+=",No table was not found"

with open("outputcsv2.csv", "w") as output_file:
   output_file.write("\n".join(output_lines))   

The two scripts above are the ones I used on a very simple example csv.

Example inputcsv1.csv

title1,title2,title3,Table or no table?,title4
data,text,data,the cat sits on the table,text,data
data,text,data,tables are made of wood,text,data
data,text,data,the cat sits on the television,text,data
data,text,data,the dog chewed the table leg,text,data
data,text,data,random string of words,text,data
data,text,data,table seats 25 people,text,data
data,text,data,I have no idea why I made this example about tables,text,data
data,text,data,,text,data

Desired output csv:

title1,title2,title3,Table or no table?,title4,Table exists?,Are you sure Table exists?
data,text,data,the cat sits on the table,text,data,table exists,table definitely exists
data,text,data,tables are made of wood,text,data,table exists,table definitely exists
data,text,data,the cat sits on the television,text,data,No table found,No table was not found
data,text,data,the dog chewed the table leg,text,data,table exists,table definitely exists
data,text,data,random string of words,text,data,No table found,No table was not found
data,text,data,table seats 25 people,text,data,table exists,table definitely exists
data,text,data,I have no idea why I made this example about tables,text,data,table exists,table definitely exists
data,text,data,,text,data,No table found,No table was not found

To merge the two scripts, I tried the following code:

with open("inputcsv1.csv", "r") as input_file:
    header = input_file.readline()[:-1] #this is to remove trailing '\n'
    header2 = input_file.readline()[:-2] #this is to remove trailing '\n'
    header += ",Table exists?"
    header2 += ",Are you sure table exists?"
    output_lines = [header]
    output_lines2 = [header2]

    for line in input_file:
        output_lines.append(line[:-1])
        if 'table' in line.split(",")[3]:
            output_lines[-1]+=",table exists"
        else:
            output_lines[-1]+=",No table found"

    for line in input_file:
        output_lines.append(line[:-2])
        if 'table' in line.split(",")[3]:
            output_lines2[-2]+=",table definitely exists"
        else:
            output_lines2[-2]+=",No table was not found"

with open("TestMurgedOutput.csv", "w") as output_file:
    output_file.write("\n".join(output_lines).join(output_lines2))

It doesn't raise an error, but it only writes the following to the new csv.

data,text,data,the cat sits on the table,text,dat,Are you sure table exists?

I'm not sure why, although I'm not confident in my use of .join. Any constructive comments would be appreciated.

【Question comments】:

    Tags: python python-2.7 csv


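    【Answer 1】:

    I think this is close to what you're looking for; it's what I meant by putting the if statements from both scripts into a single for loop. It could be optimized, but I tried to keep it simple so you can easily see what is being done.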

    with open("inputcsv1.csv", "rt") as input_file:
        header = input_file.readline()[:-1]  # remove trailing newline
        # add a title to the header for each of the two new columns
        header += ",Table exists?,Are you sure table exists?"
        output_lines = [header]
    
        for line in input_file:
            line = line[:-1]  # remove trailing newline
            cols = line.split(',')  # split line in columns based on delimiter
            # add first column
            if 'table' in cols[3]:
                line += ",table exists"
            else:
                line += ",No table found"
            # add second column
            if 'table' in cols[3]:
                line += ",table definitely exists"
            else:
                line += ",No table was not found"
            output_lines.append(line)
    
    with open("TestMurgedOutput.csv", "wt") as output_file:
        output_file.write("\n".join(output_lines))
    

    Contents of the TestMurgedOutput.csv file that gets created:

    title1,title2,title3,Table or no table?,title4,Table exists?,Are you sure table exists?
    data,text,data,the cat sits on the table,text,data,table exists,table definitely exists
    data,text,data,tables are made of wood,text,data,table exists,table definitely exists
    data,text,data,the cat sits on the television,text,data,No table found,No table was not found
    data,text,data,the dog chewed the table leg,text,data,table exists,table definitely exists
    data,text,data,random string of words,text,data,No table found,No table was not found
    data,text,data,table seats 25 people,text,data,table exists,table definitely exists
    data,text,data,I have no idea why I made this example about tables,text,data,table exists,table definitely exists
    data,text,data,,text,data,No table found,No table was not found
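
One caveat about the `line[:-1]` slicing used above: it assumes every line ends with a newline. If the last row of the csv has no trailing newline, the slice removes a real character instead. A minimal sketch of the difference, using an in-memory `io.StringIO` as a stand-in for the file (the helper name `strip_newline` is made up for this illustration):

```python
import io

def strip_newline(line):
    # Remove only trailing newline / carriage-return characters, nothing else.
    return line.rstrip("\r\n")

# Stand-in for a csv file whose last row has no trailing newline.
sample = io.StringIO("a,b\nc,d")

sliced = [line[:-1] for line in sample]        # blindly drops the last character
sample.seek(0)                                 # rewind to read the data again
stripped = [strip_newline(line) for line in sample]

print(sliced)    # ['a,b', 'c,']
print(stripped)  # ['a,b', 'c,d']
```

If the input files always end with a newline, the original slicing behaves the same way; `rstrip` is just the safer default.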
    

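    【Comments】:

    • I tried something very similar to this before I posted, but I must have gone too far off track. It seemed too simple. Will test it now :) Thanks for your answer :)
    • One optimization I left out: since both ifs appear to check the same condition, only one of them (and its corresponding else) is really needed to combine the results produced in the two steps.
    • Not quite, Martineau: my real script has about 30 different if statements per column, giving many different outputs. As I mentioned on the other answer below, that's my fault for giving an oversimplified example :)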
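
For the situation raised in the comments above, where each new column has its own set of criteria, one possible arrangement is to give every column its own function and apply them all in a single pass over the rows. This is only a sketch under assumptions: the rule functions `table_exists` and `mentions_cat`, the "Cat mentioned?" column, and the helper `add_columns` are invented for illustration, and it uses the standard `csv` module with in-memory files rather than the manual string handling shown in the answer.

```python
import csv
import io

# Hypothetical rules: each takes a parsed row (list of strings)
# and returns the text for one new column.
def table_exists(row):
    return "table exists" if "table" in row[3] else "No table found"

def mentions_cat(row):
    return "cat" if "cat" in row[3] else "no cat"

# One (header title, rule) pair per new column.
NEW_COLUMNS = [
    ("Table exists?", table_exists),
    ("Cat mentioned?", mentions_cat),
]

def add_columns(reader, writer, columns):
    # Extend the header, then extend every data row in the same pass.
    header = next(reader)
    writer.writerow(header + [title for title, _ in columns])
    for row in reader:
        writer.writerow(row + [rule(row) for _, rule in columns])

# In-memory stand-ins for the input and output csv files.
source = io.StringIO(
    "title1,title2,title3,Table or no table?,title4\n"
    "data,text,data,the cat sits on the table,text,data\n"
    "data,text,data,random string of words,text,data\n"
)
result = io.StringIO()
add_columns(csv.reader(source), csv.writer(result, lineterminator="\n"), NEW_COLUMNS)
print(result.getvalue())
```

Each rule can then hold as many if statements as it needs without the main loop growing, and adding a third column only means adding another pair to `NEW_COLUMNS`.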
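    【Answer 2】:

    Your output_lines2 list contains only one element (because all the lines of the file were already read in the first for loop), so join has no effect on it, and the write statement outputs the single element of the output_lines2 list. Try this: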

    with open("test.csv", "r") as input_file:
        header = input_file.readline()[:-1]  # remove the trailing '\n'
        header += ",Table exists?"
        header += ",Are you sure Table exists?"
        output_lines = [header]
        for line in input_file:
            output_lines.append(line[:-1])
            if 'table' in line.split(",")[3]:
                output_lines[-1] += ",table exists"
            else:
                output_lines[-1] += ",No table found"
            if 'table' in line.split(",")[3]:
                output_lines[-1] += ",table definitely exists"
            else:
                output_lines[-1] += ",No table was not found"
    with open("output.csv", "w") as output_file:
        output_file.write("\n".join(output_lines))
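
The two effects described in this answer can be reproduced in isolation. The sketch below uses an in-memory `io.StringIO` as a stand-in for the opened csv (the sample contents are made up): a second loop over the same file object yields nothing because the read position is already at the end, and `join` called with a single-item list simply returns that item.

```python
import io

# Stand-in for the opened csv file.
input_file = io.StringIO("header\nrow1\nrow2\n")

header = input_file.readline()
first_pass = [line for line in input_file]    # consumes the remaining lines
second_pass = [line for line in input_file]   # nothing left to read

print(len(first_pass), len(second_pass))  # 2 0

# str.join places the string it is called on BETWEEN the list items;
# with only one item there is nothing to separate, so the item comes back as-is.
separator = "\n".join(["a", "b"])
joined = separator.join(["only item"])
print(repr(joined))  # 'only item'
```

Calling `input_file.seek(0)` before the second loop would rewind the file, but doing all the work in one loop, as in the code above, avoids reading the data twice.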
    

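    【Comments】:

    • This is where I kick myself for making the example too simple. In reality, the two new columns aren't added based on a single filter. The criteria for appending data to each column are completely different. (I'll upvote your answer once the question has an accepted answer)
    • Then use the two filters one after the other in a single for loop
    • I don't understand?
    • @GTPE: Maybe it's because of your intellectual disability.
    • @martineau Haha, I'll double my dose of antipsychotics and see if that helps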