【Question title】: Concatenate column values in csv based on grouping condition
【Posted】: 2021-01-17 08:53:45
【Question description】:

I have a csv like the one below (note: the values in the Name column are not limited, i.e. they are not only ABC and DEF):

Name, Type, Text 
ABC, Type A, how
ABC, Type A, are
ABC, Type A, you
ABC, Type B, Your
ABC, Type B, Name?
DEF, Type A, I
DEF, Type A, am
DEF, Type A, good
DEF, Type B, I'm
DEF, Type B, Terminator
... and more 

I want to create another csv file like the one below (i.e., for each Name, concatenate the Text values that share the same Type):

Name, Type, Text
ABC, Type A, how are you
ABC, Type B, Your Name?
DEF, Type A, I am good
DEF, Type B, I'm Terminator
..till the end

I am trying to write a Python script. My attempt so far:

TypeList = ['Type A','Type B']
with open("../doc1.csv", encoding='utf-8', newline='', mode="r") as myfile:
    
    g = csv.reader(myfile)

    with open("../doc2.csv", encoding='utf-8', newline='', mode="w") as myfile:
        h = csv.writer(myfile)
        h.writerow(["Name","Text"])

        for row in g:
            if TypeList[0] in row[1]:    
               Concatenatedtext[0]= Concatenatedtext[0] + ' ' + row[1]

Can someone help me solve this?

【Question discussion】:

    Tags: python csv concatenation string-concatenation text-manipulation


    【Solution 1】:

    Grouping csv rows together is a job for the itertools.groupby function.

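    itertools.groupby takes a key function that defines which rows match, and for each run of consecutive matching rows it yields the key (here, the Name and Type) and the group (the matching rows). Because only adjacent rows are grouped, the input must already be ordered so that rows with the same Name and Type sit next to each other, as they do in the sample data.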

    The operator.itemgetter function can be used to create the key function.
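
    To make the "consecutive rows" behaviour concrete, here is a minimal sketch on in-memory rows. The `rows` list and the `group_texts` helper are hypothetical names made up for this illustration; only `itertools.groupby`, `operator.itemgetter` and `sorted` are standard library calls.

```python
import itertools
import operator

# Hypothetical in-memory rows standing in for parsed CSV lines.
rows = [
    ["ABC", "Type A", "how"],
    ["ABC", "Type A", "are"],
    ["DEF", "Type A", "I"],
    ["ABC", "Type A", "you"],  # same key as the first run, but not adjacent
]

# itemgetter(0, 1) builds a function returning the tuple (row[0], row[1]).
key_func = operator.itemgetter(0, 1)


def group_texts(rows):
    """Join the third column of each consecutive run of rows sharing a key."""
    return [
        (key, " ".join(row[2] for row in group))
        for key, group in itertools.groupby(rows, key=key_func)
    ]


# Without sorting, the ('ABC', 'Type A') key shows up as two separate groups.
unsorted_result = group_texts(rows)

# sorted() is stable, so rows keep their relative order within each key.
sorted_result = group_texts(sorted(rows, key=key_func))

print(unsorted_result)
print(sorted_result)
```

    If the file is not already ordered by Name and Type, sorting with the same key function first is one way to get a single group per key, at the cost of reordering the output by key.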

    import csv
    import itertools
    import operator
    
    # Key function returning the (Name, Type) pair of a row;
    # groupby uses it to decide which consecutive rows belong together.
    key_func = operator.itemgetter(0, 1)
    
    with open('myfile.csv', newline='') as f:
        # skipinitialspace drops the blank that follows each comma in the sample data
        reader = csv.reader(f, skipinitialspace=True)
        # Skip header row
        next(reader)
        for key, group in itertools.groupby(reader, key=key_func):
            text = ' '.join(row[2] for row in group)
            print([key[0], key[1], text])
    

    Output:

    ['ABC', 'Type A', 'how are you']
    ['ABC', 'Type B', 'Your Name?']
    ['DEF', 'Type A', 'I am good']
    ['DEF', 'Type B', "I'm Terminator"]
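
    The question asks for a second csv file rather than printed lists. A minimal sketch of that step follows, under some assumptions: `concatenate_groups` is a hypothetical helper name, rows with the same Name and Type are already adjacent, the header is rewritten as Name, Type, Text, and lines with fewer than three cells are skipped. It takes open file objects so it can be tried on in-memory text.

```python
import csv
import io
import itertools
import operator


def concatenate_groups(src, dst):
    """Read rows from src, join Text per (Name, Type) run, write them to dst."""
    # skipinitialspace strips the blank after each comma in the sample data.
    reader = csv.reader(src, skipinitialspace=True)
    # Assumption: "\n" line endings; the csv module default is "\r\n".
    writer = csv.writer(dst, lineterminator="\n")

    next(reader)  # skip the input header
    writer.writerow(["Name", "Type", "Text"])

    # Assumption: ignore blank or malformed lines with fewer than 3 cells.
    rows = (row for row in reader if len(row) >= 3)

    key_func = operator.itemgetter(0, 1)
    for (name, type_), group in itertools.groupby(rows, key=key_func):
        writer.writerow([name, type_, " ".join(row[2] for row in group)])


# Usage on in-memory text that mirrors the sample from the question.
sample = (
    "Name, Type, Text\n"
    "ABC, Type A, how\n"
    "ABC, Type A, are\n"
    "ABC, Type A, you\n"
    "ABC, Type B, Your\n"
    "ABC, Type B, Name?\n"
    "DEF, Type A, I\n"
    "DEF, Type A, am\n"
    "DEF, Type A, good\n"
    "DEF, Type B, I'm\n"
    "DEF, Type B, Terminator\n"
)
out = io.StringIO()
concatenate_groups(io.StringIO(sample), out)
result = out.getvalue()
print(result)
```

    With real files, the same function could be called with two handles opened with newline='', the second one in mode "w", for example the doc1.csv and doc2.csv paths used in the question.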
    

    【Discussion】:
