【Question title】: Obtain and store adjacent values from a .csv file (Python)
【Posted】: 2013-08-14 11:02:43
【Question】:

This is probably easier to explain with the .csv in question:

https://www.dropbox.com/s/iswvm4xyjnlhj2w/speciesandbss.csv

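The file above lists end-Cretaceous bivalves together with the bed shear stress at the locations where they were collected.

I'm trying to create an occurrence plot, so I need to format my data with the species name in one column and the corresponding lowest and highest bed shear stress values next to it (the same species occurs multiple times in the dataset).

Doing this by hand would obviously be very tedious.

How can I write a loop that appends each occurrence to a separate list, named after the species that the bed shear stress belongs to? I could then iterate over each list to find the highest and lowest values.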

Input:

eggs 0.1
ham 0.2
ham 0.5
eggs 0.7
eggs 0.3

Output:

eggs = [0.1, 0.7, 0.3]
ham = [0.2, 0.5]

【Comments】:

    Tags: python csv


    【Solution 1】:

    Collect the values into a dictionary of lists; a collections.defaultdict() object is easiest:

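    from collections import defaultdict
    import csv
    
    # Maps each species name to the list of values seen for it
    species = defaultdict(list)
    
    # Python 3: open in text mode with newline='' (Python 2 used 'rb')
    with open('speciesandbss.csv', newline='') as inputfile:
        for row in csv.reader(inputfile):
            species[row[0]].append(row[1])
    
    # Print the names in case-insensitive alphabetical order
    for name in sorted(species, key=str.lower):
        print('{} = {}'.format(name, species[name]))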
    

    Output:

    acutata = ['0.16509', '0.16509', '0.16509']
    acutocostata = ['0.03145', '0.01936', '0.01781', '0.01698', '0.01684', '0.01077']
    adkinsi = ['0.16509']
    Aenona = ['0.01311', '0.01311']
    aequilateralis = ['0.00495', '0.00445', '0.00368', '0.00356']
    agdjakendensis = ['0.00628']
    Agerostrea = ['0.01764']
    albertensis = ['0.00852', '0.00356', '0.00495', '0.00461', '0.00445', '0.0041']
    alta = ['0.00328', '0.33148', '0.33148', '0.43129', '0.33148', '0.325', '0.17882', '0.00307']
    alternata = ['0.04929', '0.03373', '0.01311']
    americana = ['0.01497', '0.00436', '0.01497', '0.00495', '0.00461', '0.00445', '0.00105']
    anacachoensis = ['0.05696', '0.05696', '0.05172', '0.03373']
    angulatum = ['0.01179']
    anomala = ['0.00852']
    Anomia = ['0.00852', '0.00506', '0.02955', '0.00786']
    anteradiata = ['0.43129', '0.16509']
    antroea = ['0.01373']
    antrosa = ['0.01103']
    Aphrodina = ['0.43129', '0.01311']
    apressus = ['0.01564']
    Arca = ['0.01179', '0.01311', '0.01311', '0.01311', '0.01224', '0.01224']
    archeri = ['0.16509', '0.16509', '0.16509', '0.16509', '0.16509', '0.16509', '0.16509']
    Arctica = ['0.00203']
    argentaria = ['0.01233', '0.33148', '0.33148', '0.33148', '0.33148', '0.43129', '0.43129', '0.21502', '0.01311', '0.01224', '0.00352', '0.01311', '0.01311', '0.01179', '0.01373', '0.01311', '0.01311', '0.01224', '0.01224', '0.01224', '0.01224', '0.16509', '0.01564']
    armatum = ['0.33148', '0.33148', '0.33148', '0.33148']
    Ascaulocardium = ['0.43129', '0.21502']
    assiniboiensis = ['0.00401', '0.00436', '0.00436', '0.00685', '0.00495', '0.00495', '0.00486', '0.00461', '0.00453', '0.00445']
    assiniboinensis = ['0.00117']
    Astarte = ['0.01497', '0.01311']
    balchii = ['0.00786']
    balticus = ['0.05696', '0.05696', '0.05696', '0.05238', '0.03623', '0.03373', '0.00724', '0.04574']
    barabini = ['0.01233', '0.00852', '0.00506']
    Barbatia = ['0.05696', '0.03373', '0.18121', '0.17882', '0.01224']
    bartoni = ['0.16509']
    bartrami = ['0.325', '0.26095', '0.25697', '0.17882', '0.01311']
    bella = ['0.01311', '0.01311', '0.01311', '0.01311', '0.01311', '0.01311', '0.01311', '0.01311', '0.01764']
    bellisculptus = ['0.05696', '0.05696', '0.03373', '0.25697', '0.01311', '0.01224', '0.01311', '0.01224', '0.01179', '0.01311', '0.01311', '0.01311', '0.01311', '0.01311']
    berryi = ['0.43129', '0.17882']
    biplicata = ['0.05696', '0.03373', '0.33148', '0.33148', '0.43129', '0.01224', '0.16509', '0.16509', '0.16509', '0.16509', '0.16509', '0.16509']
    bisulcata = ['0.01311', '0.01224', '0.01224']
    borealis = ['0.01233', '0.00852', '0.00452', '0.00401', '0.00852', '0.00452', '0.00401', '0.00436', '0.01497', '0.02971']
    bowiei = ['0.16509', '0.16509']
    Breviarca = ['0.01311', '0.01311']
    Brevicardium = ['0.16509']
    brevifrons = ['0.43129']
    bryani = ['0.16509']
    bulbosa = ['0.16509', '0.16509', '0.16509']
    burlingtonensis = ['0.05696', '0.05696', '0.04929', '0.03373', '0.325', '0.01311', '0.01179', '0.04574']
    

    and so on.

    To write out the lowest and highest values:
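
    As a quick check, the same grouping can be run on the small eggs/ham sample from the question. This is only a sketch: the sample is assumed to be comma-separated like the real file, `SAMPLE` and `group_values` are hypothetical names, and the values are converted to float here, which the code above does not do.

```python
import csv
import io
from collections import defaultdict

# Hypothetical in-memory stand-in for the question's sample input,
# assumed to be comma-separated like the real file.
SAMPLE = "eggs,0.1\nham,0.2\nham,0.5\neggs,0.7\neggs,0.3\n"


def group_values(lines):
    """Group the second column by the name found in the first column."""
    groups = defaultdict(list)
    for row in csv.reader(lines):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        groups[row[0]].append(float(row[1]))
    return groups


groups = group_values(io.StringIO(SAMPLE))
for name in sorted(groups, key=str.lower):
    print('{} = {}'.format(name, groups[name]))
# prints:
# eggs = [0.1, 0.7, 0.3]
# ham = [0.2, 0.5]
```

    Converting to float at read time means later min() and max() calls compare numbers rather than strings.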

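    # Python 3: text mode with newline=''; list() lets both min() and max() read the floats
    with open('outputfile.csv', 'w', newline='') as outputfile:
        writer = csv.writer(outputfile)
        writer.writerows([n, min(values), max(values)] for n, v in species.items() for values in (list(map(float, v)),))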
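
    The same min/max step can also be written as an explicit loop, which may be easier to read. The sketch below writes to an in-memory buffer instead of a file so it can be run anywhere; `write_ranges` and the `sample` dictionary are hypothetical names, and sorting the names is an added choice, not part of the answer above.

```python
import csv
import io


def write_ranges(groups, outputfile):
    """Write one row per name: name, lowest value, highest value."""
    writer = csv.writer(outputfile)
    for name in sorted(groups, key=str.lower):
        # The csv module yields strings, so convert before comparing.
        values = [float(v) for v in groups[name]]
        writer.writerow([name, min(values), max(values)])


# Hypothetical stand-in for the species dictionary built earlier.
sample = {'eggs': ['0.1', '0.7', '0.3'], 'ham': ['0.2', '0.5']}

buffer = io.StringIO()
write_ranges(sample, buffer)
print(buffer.getvalue())
```

    To write to disk instead, pass a file opened with open('outputfile.csv', 'w', newline='') in place of the buffer.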
    

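    【Discussion】:

    • Will they be collected under the same key?
    • @EJMC: They are all collected under the name used in the first column.
    • This may be a bit of a jump, but how would I then format it and output a .csv with the species name in column 1, the minimum of the list in column 2, and the maximum of the list in column 3?
    • @EJMC: Sure, use csv.writer: writer.writerow([name, min(species[name]), max(species[name])]). :-)
    • @EJMC: Added a 3-line version.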