Python: compare strings in a table and return the best string

【Question title】: python compare strings in a table and return the best string
【Posted】: 2015-03-11 05:44:55
【Question】:

I have a table with four space-separated columns:

A1 3445 1  24
A1 3445 1 214
A2 3603 2  45
A2 3603 2 144
A0 3314 3   8
A0 3314 3 134
A0 3314 4  46

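I want to compare the last column among rows that share the same ID in the first column (e.g. A1) and return the row with the largest number for each ID. So the final result would look like this: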

A1 3445 1 214
A2 3603 2 144
A0 3314 3 134

I have already split the lines, but I don't know how to compare them. Any help would be appreciated.

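【Question comments】:

What is a "data table"?
I changed "data table" to just "table".
Is it a pandas DataFrame, a CSV file, a nested list...?
It is just a space-delimited file.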

Tags: python compare

【Solution 1】:

Use the sorted function with the last column as the key:

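# Read the whitespace-separated file into a list of rows, skipping blank lines.
with open('a.txt', 'r') as a:  # 'a.txt' is your file
    table = [line.split() for line in a if line.strip()]

# Sort all rows by the last column (as an integer), largest first.
s = sorted(table, key=lambda x: int(x[-1]), reverse=True)
for r in s:
    print('\t'.join(r))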

Result:

A1  3445    1   214
A2  3603    2   144
A0  3314    3   134
A0  3314    4   46
A2  3603    2   45
A1  3445    1   24
A0  3314    3   8

【Comments】:

Thanks, but this is not a sorting problem; I don't need the last four rows.
You can redefine s: s = sorted(table, key=lambda x: int(x[-1]), reverse=True)[:3]
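
Slicing off the first three rows only works for this sample because the three largest values happen to belong to three different IDs. A minimal sketch that keeps the sorted approach but retains only the first (largest) row seen for each ID; the helper name `best_per_id` and the inline sample are assumptions for illustration, not part of the answer above:

```python
def best_per_id(table):
    """Return the row with the largest last column for each ID in column 0."""
    best = {}
    # Sort by the last column, largest first, so the first row seen per ID wins.
    for row in sorted(table, key=lambda r: int(r[-1]), reverse=True):
        best.setdefault(row[0], row)
    return list(best.values())


# Sample data copied from the question, split on any run of whitespace.
sample = """A1 3445 1  24
A1 3445 1 214
A2 3603 2  45
A2 3603 2 144
A0 3314 3   8
A0 3314 3 134
A0 3314 4  46"""

table = [line.split() for line in sample.splitlines()]
for row in best_per_id(table):
    print(' '.join(row))
```

This relies on dictionaries preserving insertion order (Python 3.7+), so the rows come out ordered by descending value.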

【Solution 2】:

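data_dic = {}
with open('1.txt') as f:
    for line in f:
        # split() with no argument handles runs of spaces; split(" ") would not.
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        row_id, a, b, num = fields
        num = int(num)
        # Keep the row if this ID is new or the value is at least as large.
        if row_id not in data_dic or num >= data_dic[row_id][-1]:
            data_dic[row_id] = [a, b, num]

print(data_dic)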

I think this result may be what you want.
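
The dictionary prints as a mapping rather than as the table shown in the question. A small sketch of turning it back into space-separated lines; `format_rows` is a hypothetical helper, and the literal dictionary below is the assumed result of running the code above on the sample data:

```python
def format_rows(data_dic):
    """Turn {id: [a, b, num]} into space-separated lines, one per ID."""
    return [' '.join([row_id] + [str(v) for v in values])
            for row_id, values in data_dic.items()]


# Assumed shape of the dictionary built from the question's sample data.
data_dic = {
    'A1': ['3445', '1', 214],
    'A2': ['3603', '2', 144],
    'A0': ['3314', '3', 134],
}
for line in format_rows(data_dic):
    print(line)
```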

【Comments】:

【Solution 3】:

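from itertools import groupby

data = [('A1', 3445, 1, 24),  ('A1', 3445, 1, 214), ('A2', 3603, 2, 45),
        ('A2', 3603, 2, 144), ('A0', 3314, 3, 8),   ('A0', 3314, 3, 134),
        ('A0', 3314, 4, 46)]

# groupby merges only adjacent rows with the same key; the rows here are
# already grouped by ID, so no sorting by ID is needed first.
for key, group in groupby(data, lambda x: x[0]):
    # Take the row with the largest last field in each group.
    print(max(group, key=lambda x: x[-1]))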

The output is:

('A1', 3445, 1, 214)
('A2', 3603, 2, 144)
('A0', 3314, 3, 134)

You can use the groupby function for this.
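
One caveat: itertools.groupby only groups adjacent items with equal keys, so if rows for the same ID are not contiguous they end up in separate groups. A hedged sketch that sorts by ID first; the function name `max_per_group` and the interleaved sample rows are assumptions for illustration:

```python
from itertools import groupby
from operator import itemgetter


def max_per_group(rows):
    """Return the row with the largest last field for each ID (first field)."""
    # groupby only merges adjacent rows, so sort by ID before grouping.
    ordered = sorted(rows, key=itemgetter(0))
    return [max(group, key=itemgetter(-1))
            for _, group in groupby(ordered, key=itemgetter(0))]


# Hypothetical input: the question's rows with the IDs interleaved.
rows = [('A1', 3445, 1, 24), ('A0', 3314, 3, 8), ('A2', 3603, 2, 45),
        ('A1', 3445, 1, 214), ('A0', 3314, 3, 134), ('A2', 3603, 2, 144),
        ('A0', 3314, 4, 46)]
for row in max_per_group(rows):
    print(row)
```

Because of the sort, the results come out ordered by ID (A0, A1, A2) rather than in the order the IDs first appear in the input.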

【Comments】: