Parsing CSV and Analysing Data: Answers

【Question Title】: Parsing CSV and Analysing Data
【Posted】: 2017-09-05 01:42:40
【Question】:

I'm working through some exercises on Hacker Rank, but I can't figure out why it won't accept my answer.

Here is the link to the original repository.

The goal is to print the name of the team with the smallest difference between its "Goals" and "Goals Allowed" values.

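There seem to be two candidates, Leicester and Aston_Villa: Leicester has the lowest signed difference between goals and goals allowed (-34), while Aston_Villa has the smallest absolute difference (-1). However, neither of these is accepted.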
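As a quick sanity check on the two candidates, the differences can be computed directly from their rows in the table. This is only a sketch: the `rows` dictionary is a hypothetical hand-copied subset of the data and is not part of the code posted below.

```python
# Hypothetical hand-copied subset of the table: team -> (Goals, Goals Allowed).
rows = {
    "Aston_Villa": (46, 47),
    "Leicester": (30, 64),
}

# Signed difference: goals scored minus goals allowed.
differences = {team: scored - allowed for team, (scored, allowed) in rows.items()}

# Print the signed and the absolute difference side by side.
for team, diff in differences.items():
    print(team, diff, abs(diff))
```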

Any ideas?

import sys
import os
import csv

text = '''Team,Games,Wins,Losses,Draws,Goals,Goals Allowed,Points
Arsenal,38,26,9,3,79,36,87
Liverpool,38,24,8,6,67,30,80
Manchester United,38,24,5,9,87,45,77
Newcastle,38,21,8,9,74,52,71
Leeds,38,18,12,8,53,37,66
Chelsea,38,17,13,8,66,38,64
West_Ham,38,15,8,15,48,57,53
Aston_Villa,38,12,14,12,46,47,50
Tottenham,38,14,8,16,49,53,50
Blackburn,38,12,10,16,55,51,46
Southampton,38,12,9,17,46,54,45
Middlesbrough,38,12,9,17,35,47,45
Fulham,38,10,14,14,36,44,44
Charlton,38,10,14,14,38,49,44
Everton,38,11,10,17,45,57,43
Bolton,38,9,13,16,44,62,40
Sunderland,38,10,10,18,29,51,40
Ipswich,38,9,9,20,41,64,36
Derby,38,8,6,24,33,63,30
Leicester,38,5,13,20,30,64,28'''

with open('football.csv', 'w') as f:
    f.write(text)



def read_data(filename):
    """Returns a list of lists representing the rows of the csv file data.

    Arguments: filename is the name of a csv file (as a string)
    Returns: list of lists of strings, where every line is split into a list of values. 
        ex: ['Arsenal', 38, 26, 9, 3, 79, 36, 87]
    """ 
    ifile = open('football.csv', 'rt')
    reader = csv.reader(ifile)

    listed = []
    for row in reader:
        print(row)
        listed.append(row)

    return listed

data = read_data('football.csv')

def get_index_with_min_abs_score_difference(goals):
    net_goals = []

    for i in goals[1:]:
        net_goals.append(int(i[5]) - int(i[6]))

    return net_goals.index(min(net_goals))+1

def get_team(index_value, parsed_data):
    return parsed_data[index_value][0]

footballTable = read_data('football.csv')
minRow = get_index_with_min_abs_score_difference(footballTable)
print(str(get_team(minRow, footballTable)))

I also tried the alternative solution (i.e. the team with the smallest absolute difference between goals and goals allowed).

def get_index_with_min_abs_score_difference(goals):
    """Returns the index of the team with the smallest difference
    between 'for' and 'against' goals, in terms of absolute value.

    Arguments: parsed_data is a list of lists of cleaned strings
    Returns: integer row index
    """
    net_goals = []

    for i in goals[1:]:
        net_goals.append(abs(int(i[5]) - int(i[6])))

    return net_goals.index(min(net_goals))+1

【Question Comments】:

Welcome to StackOverflow! This is a very well-written question. A link to the challenge itself would be a good addition to your post.

Tags: python csv indexing

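【Solution 1】:

This isn't exactly an answer, but I have a few comments on your solution.

You spend a lot of lines reading the csv file row by row just to put it into a list (which you then process item by item later), and you need special logic to skip the header row. Your solution would be much simpler if you used csv.DictReader and worked directly with the iterator it returns instead of reading everything into a list first. Consider the output of the following: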

with open('football.csv', 'rt') as ifile:
    footballTable = csv.DictReader(ifile)
    for row in footballTable:
        print(row)

This will show you something like the following (the key order may vary with your Python version):

{'Draws': '3', 'Wins': '26', 'Losses': '9', 'Goals Allowed': '36', 'Points': '87', 'Games': '38', 'Goals': '79', 'Team': 'Arsenal'}
{'Draws': '6', 'Wins': '24', 'Losses': '8', 'Goals Allowed': '30', 'Points': '80', 'Games': '38', 'Goals': '67', 'Team': 'Liverpool'}
{'Draws': '9', 'Wins': '24', 'Losses': '5', 'Goals Allowed': '45', 'Points': '77', 'Games': '38', 'Goals': '87', 'Team': 'Manchester United'}
...

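You'll notice that:

The header row is handled for you automatically.
You can now refer to columns by name instead of relying on magic indices in your code (i[5]). That is, you can ask for i['Goals'] or i['Goals Allowed'].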
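A minimal sketch of that name-based access, assuming a two-line in-memory sample (the `sample` string and the `io.StringIO` wrapper are only there so the snippet runs without the file):

```python
import csv
import io

# Illustrative in-memory sample: the header plus one row from the table.
sample = (
    "Team,Games,Wins,Losses,Draws,Goals,Goals Allowed,Points\n"
    "Arsenal,38,26,9,3,79,36,87\n"
)

for row in csv.DictReader(io.StringIO(sample)):
    # DictReader yields strings, so convert before doing arithmetic.
    goals = int(row['Goals'])
    goals_allowed = int(row['Goals Allowed'])
    print(row['Team'], goals - goals_allowed)
```

Note that the values still have to be passed through int(), exactly as with the index-based version.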

Add just a few lines of code inside that loop and you'll have your solution.
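One way those few lines could look, as a sketch rather than a definitive answer. It assumes the challenge wants the smallest absolute difference (the question leaves open whether the signed or the absolute difference is meant), and it reads a shortened in-memory sample instead of football.csv so that it is self-contained. The helper name `team_with_min_abs_difference` is made up for this example.

```python
import csv
import io

# Shortened, illustrative sample of the table from the question.
sample = """Team,Games,Wins,Losses,Draws,Goals,Goals Allowed,Points
Arsenal,38,26,9,3,79,36,87
Aston_Villa,38,12,14,12,46,47,50
Tottenham,38,14,8,16,49,53,50
Leicester,38,5,13,20,30,64,28
"""

def team_with_min_abs_difference(csv_file):
    """Return the team whose |Goals - Goals Allowed| is smallest."""
    best_team, best_diff = None, None
    for row in csv.DictReader(csv_file):
        diff = abs(int(row['Goals']) - int(row['Goals Allowed']))
        # Keep the first row seen with the smallest difference so far.
        if best_diff is None or diff < best_diff:
            best_team, best_diff = row['Team'], diff
    return best_team

result = team_with_min_abs_difference(io.StringIO(sample))
print(result)
```

With the real file, the same function could be called on the object returned by open('football.csv', newline=''). Dropping abs() from the loop would give the signed variant instead.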

【Comments】: