【Title】: How to abstract over two similar functions
【Posted】: 2020-06-08 23:56:12
【Question】:

I have the following data definition for football games:

Game = namedtuple('Game', ['Date', 'Home', 'Away', 'HomeShots', 'AwayShots',
                           'HomeBT', 'AwayBT', 'HomeCrosses', 'AwayCrosses',
                           'HomeCorners', 'AwayCorners', 'HomeGoals',
                           'AwayGoals', 'HomeXG', 'AwayXG'])

Here are some examples:

[Game(Date=datetime.date(2018, 10, 21), Home='Everton', Away='Crystal Palace', HomeShots='21', AwayShots='6', HomeBT='22', AwayBT='13', HomeCrosses='21', AwayCrosses='14', HomeCorners='10', AwayCorners='5', HomeGoals='2', AwayGoals='0', HomeXG='1.93', AwayXG='1.5'),
 Game(Date=datetime.date(2019, 2, 27), Home='Man City', Away='West Ham', HomeShots='20', AwayShots='2', HomeBT='51', AwayBT='6', HomeCrosses='34', AwayCrosses='5', HomeCorners='12', AwayCorners='2', HomeGoals='1', AwayGoals='0', HomeXG='3.68', AwayXG='0.4'),
 Game(Date=datetime.date(2019, 2, 9), Home='Fulham', Away='Man Utd', HomeShots='12', AwayShots='15', HomeBT='19', AwayBT='38', HomeCrosses='20', AwayCrosses='12', HomeCorners='5', AwayCorners='4', HomeGoals='0', AwayGoals='3', HomeXG='2.19', AwayXG='2.13'),
 Game(Date=datetime.date(2019, 3, 9), Home='Southampton', Away='Tottenham', HomeShots='12', AwayShots='15', HomeBT='13', AwayBT='17', HomeCrosses='15', AwayCrosses='15', HomeCorners='1', AwayCorners='10', HomeGoals='2', AwayGoals='1', HomeXG='2.08', AwayXG='1.27'),
 Game(Date=datetime.date(2018, 9, 22), Home='Man Utd', Away='Wolverhampton', HomeShots='16', AwayShots='11', HomeBT='17', AwayBT='17', HomeCrosses='26', AwayCrosses='13', HomeCorners='5', AwayCorners='4', HomeGoals='1', AwayGoals='1', HomeXG='0.62', AwayXG='1.12')]

I also have two nearly identical functions that calculate the home and away stats for a given team.

def calculate_home_stats(team, games):
    """
    Calculates home stats for the given team.
    """
    home_stats = defaultdict(float)

    home_stats['HomeShotsFor'] = sum(int(game.HomeShots) for game in games if game.Home == team)
    home_stats['HomeShotsAgainst'] = sum(int(game.AwayShots) for game in games if game.Home == team)
    home_stats['HomeBoxTouchesFor'] = sum(int(game.HomeBT) for game in games if game.Home == team)
    home_stats['HomeBoxTouchesAgainst'] = sum(int(game.AwayBT) for game in games if game.Home == team)
    home_stats['HomeCrossesFor'] = sum(int(game.HomeCrosses) for game in games if game.Home == team)
    home_stats['HomeCrossesAgainst'] = sum(int(game.AwayCrosses) for game in games if game.Home == team)
    home_stats['HomeCornersFor'] = sum(int(game.HomeCorners) for game in games if game.Home == team)
    home_stats['HomeCornersAgainst'] = sum(int(game.AwayCorners) for game in games if game.Home == team)
    home_stats['HomeGoalsFor'] = sum(int(game.HomeGoals) for game in games if game.Home == team)
    home_stats['HomeGoalsAgainst'] = sum(int(game.AwayGoals) for game in games if game.Home == team)
    home_stats['HomeXGoalsFor'] = sum(float(game.HomeXG) for game in games if game.Home == team)
    home_stats['HomeXGoalsAgainst'] = sum(float(game.AwayXG) for game in games if game.Home == team)
    home_stats['HomeGames'] = sum(1 for game in games if game.Home == team)

    return home_stats


def calculate_away_stats(team, games):
    """
    Calculates away stats for the given team.
    """
    away_stats = defaultdict(float)

    away_stats['AwayShotsFor'] = sum(int(game.AwayShots) for game in games if game.Away == team)
    away_stats['AwayShotsAgainst'] = sum(int(game.HomeShots) for game in games if game.Away == team)
    away_stats['AwayBoxTouchesFor'] = sum(int(game.AwayBT) for game in games if game.Away == team)
    away_stats['AwayBoxTouchesAgainst'] = sum(int(game.HomeBT) for game in games if game.Away == team)
    away_stats['AwayCrossesFor'] = sum(int(game.AwayCrosses) for game in games if game.Away == team)
    away_stats['AwayCrossesAgainst'] = sum(int(game.HomeCrosses) for game in games if game.Away == team)
    away_stats['AwayCornersFor'] = sum(int(game.AwayCorners) for game in games if game.Away == team)
    away_stats['AwayCornersAgainst'] = sum(int(game.HomeCorners) for game in games if game.Away == team)
    away_stats['AwayGoalsFor'] = sum(int(game.AwayGoals) for game in games if game.Away == team)
    away_stats['AwayGoalsAgainst'] = sum(int(game.HomeGoals) for game in games if game.Away == team)
    away_stats['AwayXGoalsFor'] = sum(float(game.AwayXG) for game in games if game.Away == team)
    away_stats['AwayXGoalsAgainst'] = sum(float(game.HomeXG) for game in games if game.Away == team)
    away_stats['AwayGames'] = sum(1 for game in games if game.Away == team)

    return away_stats

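I would like to know whether there is a way to abstract over these two functions and merge them into one, without creating a wall of if/else statements to determine whether the team is playing at home or away and which fields should be counted.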

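【Question discussion】:

  • I think the problem lies in your data structure. It would probably be a good idea to design it so that this abstraction becomes trivial. For example, away/home has no influence on the rest of the data (goals, shots, etc.).
  • Could you then provide an alternative data definition that would help with the required abstraction?
  • Sure, but I will have to post it as a regular answer; it is a bit long for a comment.
  • Organize your data so that home and away stats can be accessed more uniformly. For example, nest game.HomeStats and game.AwayStats data structures that store home and away stats in the same format, instead of using two separate sets of attributes.
  • Looking forward to seeing it!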

Tags: python python-3.x dry abstraction


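【Solution 1】:

Having a cleaner data structure lets you write simpler code. In this case, your data already contains duplication (for example, you have both HomeShots and AwayShots).

There are many possible answers for how to structure the data here. I will discuss one solution that does not change your original structure too much.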

from collections import namedtuple
from datetime import datetime

Statistics = namedtuple('Statistics', ['shots', 'BT', 'crosses', 'corners', 'goals', 'XG'])
Game = namedtuple('Game', ['home', 'away', 'date', 'home_stats', 'away_stats'])

You can use it like this (I have not included all the statistics here, just a few as an example):

def calculate_stats(games, team_name, home_stats_only=False, away_stats_only=False):

    home_stats = [g.home_stats._asdict() for g in games if g.home == team_name]
    away_stats = [g.away_stats._asdict() for g in games if g.away == team_name]

    if away_stats_only:
        input_stats = away_stats
    elif home_stats_only:
        input_stats = home_stats
    else:
        input_stats = home_stats + away_stats

    def sum_on_field(field_name):
        return sum(stats[field_name] for stats in input_stats)

    return {f:sum_on_field(f) for f in Statistics._fields}

This can then be used to get away/home stats:

example_game_1 = Game(
    home='Burnley', 
    away='Arsenal',
    date=datetime.now(),
    home_stats=Statistics(shots=12, BT=26, crosses=21, corners=4, goals=1, XG=1.73),
    away_stats=Statistics(shots=17, BT=26, crosses=22, corners=5, goals=3, XG=2.87),
)

example_game_2 = Game(
    home='Arsenal',
    away='Pessac',
    date=datetime.now(),
    home_stats=Statistics(shots=1, BT=1, crosses=1, corners=1, goals=1, XG=1),
    away_stats=Statistics(shots=2, BT=2, crosses=2, corners=2, goals=2, XG=2),
)

print(calculate_stats([example_game_1, example_game_2], 'Arsenal'))
print(calculate_stats([example_game_1, example_game_2], 'Arsenal', home_stats_only=True))
print(calculate_stats([example_game_1, example_game_2], 'Arsenal', away_stats_only=True))

Which prints:

{'shots': 18, 'BT': 27, 'crosses': 23, 'corners': 6, 'goals': 4, 'XG': 3.87}
{'shots': 1, 'BT': 1, 'crosses': 1, 'corners': 1, 'goals': 1, 'XG': 1}
{'shots': 17, 'BT': 26, 'crosses': 22, 'corners': 5, 'goals': 3, 'XG': 2.87}
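
If the games are still stored in the flat, string-valued form from the question, they need to be converted before the function above can be used. The following is a minimal sketch of such a conversion, not part of the original answer. The names `FlatGame`, `side_statistics` and `to_nested` are hypothetical, and it assumes every count parses with `int` and every XG value with `float`.

```python
import datetime
from collections import namedtuple

# The flat record from the question, renamed FlatGame here (hypothetical name)
# so it does not clash with the nested Game defined in this answer.
FlatGame = namedtuple('FlatGame', ['Date', 'Home', 'Away', 'HomeShots', 'AwayShots',
                                   'HomeBT', 'AwayBT', 'HomeCrosses', 'AwayCrosses',
                                   'HomeCorners', 'AwayCorners', 'HomeGoals',
                                   'AwayGoals', 'HomeXG', 'AwayXG'])
Statistics = namedtuple('Statistics', ['shots', 'BT', 'crosses', 'corners', 'goals', 'XG'])
Game = namedtuple('Game', ['home', 'away', 'date', 'home_stats', 'away_stats'])


def side_statistics(flat, side):
    """Build a Statistics record for one side ('Home' or 'Away') of a flat game."""
    return Statistics(
        shots=int(getattr(flat, side + 'Shots')),
        BT=int(getattr(flat, side + 'BT')),
        crosses=int(getattr(flat, side + 'Crosses')),
        corners=int(getattr(flat, side + 'Corners')),
        goals=int(getattr(flat, side + 'Goals')),
        XG=float(getattr(flat, side + 'XG')),
    )


def to_nested(flat):
    """Convert one flat game into the nested Game structure."""
    return Game(home=flat.Home, away=flat.Away, date=flat.Date,
                home_stats=side_statistics(flat, 'Home'),
                away_stats=side_statistics(flat, 'Away'))


# One of the example games from the question.
flat_game = FlatGame(Date=datetime.date(2018, 10, 21), Home='Everton', Away='Crystal Palace',
                     HomeShots='21', AwayShots='6', HomeBT='22', AwayBT='13',
                     HomeCrosses='21', AwayCrosses='14', HomeCorners='10', AwayCorners='5',
                     HomeGoals='2', AwayGoals='0', HomeXG='1.93', AwayXG='1.5')
nested_game = to_nested(flat_game)
```

The strings are converted to numbers once, at conversion time, so the summing code no longer has to call `int` or `float` on every access.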

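When working with this kind of data, it is usually better to use a dedicated tool such as pandas. Interactive tools such as JupyterLab are also convenient.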

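【Discussion】:

  • I'm pretty sure the home and away stats are for the two different teams in a single game, not for two separate games.
  • Ah, that makes sense indeed.
  • @user2357112 supports Monica This! Given a regular championship with 20 teams, each team plays 19 home games and 19 away games, and my function calculates a summary for them. There is no such thing as a Match in my program.
  • I was staring at your code trying to figure out what was going on :) It looks like you treated it as a knockout competition rather than a summary of a regular championship.
  • @cglacet I added more examples to the code to make it clearer.
【Solution 2】: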

Instead of using a namedtuple, I suggest using a plain tuple together with dictionaries, for example:

game=(datetime.date(2019, 5, 12), 'Burnley', 'Arsenal', '12', '17', '26', '26', '21', '22', '4', '5', '1', '3', '1.73', '2.87')

And mapping dictionaries:

numtostr={0: 'Date', 1: 'Home', 2: 'Away', 3: 'HomeShots', 4: 'AwayShots', 5: 'HomeBT', 6: 'AwayBT', 7: 'HomeCrosses', 8: 'AwayCrosses', 9: 'HomeCorners', 10: 'AwayCorners', 11: 'HomeGoals', 12: 'AwayGoals', 13: 'HomeXG', 14: 'AwayXG'}
strtonum={'Date': 0, 'Home': 1, 'Away': 2, 'HomeShots': 3, 'AwayShots': 4, 'HomeBT': 5, 'AwayBT': 6, 'HomeCrosses': 7, 'AwayCrosses': 8, 'HomeCorners': 9, 'AwayCorners': 10, 'HomeGoals': 11, 'AwayGoals': 12, 'HomeXG': 13, 'AwayXG': 14}

Make similar mapping dictionaries for home_stats and away_stats ({0: 'HomeShotsFor', 1: 'HomeShotsAgainst', etc.} for home_stats). To explain how the mapping dictionaries work: if you want to get the HomeCrosses of a game, for example, you can use either of the following:

game[7]

game[strtonum['HomeCrosses']]
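
Writing both mapping dictionaries by hand makes it easy to drop an entry or let the two drift apart. As a small sketch (the `FIELDS` list is an assumed helper name, not part of the original answer), both can be derived from a single list of field names:

```python
import datetime

# Single source of truth for the tuple layout (FIELDS is a hypothetical name).
FIELDS = ['Date', 'Home', 'Away', 'HomeShots', 'AwayShots', 'HomeBT', 'AwayBT',
          'HomeCrosses', 'AwayCrosses', 'HomeCorners', 'AwayCorners',
          'HomeGoals', 'AwayGoals', 'HomeXG', 'AwayXG']

# index -> name
numtostr = dict(enumerate(FIELDS))
# name -> index, built by inverting the first mapping
strtonum = {name: index for index, name in numtostr.items()}

game = (datetime.date(2019, 5, 12), 'Burnley', 'Arsenal', '12', '17', '26', '26',
        '21', '22', '4', '5', '1', '3', '1.73', '2.87')

home_crosses = game[strtonum['HomeCrosses']]  # same element as game[7]
```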

Then the functions:

def calculate_home_stats(team, games):
    home_stats=[0]*13
    for game in games:
        if game[1]==team:  # index 1 is the home team
            for index in range(12):
                # Sum everything except date, home and away, which are the first 3 indices.
                # The values are stored as strings, so convert them before adding.
                home_stats[index]+=float(game[index+3])
            home_stats[12]+=1  # number of home games
    return home_stats

def calculate_away_stats(team, games):
    away_stats=[0]*13
    for game in games:
        if game[2]==team:  # index 2 is the away team
            for index in range(12):
                away_stats[index]+=float(game[index+3])
            away_stats[12]+=1  # number of away games
    return away_stats

If you really want to merge the two functions into one, you can do this:

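def calculate_stats(team, games, homeaway):
    stats=[0]*13
    team_index={'Home': 1, 'Away': 2}[homeaway]  # which tuple slot holds the team to match
    for game in games:
        if game[team_index]==team:
            for index in range(12):
                stats[index]+=float(game[index+3])  # values are stored as strings
            stats[12]+=1  # number of matching games
    return stats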

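As with my functions, the only thing you need to change is the index that is checked for home or away, rather than a redundant set of if/else statements that would need a lot of changes.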
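
The same idea of passing the side as a parameter can also be sketched against the question's original namedtuple, looking fields up by name with `getattr` so that the For/Against labels are preserved. This is only an illustrative sketch under stated assumptions, not the method of this answer: `STAT_FIELDS` and `calculate_side_stats` are hypothetical names, and `defaultdict(int)` is used so that integer sums stay integers.

```python
import datetime
from collections import defaultdict, namedtuple

Game = namedtuple('Game', ['Date', 'Home', 'Away', 'HomeShots', 'AwayShots',
                           'HomeBT', 'AwayBT', 'HomeCrosses', 'AwayCrosses',
                           'HomeCorners', 'AwayCorners', 'HomeGoals',
                           'AwayGoals', 'HomeXG', 'AwayXG'])

# (label used in the result keys, field suffix on Game, string-to-number converter)
STAT_FIELDS = [
    ('Shots', 'Shots', int),
    ('BoxTouches', 'BT', int),
    ('Crosses', 'Crosses', int),
    ('Corners', 'Corners', int),
    ('Goals', 'Goals', int),
    ('XGoals', 'XG', float),
]


def calculate_side_stats(team, games, side):
    """Sum the stats of `team` over its games on `side` ('Home' or 'Away')."""
    if side not in ('Home', 'Away'):
        raise ValueError("side must be 'Home' or 'Away'")
    other = 'Away' if side == 'Home' else 'Home'
    stats = defaultdict(int)
    for game in games:
        if getattr(game, side) != team:
            continue
        for label, suffix, convert in STAT_FIELDS:
            # "For" comes from the team's own side, "Against" from the opponent's.
            stats[side + label + 'For'] += convert(getattr(game, side + suffix))
            stats[side + label + 'Against'] += convert(getattr(game, other + suffix))
        stats[side + 'Games'] += 1
    return stats


# Two of the example games from the question.
games = [
    Game(Date=datetime.date(2018, 10, 21), Home='Everton', Away='Crystal Palace',
         HomeShots='21', AwayShots='6', HomeBT='22', AwayBT='13', HomeCrosses='21',
         AwayCrosses='14', HomeCorners='10', AwayCorners='5', HomeGoals='2',
         AwayGoals='0', HomeXG='1.93', AwayXG='1.5'),
    Game(Date=datetime.date(2019, 2, 9), Home='Fulham', Away='Man Utd',
         HomeShots='12', AwayShots='15', HomeBT='19', AwayBT='38', HomeCrosses='20',
         AwayCrosses='12', HomeCorners='5', AwayCorners='4', HomeGoals='0',
         AwayGoals='3', HomeXG='2.19', AwayXG='2.13'),
]
everton_home = calculate_side_stats('Everton', games, 'Home')
man_utd_away = calculate_side_stats('Man Utd', games, 'Away')
```

One difference from the question's functions: if the team has no games on the given side, no keys are created up front; the `defaultdict` simply returns 0 for any key that is looked up.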

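【Discussion】:

  • Honestly, I don't see what has changed. We still have two nearly identical functions, and they look much less readable. It is also unclear what is wrong with using namedtuple.
  • I edited the answer to address the question better. I made the functions less verbose and therefore easier to change, which allows merging the two functions without having to change all the if/else statements. I didn't use namedtuple because it would mean I couldn't really do stats[index]+=game[index+3] with them, which leads to all the if/else statements we very much don't want.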