【Question Title】: Loops using Numpy Vectorization
【Posted】: 2020-04-09 23:04:19
【Question】:

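I am trying to reproduce the results of combining loops with numpy vectorization, as described in this article (https://towardsdatascience.com/how-to-make-your-pandas-loop-71-803-times-faster-805030df4f06). The article does not include the data it was run on or the results, but I was able to find the data online. I want to reproduce the results for my own work, but the output is incorrect.

I have included a small portion of the original dataframe and the corresponding code from the article: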

import pandas as pd 
data = {'HomeTeam':['Burnley','Crystal Palace','Everton','Hull','Man City','Middlesbrough','Southampton',
'Arsenal','Bournemouth','Chelsea','Man United','Burnley','Leicester','Stoke'], 'AwayTeam':['Swansea','West Brom','Tottenham','Leicester','Sunderland','Stoke','Watford','Liverpool','Man United',
'West Ham','Southampton','Liverpool','Arsenal','Man City'], 'FTR': ['A','A','D','H','H','D','D','A','A','H','H','H','D','A']} 

leaguedf = pd.DataFrame(data) 

def soc_iter(TEAM,home,away,ftr):
    leaguedf['Draws'] = 'No_Game'
    leaguedf.loc[((home == TEAM) & (ftr == 'D')) | ((away == TEAM) & (ftr == 'D')), 'Draws'] = 'Draw'
    leaguedf.loc[((home == TEAM) & (ftr != 'D')) | ((away == TEAM) & (ftr != 'D')), 'Draws'] = 'No_Draw'

leaguedf['Draws']=soc_iter('Arsenal',leaguedf['HomeTeam'].values, leaguedf['AwayTeam'].values, leaguedf['FTR'].values)
leaguedf

When I run the code, the output column 'Draws' contains only None, rather than 'Draw' or 'No_Draw'.

What is wrong with the code?

【Comments】:

    Tags: numpy vectorization


    【Solution 1】:

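    Your function does not return anything, so assigning its result sets every value in Draws to None. You do not need to assign anything: the code inside the function already creates the Draws column: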

    import pandas as pd 
    data = {'HomeTeam':['Burnley','Crystal Palace','Everton','Hull','Man City','Middlesbrough','Southampton',
    'Arsenal','Bournemouth','Chelsea','Man United','Burnley','Leicester','Stoke'], 'AwayTeam':['Swansea','West Brom','Tottenham','Leicester','Sunderland','Stoke','Watford','Liverpool','Man United',
    'West Ham','Southampton','Liverpool','Arsenal','Man City'], 'FTR': ['A','A','D','H','H','D','D','A','A','H','H','H','D','A']} 
    
    leaguedf = pd.DataFrame(data) 
    
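    def soc_iter(TEAM,home,away,ftr):
        # Default label for rows where TEAM did not play
        leaguedf['Draws'] = 'No_Game'
        # TEAM played (home or away) and the match was a draw
        leaguedf.loc[((home == TEAM) & (ftr == 'D')) | ((away == TEAM) & (ftr == 'D')), 'Draws'] = 'Draw'
        # TEAM played (home or away) and the match was not a draw
        leaguedf.loc[((home == TEAM) & (ftr != 'D')) | ((away == TEAM) & (ftr != 'D')), 'Draws'] = 'No_Draw'
    
    # Call the function for its side effect only; it returns None, so do not assign the result
    soc_iter('Arsenal',leaguedf['HomeTeam'].values, leaguedf['AwayTeam'].values, leaguedf['FTR'].values)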
    leaguedf
    
               HomeTeam     AwayTeam FTR    Draws
    0          Burnley      Swansea   A  No_Game
    1   Crystal Palace    West Brom   A  No_Game
    2          Everton    Tottenham   D  No_Game
    3             Hull    Leicester   H  No_Game
    4         Man City   Sunderland   H  No_Game
    5    Middlesbrough        Stoke   D  No_Game
    6      Southampton      Watford   D  No_Game
    7          Arsenal    Liverpool   A  No_Draw
    8      Bournemouth   Man United   A  No_Game
    9          Chelsea     West Ham   H  No_Game
    10      Man United  Southampton   H  No_Game
    11         Burnley    Liverpool   H  No_Game
    12       Leicester      Arsenal   D     Draw
    13           Stoke     Man City   A  No_Game
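
If you would rather keep the assignment form from the question (`leaguedf['Draws'] = ...`), the function has to return the labels instead of modifying `leaguedf` in place. The following is only a sketch of one way to do that, not the article's code: the name `soc_iter_returning` is made up for this example, and `np.select` and `.to_numpy()` are swapped in here as an assumption about what fits (`.to_numpy()` stands in for `.values`).

```python
import numpy as np
import pandas as pd

data = {'HomeTeam': ['Burnley', 'Crystal Palace', 'Everton', 'Hull', 'Man City',
                     'Middlesbrough', 'Southampton', 'Arsenal', 'Bournemouth',
                     'Chelsea', 'Man United', 'Burnley', 'Leicester', 'Stoke'],
        'AwayTeam': ['Swansea', 'West Brom', 'Tottenham', 'Leicester', 'Sunderland',
                     'Stoke', 'Watford', 'Liverpool', 'Man United', 'West Ham',
                     'Southampton', 'Liverpool', 'Arsenal', 'Man City'],
        'FTR': ['A', 'A', 'D', 'H', 'H', 'D', 'D', 'A', 'A', 'H', 'H', 'H', 'D', 'A']}

leaguedf = pd.DataFrame(data)

def soc_iter_returning(team, home, away, ftr):
    """Hypothetical variant of soc_iter that returns the labels as an array."""
    played = (home == team) | (away == team)   # team took part in the match
    drew = ftr == 'D'                          # match ended in a draw
    # The first matching condition wins; rows matching neither get the default.
    return np.select([played & drew, played & ~drew],
                     ['Draw', 'No_Draw'],
                     default='No_Game')

# Because the function now returns an array, assigning the result works.
leaguedf['Draws'] = soc_iter_returning('Arsenal',
                                       leaguedf['HomeTeam'].to_numpy(),
                                       leaguedf['AwayTeam'].to_numpy(),
                                       leaguedf['FTR'].to_numpy())
print(leaguedf)
```

With the sample data this should label the same rows as the output above, and the function no longer depends on the global `leaguedf` inside its body.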
    

    【Comments】:

    • Perfect. That works. The example incorrectly referenced the 'Draws' column twice. Thanks for the fix!