[Posted]: 2018-12-29 22:07:40
[Question]:
I have a dataset like this, where each row represents one team in a particular game identified by gameID.
gameID      Won/Lost  Home  Away  metric2  metric3  metric4  team1  team2  team3  team4
2017020001  1         1     0     10       10       10       1      0      0      0
2017020001  0         0     1     10       10       10       0      1      0      0
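What I want to do is write a function that takes the rows sharing the same gameID and joins them. As you can see in the data sample above, two rows represent one game, split into the home team (row_1) and the away team (row_2). I want these two rows combined into a single row: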
Won/Lost  h_metric2  h_metric3  h_metric4  a_metric2  a_metric3  a_metric4  h_team1  h_team2  h_team3  h_team4  a_team1  a_team2  a_team3  a_team4
1         10         10         10         10         10         10         1        0        0        0        0        1        0        0
How can I get this result?
Edit: I caused too much confusion, so I am posting my code to give a better idea of the problem I am trying to solve.
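For reference, here is a minimal sketch of one way the layout above might be produced. It assumes the two sample rows are loaded into a pandas DataFrame named `df` and that each gameID has exactly one Home row and one Away row. The helper name `join_home_away` is hypothetical, not something from a library.

```python
import pandas as pd

# Sample rows copied from the question: one row per team per game.
df = pd.DataFrame(
    [
        [2017020001, 1, 1, 0, 10, 10, 10, 1, 0, 0, 0],
        [2017020001, 0, 0, 1, 10, 10, 10, 0, 1, 0, 0],
    ],
    columns=["gameID", "Won/Lost", "Home", "Away", "metric2", "metric3",
             "metric4", "team1", "team2", "team3", "team4"],
)


def join_home_away(frame):
    """Hypothetical helper: combine each game's home and away rows into one row."""
    metric_cols = [c for c in frame.columns if c.startswith("metric")]
    team_cols = [c for c in frame.columns if c.startswith("team")]

    # Split by side and index by gameID so the pieces align game by game.
    home = frame[frame["Home"] == 1].set_index("gameID")
    away = frame[frame["Away"] == 1].set_index("gameID")

    parts = [
        home[["Won/Lost"]],                  # result from the home team's side
        home[metric_cols].add_prefix("h_"),
        away[metric_cols].add_prefix("a_"),
        home[team_cols].add_prefix("h_"),
        away[team_cols].add_prefix("a_"),
    ]
    # Column-wise concat aligns on the shared gameID index.
    return pd.concat(parts, axis=1).reset_index()


wide = join_home_away(df)
print(wide)
```

The sketch keeps gameID as the first column. If it is not wanted, `wide.drop(columns="gameID")` would remove it.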
import numpy as np
import pandas as pd
import requests
import json
from sklearn import preprocessing
from sklearn.preprocessing import OneHotEncoder

results = []
# Fetch the box score of each game and keep one record per team (home and away).
for game_id in range(2017020001, 2017020010, 1):
    url = 'https://statsapi.web.nhl.com/api/v1/game/{}/boxscore'.format(game_id)
    r = requests.get(url)
    game_data = r.json()
    for homeaway in ['home', 'away']:
        game_dict = game_data.get('teams').get(homeaway).get('teamStats').get('teamSkaterStats')
        game_dict['team'] = game_data.get('teams').get(homeaway).get('team').get('name')
        game_dict['homeaway'] = homeaway
        game_dict['game_id'] = game_id
        results.append(game_dict)

df = pd.DataFrame(results)
# Flag the team with the most goals in each game as 1 and the other as 0.
# transform (rather than apply) keeps the original row index, so the result aligns with df.
df['Won/Lost'] = df.groupby('game_id')['goals'].transform(lambda g: (g == g.max()).map({True: 1, False: 0}))
# The percentage columns arrive as strings; convert them to floats.
df["faceOffWinPercentage"] = df["faceOffWinPercentage"].astype('float')
df["powerPlayPercentage"] = df["powerPlayPercentage"].astype('float')
df["team"] = df["team"].astype('category')
# One-hot encode the home/away flag and the team name.
df = pd.get_dummies(df, columns=['homeaway'])
df = pd.get_dummies(df, columns=['team'])
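As a possible alternative for this pipeline, the frame could be reshaped with `DataFrame.pivot` while the `homeaway` column is still a plain string, that is, before the two `get_dummies` calls. The sketch below uses a small stand-in frame with the same column layout. The numbers and team names in it are made up for illustration and are not real box scores, and the names `long_df`, `wide_df` and `home_won` are assumptions.

```python
import pandas as pd

# Stand-in for df as it looks before the two get_dummies calls.
# All values here are invented for illustration only.
long_df = pd.DataFrame({
    "game_id": [1, 1, 2, 2],
    "homeaway": ["home", "away", "home", "away"],
    "team": ["Team A", "Team B", "Team C", "Team A"],
    "goals": [3, 2, 1, 4],
    "shots": [30, 25, 28, 35],
})

# One row per game; every remaining column is split into a home and an away version.
wide_df = long_df.pivot(index="game_id", columns="homeaway")

# The columns are now a MultiIndex such as ("goals", "home"); flatten to "home_goals".
wide_df.columns = [f"{side}_{stat}" for stat, side in wide_df.columns]
wide_df = wide_df.reset_index()

# Label each game from the home team's point of view.
wide_df["home_won"] = (wide_df["home_goals"] > wide_df["away_goals"]).astype(int)
print(wide_df)
```

With this shape, the team names could be one-hot encoded afterwards with `pd.get_dummies(wide_df, columns=["home_team", "away_team"])`, which would give separate home and away indicator columns.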
[Comments]:
-
What does the Won/Lost column mean in the desired output?
-
Sorry for being unclear; the Won/Lost column refers to the home team.
Tags: python python-3.x pandas dataframe data-structures