【Posted】: 2020-01-24 16:25:13
【Question】:
I have a dataset with 5 columns (please excuse the formatting):
id         Price  Service  Rater Name  Cleanliness
401013357  5      3        A           1
401014972  2      1        A           5
401022510  3      4        B           2
401022510  5      1        C           9
401022510  3      1        D           4
401022510  2      2        E           2
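I want only one row per ID. So I need to create a column for each combination of rater name and rating category (e.g. [rater name] Price, [rater name] Service, [rater name] Cleanliness), with each combination in its own column.
I've explored groupby, but I can't figure out how to move the values into new columns. Thanks!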
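A minimal sketch of one way to reshape the toy table above, assuming each (id, rater) pair occurs at most once. It uses `DataFrame.pivot` rather than groupby; the `A_Price`-style column names are an illustrative choice, not something the post specifies.

```python
import pandas as pd

# Toy data copied from the table in the question.
df = pd.DataFrame({
    "id": [401013357, 401014972, 401022510, 401022510, 401022510, 401022510],
    "Price": [5, 2, 3, 5, 3, 2],
    "Service": [3, 1, 4, 1, 1, 2],
    "Rater Name": ["A", "A", "B", "C", "D", "E"],
    "Cleanliness": [1, 5, 2, 9, 4, 2],
})

# One row per id; the columns become a (category, rater) MultiIndex.
wide = df.pivot(index="id", columns="Rater Name")

# Flatten the MultiIndex into single names such as "A_Price".
wide.columns = [f"{rater}_{category}" for category, rater in wide.columns]
wide = wide.sort_index(axis=1)

print(wide)
```

Raters who did not score a given id show up as NaN in that row. If the same rater could score an id more than once, `pivot` would raise an error and an aggregating call such as `pivot_table` would be needed instead.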
Here's the code and data I'm actually using:
import requests
from pandas import DataFrame
import pandas as pd
linesinfo_url = 'https://api.collegefootballdata.com/lines?year=2018&seasonType=regular'
linesresp = requests.get(linesinfo_url)
dflines = DataFrame(linesresp.json())
# the lines data is nested inside each game record
# setting game ID as index
dflines.set_index('id', inplace=True)
# defining a as the response to our GET request for this data, in JSON format
a = linesresp.json()
# I believe this creates a receptacle for the nested data I'm extracting from the JSON
buf = []
for game in a:
    for line in game['lines']:
        # one flat record per (game, betting line)
        game_dict = dict(id=game['id'])
        for cat in ('provider', 'spread', 'formattedSpread', 'overUnder'):
            game_dict[cat] = line[cat]
        buf.append(game_dict)
dflinestable = pd.DataFrame(buf)
dflinestable.set_index(['id', 'provider'])
From here, I get:
                        formattedSpread       overUnder  spread
id        provider
401013357 consensus     UMass -21             68.0       -21.0
401014972 consensus     Rice -22.5            58.5       -22.5
401022510 Caesars       Colorado State -17.5  57.5       -17.5
          consensus     Colorado State -17    57.5       -17.0
          numberfire    Colorado State -17    58.5       -17.0
          teamrankings  Colorado State -17    58.0       -17.0
401013437 numberfire    Wyoming -5            47.0       5.0
          teamrankings  Wyoming -5            47.0       5.0
401020671 consensus     Ball State -19.5      61.5       -19.5
401019470 Caesars       UCF -22.5             NaN        22.5
          consensus     UCF -22.5             NaN        22.5
          numberfire    UCF -24               70.0       24.0
          teamrankings  UCF -24               70.0       24.0
401013328 numberfire    Minnesota -21.5       47.0       -21.5
          teamrankings  Minnesota -21.5       49.0       -21.5
The result I'm looking for is three columns for each of the 4 different providers, i.e. caesars_formattedSpread, caesars_overUnder, caesars_spread, numberfire_formattedSpread, numberfire_overUnder, numberfire_spread, and so on.
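A hedged sketch of how that layout could be produced, using a few rows copied from the output above rather than the live API. Lowercasing the provider names and joining with an underscore are assumptions made here to match the column names listed in the question.

```python
import pandas as pd

# A few rows copied from the output above (long format: one row per id/provider).
rows = [
    (401013357, "consensus", "UMass -21", 68.0, -21.0),
    (401022510, "Caesars", "Colorado State -17.5", 57.5, -17.5),
    (401022510, "consensus", "Colorado State -17", 57.5, -17.0),
    (401022510, "numberfire", "Colorado State -17", 58.5, -17.0),
    (401022510, "teamrankings", "Colorado State -17", 58.0, -17.0),
]
long_df = pd.DataFrame(
    rows, columns=["id", "provider", "formattedSpread", "overUnder", "spread"]
)

# Move the provider level from the row index into the columns.
wide = long_df.set_index(["id", "provider"]).unstack("provider")

# Columns are now (field, provider) tuples; join them as provider_field.
wide.columns = [f"{provider.lower()}_{field}" for field, provider in wide.columns]

print(wide.columns.tolist())
```

Games that a provider did not cover get NaN in that provider's three columns.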
When I run unstack as suggested, I don't get the result I expect. Instead, I get:
formattedSpread  0   UMass -21
                 1   Rice -22.5
                 2   Colorado State -17.5
                 3   Colorado State -17
                 4   Colorado State -17
                 5   Colorado State -17
                 6   Wyoming -5
                 7   Wyoming -5
                 8   Ball State -19.5
                 9   UCF -22.5
                 10  UCF -22.5
                 11  UCF -24
                 12  UCF -24
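One possible explanation, which can't be confirmed from the post alone: `set_index` returns a new frame, so if its result was never assigned, the table still had its default 0, 1, 2, ... index when `unstack` was called. On a frame without a MultiIndex, `unstack` returns a Series keyed by (column name, row label), which resembles the output above. The small frame below is made-up sample data in the same shape.

```python
import pandas as pd

# Small stand-in for the table in the question (values copied from the post).
df = pd.DataFrame({
    "id": [401013357, 401022510, 401022510],
    "provider": ["consensus", "Caesars", "consensus"],
    "spread": [-21.0, -17.5, -17.0],
})

# set_index returns a new frame; without assignment, df keeps its 0..n-1 index.
df.set_index(["id", "provider"])
flat = df.unstack()  # Series keyed by (column name, row number)

# Assigning the result keeps the MultiIndex, so unstack can pivot on provider.
indexed = df.set_index(["id", "provider"])
wide = indexed.unstack("provider")

print(type(flat).__name__)
print(wide)
```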
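【Comments】:
-
What is your expected output?
-
What have you tried, and what result did you expect? Could you please provide more information?
-
@WeNYoBen - see the edit.
-
@SimonFink As noted above, I've made a major edit. I was probably trying to oversimplify.
-
Late to the party, but what is the expected output? I only see the "incorrect" output.
Tags: python pandas group-by pandas-groupby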