【Question title】: How to compare two values of a column if the values in another column in the same rows match
【Posted】: 2019-11-22 13:37:06
【Question】:

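I have a DataFrame, and I want to see the win percentage of the teams that killed more wards (the wardskilled column), where win is 0 for a loss and 1 for a win.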

  matchid   team    win     wardskilled
0   10        1     0.0        8.0
1   10        2     1.0       10.0
2   11        1     0.0        8.0
3   11        2     1.0        8.0
4   12        1     0.0        2.0
5   12        2     1.0        5.0
6   13        1     0.0        5.0
7   13        2     1.0        5.0
8   14        1     0.0        1.0
9   14        2     1.0        1.0
10  15        1     1.0        3.0
11  15        2     0.0        1.0
..  ..        ..     ..         ..
..  ..        ..     ..         ..
..  ..        ..     ..         ..

Since I'm new to Python, I have no idea how to start.

I would like to create something like this:

       Teams with more wardskilled       Teams with less wardskilled

win              %                                   %

lose             %                                   %

I would appreciate any kind of help.

【Question comments】:

Tags: python pandas jupyter-notebook data-analysis

【Solution 1】:

Another approach is to compare each team's wardskilled with the mean of the two teams in the same match:

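import numpy as np

# Mean wardskilled per match, broadcast back onto every row of that match
means = df.groupby('matchid').wardskilled.transform('mean')
# Sign of the difference: -1 = fewer than the match mean, 0 = tied, 1 = more
df['more_skilled'] = np.sign(df.wardskilled.sub(means))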

(df.groupby('win')
   .more_skilled
   .value_counts(normalize=True)
   .unstack('more_skilled', fill_value=0)
)

Output:

more_skilled  -1.0   0.0   1.0
win                           
0.0            0.5   0.5   0.0
1.0            0.0   0.5   0.5
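
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the DataFrame holds only the twelve rows visible in the question (the elided `..` rows are not reproduced), so the proportions apply to that sample only.

```python
import numpy as np
import pandas as pd

# Assumption: only the twelve rows shown in the question.
df = pd.DataFrame({
    "matchid": [10, 10, 11, 11, 12, 12, 13, 13, 14, 14, 15, 15],
    "team": [1, 2] * 6,
    "win": [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0],
    "wardskilled": [8.0, 10.0, 8.0, 8.0, 2.0, 5.0, 5.0, 5.0, 1.0, 1.0, 3.0, 1.0],
})

# Per-match mean, aligned row by row with the original frame.
means = df.groupby("matchid")["wardskilled"].transform("mean")

# -1.0 = fewer wards killed than the match mean, 0.0 = tied, 1.0 = more.
df["more_skilled"] = np.sign(df["wardskilled"] - means)

# Within each win value, the share of rows falling in each more_skilled class.
table = (
    df.groupby("win")["more_skilled"]
      .value_counts(normalize=True)
      .unstack("more_skilled", fill_value=0)
)
print(table)
```

Note that each row of the table sums to 1: it answers "among winners (or losers), what share had more, fewer, or equal wards killed", not "among teams with more wards killed, what share won".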

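【Comments】:

What do the more_skilled values mean? -1 = fewer, 0 = tie, 1 = more?
Yes, 1 means the difference df.wardskilled.sub(means) is positive, so that team killed more wards than its opponent.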

【Solution 2】:

`rank`

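If every 'matchid' has exactly 2 teams, you can use rank within each match to determine whether a team has more, fewer, or tied 'wardskilled'. Group by that label and compute the mean of win, which is the win rate.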

s = df.groupby('matchid').wardskilled.rank().map({1: 'Less', 1.5: 'Tied', 2: 'More'})
df.groupby(s).win.mean()

#wardskilled
#More    1.0
#Less    0.0
#Tied    0.5
#Name: win, dtype: float64
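
To make the mapping less mysterious, the sketch below prints the intermediate ranks for three of the matches from the question (10, 11 and 15). It also checks the two-teams-per-match precondition up front; that guard is my addition, not part of the answer above. With the default method='average', two tied values each receive rank 1.5, which is why 1.5 maps to 'Tied'.

```python
import pandas as pd

# Three matches taken from the question: a clear difference, a tie,
# and a match where team 1 is the one with more wards killed.
df = pd.DataFrame({
    "matchid": [10, 10, 11, 11, 15, 15],
    "team": [1, 2, 1, 2, 1, 2],
    "win": [0.0, 1.0, 0.0, 1.0, 1.0, 0.0],
    "wardskilled": [8.0, 10.0, 8.0, 8.0, 3.0, 1.0],
})

# The rank-to-label mapping is only valid with exactly two rows per match.
if not df.groupby("matchid").size().eq(2).all():
    raise ValueError("expected exactly two teams per matchid")

# Rank within each match: 1.0 = lower value, 2.0 = higher, 1.5 = both tied.
ranks = df.groupby("matchid")["wardskilled"].rank()
labels = ranks.map({1.0: "Less", 1.5: "Tied", 2.0: "More"})

print(ranks.tolist())
print(labels.tolist())
```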

Having both a win column and a loss column is redundant, since one is just 1 minus the other, but if you must:

res = df.groupby(s).win.mean().to_frame('win_per')
res['loss_per'] = 1-res['win_per']

#             win_per  loss_per
#wardskilled                   
#More             1.0       0.0
#Less             0.0       1.0
#Tied             0.5       0.5
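
If you want the exact layout sketched in the question (win/lose as rows, more/less as columns, values in percent), one possible variation is pd.crosstab with normalize='columns'. This swaps in a different function from the groupby/mean above, and it again assumes only the twelve rows visible in the question. Ties are kept as a third column rather than silently dropped.

```python
import pandas as pd

# Assumption: only the twelve rows shown in the question.
df = pd.DataFrame({
    "matchid": [10, 10, 11, 11, 12, 12, 13, 13, 14, 14, 15, 15],
    "team": [1, 2] * 6,
    "win": [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0],
    "wardskilled": [8.0, 10.0, 8.0, 8.0, 2.0, 5.0, 5.0, 5.0, 1.0, 1.0, 3.0, 1.0],
})

# Label each team relative to its opponent (valid for two teams per match).
s = (
    df.groupby("matchid")["wardskilled"]
      .rank()
      .map({1.0: "Less", 1.5: "Tied", 2.0: "More"})
)
outcome = df["win"].map({1.0: "win", 0.0: "lose"})

# normalize='columns' makes every column sum to 1; scale to percent.
table = pd.crosstab(outcome, s, normalize="columns").mul(100)

# Put rows and columns in the order used in the question's sketch.
table = table.reindex(index=["win", "lose"], columns=["More", "Less", "Tied"])
print(table)
```

Each column here sums to 100, so the 'More' column reads directly as the win and loss percentages of the teams that killed more wards.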

【Comments】:

Thanks for your answer. You're right, the second column is redundant.