Answers: finding the greatest differential in pandas DataFrame columns

[Question title]: Finding greatest differential in pandas dataframe columns
[Posted]: 2021-09-13 02:48:24
[Question description]:

I have the pandas DataFrame below:

df 

    Ticker       Price     Volume        Price2
0        A  147.779999    51918.0  147.779999
1      AAL   21.209999   229944.0   44.523753
2      AAP  205.139999    32928.0   61.324705
3     AAPL  136.919998  1175723.0  120.954594
4     ABBV  112.599998   135235.0  120.259632
...

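I want to go through df and find the Ticker with the greatest differential between Price2 and Price (Price2 minus Price). Whichever Ticker is selected, I want to store that row's values in a variable so that I can access specific columns.

Is this possible? Any help would be greatly appreciated!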

[Question comments]:

Tags: python python-3.x pandas dataframe

[Solution 1]:

You can do this with idxmax, which returns the index label of the largest value:

idx = (df['Price2'] - df['Price']).idxmax()
df.loc[idx]
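
As a minimal sketch of the above, using only the five rows visible in the question (the full DataFrame is elided there, so the result is illustrative), the selected row can be stored in a variable and its columns read by name. The names diff and row are chosen here for illustration.

```python
import pandas as pd

# Only the five rows shown in the question; the rest of the data is not available.
df = pd.DataFrame({
    "Ticker": ["A", "AAL", "AAP", "AAPL", "ABBV"],
    "Price": [147.779999, 21.209999, 205.139999, 136.919998, 112.599998],
    "Volume": [51918.0, 229944.0, 32928.0, 1175723.0, 135235.0],
    "Price2": [147.779999, 44.523753, 61.324705, 120.954594, 120.259632],
})

# Signed difference, Price2 minus Price, as asked in the question.
diff = df["Price2"] - df["Price"]

# idxmax returns the index label of the largest value.
idx = diff.idxmax()

# row is a Series holding every column of the selected row.
row = df.loc[idx]

print(row["Ticker"], row["Volume"], diff.loc[idx])
```

On these five rows the selected Ticker is AAL, since it is the only one whose Price2 exceeds Price by a wide margin.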

[Discussion]:

[Solution 2]:

Yes, you can:

differential = df['Price2'] - df['Price']
ticker = df.loc[differential.idxmax(), 'Ticker']

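But since you are working with stock prices, the absolute price difference is not very meaningful. A $10 difference means a lot more for a $136 stock (like Apple) than for a $3,400 stock (like Amazon), and it is a rounding error for a $418,000 stock (Berkshire Hathaway). A better measure is the percentage difference: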

differential = df['Price2'] / df['Price'] - 1
ticker = df.loc[differential.idxmax(), 'Ticker']
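
To see how the two measures can disagree, here is a small sketch with made-up data. The tickers CHEAP and PRICEY and their prices are hypothetical and do not come from the question.

```python
import pandas as pd

# Hypothetical data, invented for illustration only.
df = pd.DataFrame({
    "Ticker": ["CHEAP", "PRICEY"],
    "Price": [10.0, 1000.0],
    "Price2": [15.0, 1010.0],
})

# Absolute differential: 5.0 for CHEAP, 10.0 for PRICEY.
abs_diff = df["Price2"] - df["Price"]

# Percentage differential: 0.5 for CHEAP, about 0.01 for PRICEY.
pct_diff = df["Price2"] / df["Price"] - 1

ticker_abs = df.loc[abs_diff.idxmax(), "Ticker"]
ticker_pct = df.loc[pct_diff.idxmax(), "Ticker"]

print(ticker_abs, ticker_pct)
```

The absolute measure picks PRICEY and the percentage measure picks CHEAP, which is the distinction this answer is drawing.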

[Discussion]:

[Solution 3]:

s = df.loc[abs(df['Price2'].sub(df['Price'])).idxmax(), :]  # to access all columns (idxmax returns a label, so use .loc)

or

s=df.loc[abs(df['Price2'].sub(df['Price'])).idxmax(),['Price2','Price']] #to access Price2 and Price
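
One point worth checking with this approach: idxmax returns an index label, not an integer position, so .loc is the matching accessor. .iloc selects the same row only when the index is the default 0..n-1 range. Also, taking abs finds the largest gap in either direction, which is not the same as the signed Price2 minus Price from the question. A rough sketch, assuming Ticker has been made the index (an assumption, not something the question does):

```python
import pandas as pd

# First three rows from the question, with Ticker used as the index (assumption).
df = pd.DataFrame(
    {
        "Price": [147.779999, 21.209999, 205.139999],
        "Price2": [147.779999, 44.523753, 61.324705],
    },
    index=["A", "AAL", "AAP"],
)

# Magnitude of the gap, regardless of sign.
abs_diff = (df["Price2"] - df["Price"]).abs()

label = abs_diff.idxmax()            # index label, here a ticker string
pos = abs_diff.to_numpy().argmax()   # integer position of the same row

row_by_label = df.loc[label]   # label-based lookup
row_by_pos = df.iloc[pos]      # position-based lookup

print(label, pos)
```

Here df.iloc[label] would fail because the label is a string, while df.loc[label] and df.iloc[pos] return the same row.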

[Discussion]:

[Solution 4]:

import pandas as pd
import numpy as np

# sample df
df = pd.DataFrame(np.random.default_rng().integers(0, 100, size=(100, 2)), columns=['one', 'two'])

# create column with the absolute difference |two - one|
df['dif'] = abs(df['two'] - df['one'])

# get max difference index
print(df['dif'].idxmax()) # can store in a variable as well
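
As a possible follow-up, the stored index can be used to pull the whole row, and nlargest gives the top few rows by the same column. This sketch uses a small hand-made frame with hypothetical values so the result is reproducible, unlike the random sample above.

```python
import pandas as pd

# Hand-made values (hypothetical), so the output does not depend on a random draw.
df = pd.DataFrame({"one": [5, 40, 90, 12], "two": [7, 10, 95, 80]})

# Absolute difference per row: 2, 30, 5, 68.
df["dif"] = (df["two"] - df["one"]).abs()

# Store the index label of the largest difference, then fetch the full row.
idx = df["dif"].idxmax()
row = df.loc[idx]

# The two rows with the largest differences, largest first.
top2 = df.nlargest(2, "dif")

print(idx, row["one"], row["two"], row["dif"])
print(top2)
```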

[Discussion]: