Using Pandas and transaction history to determine the number of shares

【Question title】: Using Pandas to use transaction history to determine no of shares
【Posted】: 2020-06-30 09:48:47
【Question】:

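I am trying to use my Football Index transaction history to work out how many shares of each player I hold in my portfolio.

After loading the CSV into Python, creating a pandas DataFrame and tidying the data, I have a DataFrame that looks like this: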

                   name              type  quantity
723  Alejandro Grimaldo          PURCHASE       100
303  Alejandro Grimaldo         BUY_LIMIT       101
301  Alejandro Grimaldo  BUY_LIMIT_CANCEL       101
721  Alejandro Grimaldo          PURCHASE       100
724  Alejandro Grimaldo          PURCHASE       200
285  Alejandro Grimaldo         BUY_LIMIT       100
276  Alejandro Grimaldo  BUY_LIMIT_CANCEL       100
662         Alex Telles          PURCHASE       200
711      Alexander Isak          PURCHASE       100
747     Alphonso Davies          PURCHASE       100
403            Angelino              SALE        29

I want a resulting DataFrame with a 'name' column (without the duplicates above) and a 'no of shares' column.

The number of shares is calculated as:

no of shares = (PURCHASE) * quantity + (BUY_LIMIT)* quantity - SALE * quantity - BUY_LIMIT_CANCEL * quantity
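
As a sanity check of the formula, here is a minimal plain-Python sketch for one player, Alejandro Grimaldo, using the quantities from the table above. The names `quantities` and `signs` are illustrative only and do not come from the original data.

```python
# Quantities for Alejandro Grimaldo, copied from the sample table above.
quantities = {
    "PURCHASE": [100, 100, 200],
    "BUY_LIMIT": [101, 100],
    "BUY_LIMIT_CANCEL": [101, 100],
    "SALE": [],
}

# +1 for transaction types that add shares, -1 for types that remove them.
signs = {"PURCHASE": 1, "BUY_LIMIT": 1, "SALE": -1, "BUY_LIMIT_CANCEL": -1}

# 400 + 201 - 201 - 0
no_of_shares = sum(signs[t] * sum(q) for t, q in quantities.items())
print(no_of_shares)  # 400
```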

From the table above, the output I want is:

                   name       no of shares
723  Alejandro Grimaldo                400
662         Alex Telles                200
711      Alexander Isak                100
747     Alphonso Davies                100
403            Angelino                -29
...

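How can I use pandas to create a new column that, for each unique player in 'name', totals the values in 'quantity', adding or subtracting each one depending on what is in 'type'?

I was wondering whether it would be best to create a MultiIndex to get rid of the duplicate player names. Am I on the right track?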

【Question comments】:

Tags: python pandas dataframe transactions

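【Solution 1】:

First, use DataFrame.pivot_table with aggfunc='sum' and fill_value=0 to pivot the DataFrame so that name becomes the index and each type becomes a column, then apply the formula to compute no of shares: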

df1 = df.pivot_table(index='name', columns='type',
                     values='quantity', aggfunc='sum', fill_value=0)

df1['no of shares'] = df1['PURCHASE'] + \
    df1['BUY_LIMIT'] - df1['SALE'] - df1['BUY_LIMIT_CANCEL']

df1 = df1['no of shares'].reset_index()

Result:

# print(df1)
                 name  no of shares
0  Alejandro Grimaldo           400
1         Alex Telles           200
2      Alexander Isak           100
3     Alphonso Davies           100
4            Angelino           -29
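
One caveat with the pivot approach: if a transaction type never occurs in the data (for example, a history with no SALE rows), the pivoted frame has no column for it and the formula raises a KeyError. The following is a minimal self-contained sketch, not part of the original answer, that rebuilds the sample rows from the question and uses reindex to guarantee all four columns exist. The `types` list assumes these four are the only transaction types that matter.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "name": ["Alejandro Grimaldo"] * 7
            + ["Alex Telles", "Alexander Isak", "Alphonso Davies", "Angelino"],
    "type": ["PURCHASE", "BUY_LIMIT", "BUY_LIMIT_CANCEL", "PURCHASE",
             "PURCHASE", "BUY_LIMIT", "BUY_LIMIT_CANCEL",
             "PURCHASE", "PURCHASE", "PURCHASE", "SALE"],
    "quantity": [100, 101, 101, 100, 200, 100, 100, 200, 100, 100, 29],
})

# Assumption: these four types are the only ones that affect the share count.
types = ["PURCHASE", "BUY_LIMIT", "SALE", "BUY_LIMIT_CANCEL"]

pivot = (
    df.pivot_table(index="name", columns="type", values="quantity",
                   aggfunc="sum", fill_value=0)
      # Add any type missing from the data as a column of zeros.
      .reindex(columns=types, fill_value=0)
)

pivot["no of shares"] = (pivot["PURCHASE"] + pivot["BUY_LIMIT"]
                         - pivot["SALE"] - pivot["BUY_LIMIT_CANCEL"])
result = pivot["no of shares"].reset_index()
print(result)
```

With the sample rows this gives the same totals as the result shown above. The reindex step only makes a difference when one of the four types is absent from the input.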

【Comments】:

【Solution 2】:

Another solution is to flip the sign of the quantities on SALE and BUY_LIMIT_CANCEL rows, then do a groupby and sum:

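import numpy as np

df['quantity'] = np.where(df['type'].isin(['SALE', 'BUY_LIMIT_CANCEL']), -df['quantity'], df['quantity'])

# Select quantity explicitly so the text column 'type' is not summed as well.
print(df.groupby('name')[['quantity']].sum())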

# prints:

                   quantity
name               
Alejandro Grimaldo      400
Alex Telles             200
Alexander Isak          100
Alphonso Davies         100
Angelino                -29
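
A variation on the same idea, sketched here under stated assumptions rather than taken from the original answer: map each type to +1 or -1 and multiply, which leaves the original quantity column untouched. The `signs` dictionary and the check for unknown types are illustrative additions, and the sample rows are copied from the question.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "name": ["Alejandro Grimaldo"] * 7
            + ["Alex Telles", "Alexander Isak", "Alphonso Davies", "Angelino"],
    "type": ["PURCHASE", "BUY_LIMIT", "BUY_LIMIT_CANCEL", "PURCHASE",
             "PURCHASE", "BUY_LIMIT", "BUY_LIMIT_CANCEL",
             "PURCHASE", "PURCHASE", "PURCHASE", "SALE"],
    "quantity": [100, 101, 101, 100, 200, 100, 100, 200, 100, 100, 29],
})

# Assumed sign for each transaction type, following the formula in the question.
signs = {"PURCHASE": 1, "BUY_LIMIT": 1, "SALE": -1, "BUY_LIMIT_CANCEL": -1}

# Fail loudly on a type without a sign instead of silently dropping it.
unknown = set(df["type"]) - set(signs)
if unknown:
    raise ValueError(f"no sign defined for types: {sorted(unknown)}")

# Signed quantity per row, then one total per player.
signed = df["quantity"] * df["type"].map(signs)
result = signed.groupby(df["name"]).sum().rename("no of shares").reset_index()
print(result)
```

Because the signed values live in a separate Series, the code can be rerun on the same DataFrame without negating the quantities a second time.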

【Comments】: