Create a pivot table where my values are the count of my column

【Question title】: Create a Pivot table where my values are the count of my column
【Posted】: 2018-08-20 02:08:41
【Question description】:

I have a dataframe and want the values in a pivot table to be simply the counts of the strings that form the pivot table's columns:

A sample of my df:

trading_book    state
A               Traded Away
B               Dealer Reject
C               Dealer Reject
A               Dealer Reject
B               Dealer Reject
C               Dealer Reject
A               Dealer Reject
D               Dealer Reject
D               Dealer Reject
E               Dealer Reject

Desired result:

    Traded Away Dealer Reject   Done
Book            
A          1           2          0
B          0           2          0
C          0           2          0
D          0           2          0
E          0           1          0

When I try this with the following code:

Count_Row = df.shape[0] #gives number of row count
Count_Col = df.shape[1] #gives number of col count
df_Sample = df[['trading_book','state']].head(Count_Row-1)
display(df_Sample)

display(pd.pivot_table(
                   df_Sample, 
                   index=['trading_book'],
                   columns=['state'], 
                   values='state',
                   aggfunc='count'
              ))

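I only get the trading books displayed, with no count columns.

What do I need to do with the values and aggfunc parameters?

【Question discussion】:

Tags: python pandas dataframe pivot-table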

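【Solution 1】:

You can pass a categorical column to crosstab. By using a categorical type, you are telling pandas that a value should be treated as a possible option even if it does not appear in this particular dataset.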

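states = 'Traded Away;Dealer Reject;Done'.split(';')
# dropna=False keeps categories with no data (here 'Done');
# recent pandas versions drop them from crosstab output by default
pd.crosstab(df.trading_book, pd.Categorical(df.state, states), dropna=False)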

col_0         Traded Away  Dealer Reject  Done
trading_book                                  
A                       1              2     0
B                       0              2     0
C                       0              2     0
D                       0              2     0
E                       0              1     0
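
For readers who want to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the categorical approach. The `dropna=False` argument is an assumption on my part: the pandas documentation notes that crosstab omits categories with no data by default, so it is passed here to keep the empty Done column.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "trading_book": list("ABCABCADDE"),
    "state": ["Traded Away"] + ["Dealer Reject"] * 9,
})

# Every state that should appear as a column, in display order
states = ["Traded Away", "Dealer Reject", "Done"]

# The Categorical records 'Done' as a valid value even though no row has it;
# dropna=False keeps that empty category in the result.
result = pd.crosstab(
    df["trading_book"],
    pd.Categorical(df["state"], categories=states),
    dropna=False,
)
print(result)
```

The column order follows the order of the `states` list, because the categories are sorted by their position rather than alphabetically.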

【Discussion】:

【Solution 2】:

A corrected version of your pivot_table code:

v = df.pivot_table(
         index='trading_book', 
         columns='state', 
         aggfunc='size', 
         fill_value=0
)

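As long as you pass aggfunc='size', you do not need to specify the values parameter. Next, to get the exact output, reindex the dataframe along its columns (this uses NumPy, imported as np):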

v.reindex(columns=np.append(df.state.unique(), 'Done'), fill_value=0)

state         Traded Away  Dealer Reject  Done
trading_book                                  
A                       1              2     0
B                       0              2     0
C                       0              2     0
D                       0              2     0
E                       0              1     0
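
The reindex call above takes its column order from df.state.unique(), so the layout depends on which state happens to appear first in the data. A small sketch, assuming the full set of states is known up front (the `states` list is an illustrative assumption), pins the order explicitly. It builds the counts with groupby(...).size().unstack(...), a swapped-in technique that produces the same per-pair row counts.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "trading_book": list("ABCABCADDE"),
    "state": ["Traded Away"] + ["Dealer Reject"] * 9,
})

# Assumed full list of states, in the desired column order
states = ["Traded Away", "Dealer Reject", "Done"]

# Count rows per (trading_book, state) pair, then spread states into columns
counts = df.groupby(["trading_book", "state"]).size().unstack(fill_value=0)

# Reorder the columns and add any state that never occurred, filled with 0
v = counts.reindex(columns=states, fill_value=0)
print(v)
```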

Alternatively, list the new columns you want and add them with assign:

cols = ['Done', ...]
v.assign(**dict.fromkeys(cols, 0))

state         Dealer Reject  Traded Away  Done
trading_book                                  
A                         2            1     0
B                         2            0     0
C                         2            0     0
D                         2            0     0
E                         1            0     0
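
Note that assign overwrites a column that already exists, so passing a state that is already present would reset its counts to 0. One way to guard against that, sketched here with a hypothetical `expected` list, is to add only the states that are missing. The counts are built with crosstab purely to keep the example self-contained.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "trading_book": list("ABCABCADDE"),
    "state": ["Traded Away"] + ["Dealer Reject"] * 9,
})

v = pd.crosstab(df["trading_book"], df["state"])

# Hypothetical list of every state the report should show
expected = ["Traded Away", "Dealer Reject", "Done"]

# Only states absent from the table get a new all-zero column,
# so existing counts are never overwritten
missing = [s for s in expected if s not in v.columns]
v = v.assign(**dict.fromkeys(missing, 0))
print(v)
```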

【Discussion】:

【Solution 3】:

You can use crosstab and add the missing column with assign:

pd.crosstab(df.trading_book,df.state).assign(Done=0)
Out[266]: 
state         Dealer Reject  Traded Away  Done
trading_book                                  
A                         2            1     0
B                         2            0     0
C                         2            0     0
D                         2            0     0
E                         1            0     0

【Discussion】: