Create several new dataframes or dictionaries from one dataframe (answers)

【Question title】: Create several new dataframes or dictionaries from one dataframe
【Posted】: 2019-02-16 22:36:04
【Question】:

I have a dataframe like this:

evt    pcle    bin_0    bin_1    bin_2    ...    bin_49
 1      pi      1        0         0               0 
 1      pi      1        0         0               0 
 1      k       0        0         0               1 
 1      pi      0        0         1               0 
 2      pi      0        0         1               0 
 2      k       0        1         0               0 
 3      J       0        1         0               0 
 3      pi      0        0         0               1 
 3      pi      1        0         0               0 
 3      k       0        1         0               0 
 ...
 5000   J       0        0         1               0 
 5000   pi      0        1         0               0 
 5000   k       0        0         0               1

From this information I would like to create several other dataframes df_{evt} (or would dictionaries be better?):

df_1 : 
pcle    cant    bin_0    bin_1    bin_2   ...    bin_49        
 pi      3        2        0        1              0
  k      1        0        0        0              1

df_2 : 
pcle    cant    bin_0    bin_1    bin_2   ...    bin_49        
 pi      1        0        0        1              0
  k      1        0        1        0              0

There will be 5000 dataframes in total (one per evt), where in each dataframe:

* the column "cant" holds the number of occurrences of each "pcle" in that particular "evt".

* bin_0 ... bin_49 hold the sum of the values for that particular "pcle" in
 that particular "evt".

What is the best way to achieve this?

【Comments】:

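Take a look at the dataframe 'groupby'. What you want is to group on "evt" and then run whatever checks or computations you need on each resulting dataframe: split-apply-combine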
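As a rough illustration of the split step this comment describes, the sketch below groups a small made-up frame by "evt" and loops over the pieces. The sample values and the name `sizes` are assumptions for the example, not part of the original question.

```python
import pandas as pd

# Small made-up sample in the same shape as the question's data
df = pd.DataFrame({
    "evt":   [1, 1, 1, 2, 2],
    "pcle":  ["pi", "pi", "k", "pi", "k"],
    "bin_0": [1, 1, 0, 0, 0],
    "bin_1": [0, 0, 0, 0, 1],
})

# Split: groupby yields one (key, sub-frame) pair per distinct evt
sizes = {}
for evt, part in df.groupby("evt"):
    # Apply: any per-event computation goes here; this one just counts rows
    sizes[evt] = len(part)

# Combine: the per-event results end up in a single dict
print(sizes)
```

Each `part` is an ordinary DataFrame holding only the rows of one event, so any per-event aggregation can be run on it.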

Tags: python pandas dataframe

【Solution 1】:

Here is one possible solution:

import pandas as pd

columns = ["evt", "pcle", "bin_0", "bin_1", "bin_2", "bin_3"]
data = [[1, "pi", 1, 0, 0, 0],
        [1, "pi", 0, 0, 0, 0],
        [1, "k", 0, 0, 0, 1],
        [1, "pi", 0, 0, 1, 0],
        [2, "pi", 0, 0, 1, 0],
        [2, "k", 0, 1, 0, 0],
        [3, "J", 0, 1, 0, 0],
        [3, "pi", 0, 0, 0, 1],
        [3, "pi", 1, 0, 0, 0],
        [3, "k", 0, 1, 0, 0]]

df = pd.DataFrame(data=data, columns=columns)

# group the rows by event and particle type
grouped = df.groupby(["evt", "pcle"])

# sum the bin_X columns within each (evt, pcle) group
df_t = grouped.sum()

# move pcle from the index back to a regular column; evt stays as the index
df_t.reset_index(level=["pcle"], inplace=True)

# count occurrences of each pcle per evt (same group order as df_t)
df_t["cant"] = grouped.size().values

# select one evt with .loc; the list form always returns a DataFrame,
# even when the evt contains a single pcle
df_t.loc[[1]]

If you want to turn it into a dictionary, you can run:

d = {i:j.reset_index(drop=True) for i, j in df_t.groupby(df_t.index)}
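For comparison, here is a minimal sketch that builds the per-event dictionary in one helper and places "cant" right after "pcle", as in the layout the question asks for. The helper name `per_event_tables`, the variable `tables`, and the sample values are assumptions made for this illustration; it is one possible variant, not the answer's own code.

```python
import pandas as pd

def per_event_tables(df):
    """Return {evt: frame} where each frame has columns pcle, cant, bin_*.

    Hypothetical helper for illustration only.
    """
    grouped = df.groupby(["evt", "pcle"], sort=False)
    summary = grouped.sum()                    # sums of the bin_X columns
    summary.insert(0, "cant", grouped.size())  # row count, aligned on the index
    return {
        evt: part.droplevel("evt").reset_index()
        for evt, part in summary.groupby(level="evt", sort=False)
    }

# Small made-up sample in the same shape as the question's data
df = pd.DataFrame({
    "evt":   [1, 1, 1, 2, 2],
    "pcle":  ["pi", "pi", "k", "pi", "k"],
    "bin_0": [1, 1, 0, 0, 0],
    "bin_1": [0, 0, 0, 0, 1],
})

tables = per_event_tables(df)
print(tables[2])
```

A dictionary keyed by "evt" avoids creating thousands of separately named variables such as df_1, df_2, and so on; each table is retrieved with `tables[evt]`.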

【Comments】: