【Question title】: How to take the mean over a list of dataframes using pandas.Panel?
【Posted】: 2020-09-11 04:19:00
【Question description】:

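I have 200 users. For each user I compute the values of several measures (columns) for each method (rows) and save them to a dataframe. I followed a post that uses pandas.Panel to average every measure of every method over all users.

A for loop computes the measurements for each user. For example, this is the loop for two users (0 and 1):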

dfs = {}
for s in range(0, 2): # do the following for user0 and user1
    .
    # some commands for calculation of measurements
    .
    .
    .
    #end of the loop
    dfs[s] = pd.concat([ov_df, sd_df], axis=1)  # dataframe for user s
panel = pd.Panel(dfs)
*** TypeError: object() takes no parameters

How can I take the mean over all users separately for each of the 15 measures and 11 methods?

dfs
{0:              m1        s2       ...      ee         vd
RF              0.536819  0.698611  ...  57.144087 -55.781946
OL              0.480758  0.649341  ...  61.991170 -57.210469
LA              0.427991  0.599431  ...  67.091363 -57.026384
AP              0.466703  0.636397  ...  63.612812 -57.285542
AP2             0.467951  0.637557  ...  63.677943 -59.602584
MA              0.428375  0.599807  ...  67.073286 -56.977762
RC              0.536892  0.698672  ...  57.135469 -55.766803
DP              0.536819  0.698611  ...  57.144087 -55.781946
DC              0.537510  0.699195  ...  57.014234 -55.574017
KU              0.537032  0.698791  ...  57.111874 -55.745237
KE              0.493517  0.660879  ...  60.704082 -57.366922

[11 rows x 15 columns], 1:                  m1        s2       ...      ee         vd
RF              0.369103  0.539190  ...  61.541261 -48.183651
OL              0.334069  0.500827  ...  66.807720 -43.531795
LA              0.300838  0.462530  ...  70.741817 -39.702935
AP              0.322879  0.488146  ...  68.371827 -38.054113
AP2             0.322453  0.487659  ...  68.212097 -47.518693
MA              0.301198  0.462955  ...  70.716283 -39.436550
RC              0.369095  0.539181  ...  61.546610 -48.155079
DP              0.369103  0.539190  ...  61.541261 -48.183651
DC              0.369500  0.539613  ...  61.484330 -48.376968
KU              0.369116  0.539203  ...  61.539789 -48.176711
KE              0.341218  0.508818  ...  65.061794 -49.218448

【Comments】:

    Tags: python pandas dataframe pandas-groupby


    【Solution 1】:

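    I found the answer in this post; it takes a single line. Because dfs is a dict, pd.concat puts the dict keys (the users) at index level 0 and the method labels at level 1, so averaging over users means grouping on level 1:

    # mean(level=...) was deprecated in pandas 1.3 and removed in 2.0; groupby(level=...).mean() replaces it
    df = pd.concat(dfs).groupby(level=1).mean()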
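As a runnable sketch of the concat-and-average idea, the example below uses two tiny made-up frames in place of the real per-user data. The values, the two method labels, and the level names "user" and "method" are assumptions chosen for illustration only.

```python
import pandas as pd

# Hypothetical stand-ins for the per-user frames:
# rows are methods, columns are measures.
dfs = {
    0: pd.DataFrame({"m1": [0.5, 0.25], "s2": [1.0, 2.0]}, index=["RF", "OL"]),
    1: pd.DataFrame({"m1": [0.25, 0.75], "s2": [3.0, 4.0]}, index=["RF", "OL"]),
}

# Concatenating a dict yields a two-level index:
# level 0 holds the dict keys (users), level 1 holds the method labels.
stacked = pd.concat(dfs, names=["user", "method"])

# Group on the method level so each cell is averaged across users.
mean_over_users = stacked.groupby(level="method").mean()
print(mean_over_users)
```

Grouping by level name rather than by position makes it harder to average over the wrong axis by accident. Note that groupby sorts the method labels alphabetically by default.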
    

    【Comments】:

      【Solution 2】:
      #load in dataframes. Example using 2 dataframes for person 0 (df0), and person 1 (df1)
      
      #concatenate the dataframes, switch their level, and sort to make easier
      df_combined = pd.concat([df0, df1], keys=[0,1], names=['user', 'method'])
      df_combined = df_combined.swaplevel(1,0)
      print(df_combined.sort_index())
      

      Output

                        m1        s2         ee         vd
      method user                                          
      AP     0     0.466703  0.636397  63.612812 -57.285542
             1     0.322879  0.488146  68.371827 -38.054113
      AP2    0     0.467951  0.637557  63.677943 -59.602584
             1     0.322453  0.487659  68.212097 -47.518693
      DC     0     0.537510  0.699195  57.014234 -55.574017
             1     0.369500  0.539613  61.484330 -48.376968
      DP     0     0.536819  0.698611  57.144087 -55.781946
             1     0.369103  0.539190  61.541261 -48.183651
      KE     0     0.493517  0.660879  60.704082 -57.366922
             1     0.341218  0.508818  65.061794 -49.218448
      KU     0     0.537032  0.698791  57.111874 -55.745237
             1     0.369116  0.539203  61.539789 -48.176711
      LA     0     0.427991  0.599431  67.091363 -57.026384
             1     0.300838  0.462530  70.741817 -39.702935
      MA     0     0.428375  0.599807  67.073286 -56.977762
             1     0.301198  0.462955  70.716283 -39.436550
      OL     0     0.480758  0.649341  61.991170 -57.210469
             1     0.334069  0.500827  66.807720 -43.531795
      RC     0     0.536892  0.698672  57.135469 -55.766803
             1     0.369095  0.539181  61.546610 -48.155079
      RF     0     0.536819  0.698611  57.144087 -55.781946
             1     0.369103  0.539190  61.541261 -48.183651
      
      #Average based on the method
      df_combined.groupby(level=0).mean()
      

      Output

                    m1        s2         ee         vd
      method                                          
      AP      0.394791  0.562272  65.992319 -47.669827
      AP2     0.395202  0.562608  65.945020 -53.560638
      DC      0.453505  0.619404  59.249282 -51.975493
      DP      0.452961  0.618900  59.342674 -51.982799
      KE      0.417368  0.584848  62.882938 -53.292685
      KU      0.453074  0.618997  59.325831 -51.960974
      LA      0.364414  0.530981  68.916590 -48.364660
      MA      0.364787  0.531381  68.894785 -48.207156
      OL      0.407413  0.575084  64.399445 -50.371132
      RC      0.452994  0.618926  59.341040 -51.960941
      RF      0.452961  0.618900  59.342674 -51.982799
      

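      From here, if you need the averages organized by measure instead, it is straightforward to adjust accordingly (for example, by transposing the result with DataFrame.transpose).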
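One possible reading of "by measure" is sketched below with a small made-up table of per-method averages. The frame per_method and its values are hypothetical and not taken from the thread.

```python
import pandas as pd

# Hypothetical per-method averages (rows: methods, columns: measures).
per_method = pd.DataFrame(
    {"m1": [0.5, 0.25], "s2": [1.0, 3.0]},
    index=pd.Index(["AP", "DC"], name="method"),
)

# Transpose so that measures become rows and methods become columns.
per_measure = per_method.T  # same as per_method.transpose()

# Alternatively, collapse the methods entirely: one average per measure.
measure_means = per_method.mean(axis=0)

print(per_measure)
print(measure_means)
```

The transpose keeps every number and only swaps the axes, while mean(axis=0) reduces the table to a single value per measure, so which one fits depends on what the per-measure view is meant to show.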

      As for the solution given in the other post you referenced, you seem to have missed one step:

      panel = pd.Panel(dfs).mean(axis=0)
      

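      Note that pandas.Panel has been deprecated since version 0.20 (and was removed in 0.25), so this call only works on older pandas releases.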
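On pandas versions without Panel, one way to approximate the same "stack the frames, then average along the user axis" idea is a 3-D NumPy array. This is a sketch under the assumption that every user's frame has the same row and column labels; the sample frames and values are made up.

```python
import numpy as np
import pandas as pd

# Hypothetical per-user frames with identical labels.
dfs = {
    0: pd.DataFrame({"m1": [0.5, 0.25], "s2": [1.0, 2.0]}, index=["RF", "OL"]),
    1: pd.DataFrame({"m1": [0.25, 0.75], "s2": [3.0, 4.0]}, index=["RF", "OL"]),
}

frames = list(dfs.values())
first = frames[0]

# Put every frame in the first frame's row/column order before stacking.
# (Labels missing from a frame would become NaN here.)
aligned = [f.reindex(index=first.index, columns=first.columns) for f in frames]

# Shape: (users, methods, measures), the role Panel used to play.
cube = np.stack([f.to_numpy() for f in aligned])

# Average along the user axis and restore the labels.
mean_df = pd.DataFrame(cube.mean(axis=0), index=first.index, columns=first.columns)
print(mean_df)
```

Unlike the groupby approach, this keeps the original row order of the first frame, at the cost of handling the labels by hand.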

      【Comments】:

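      • This raises the same error: panel=pd.Panel(dfs).mean(axis=0) *** TypeError: object() takes no parameters
      • As I said, Panel has been deprecated, which is probably the cause of the error. That part of my answer only addressed the other post you referenced. I have updated the answer to provide a solution.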