【Posted】: 2022-01-22 08:48:32
【Question】:
Given this df:
dim_date_id closing_type r_d variable value rolling cusum_sample sample_type
1330 1995-10-27 low 1 low 9.699377 0.039688 1 [sh_dummy_0.5, sh_dummy_1]
1331 1995-10-27 low 1 close 10.340971 0.044784 1 [sh_dummy_0.5, sh_dummy_1]
1330 1995-10-27 high 1 high 10.529675 0.062868 1 [sh_dummy_0.5, sh_dummy_1, sh_dummy_2]
1331 1995-10-27 high 1 close 10.340971 0.044784 1 [sh_dummy_0.5, sh_dummy_1, sh_dummy_2]
1330 1995-10-27 low 5 low 9.699377 0.132976 1 [sh_dummy_0.5, sh_dummy_1, sh_dummy_2]
1331 1995-10-27 low 5 close 10.340971 0.188179 1 [sh_dummy_0.5, sh_dummy_1, sh_dummy_2]
1330 1995-10-27 high 5 high 10.529675 0.184475 1 [sh_dummy_0.5, sh_dummy_1, sh_dummy_2]
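I want to group it by variable and build a nested dictionary in the sample_type column (or some other dictionary-like structure; I don't mind which). As output, I would like a df that looks like this: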
dim_date_id variable value sample_type
1330 1995-10-27 low 9.699377 {'r_d':1,'closing_type':'low','rolling':0.039688,'sample':[sh_dummy_0.5, sh_dummy_1]},
{'r_d':5,'closing_type':'low','rolling':0.132976,'sample':[sh_dummy_0.5, sh_dummy_1, sh_dummy_2]}
1331 1995-10-27 close 10.340971 {'r_d':1,'closing_type':'low','rolling':0.044784,'sample':[sh_dummy_0.5, sh_dummy_1]},
{'r_d':1,'closing_type':'high','rolling':0.062868,'sample':[sh_dummy_0.5, sh_dummy_1, sh_dummy_2]},
{'r_d':5,'closing_type':'low','rolling':0.188179,'sample':[sh_dummy_0.5, sh_dummy_1, sh_dummy_2]}
1330 1995-10-27 high 10.529675 {'r_d':1,'closing_type':'high','rolling':0.062868,'sample':[sh_dummy_0.5, sh_dummy_1, sh_dummy_2]},
{'r_d':5,'closing_type':'high','rolling':0.184475,'sample':[sh_dummy_0.5, sh_dummy_1, sh_dummy_2]}
It has to be as flexible as possible, because the sample_type column can sometimes hold "n" different variables.
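For reference, here is a minimal sketch of one way the grouping described above might be done. It assumes the rows should be grouped on dim_date_id, variable and value, and that each group's remaining columns of interest are collected into a list of dicts. The names key_cols and detail_cols are hypothetical, the original index (1330, 1331) is not reproduced, and the sample data is retyped from the table above.

```python
import pandas as pd

# Sample rows retyped from the question (original index omitted).
short = ["sh_dummy_0.5", "sh_dummy_1"]
full = ["sh_dummy_0.5", "sh_dummy_1", "sh_dummy_2"]
df = pd.DataFrame(
    {
        "dim_date_id": ["1995-10-27"] * 7,
        "closing_type": ["low", "low", "high", "high", "low", "low", "high"],
        "r_d": [1, 1, 1, 1, 5, 5, 5],
        "variable": ["low", "close", "high", "close", "low", "close", "high"],
        "value": [9.699377, 10.340971, 10.529675, 10.340971,
                  9.699377, 10.340971, 10.529675],
        "rolling": [0.039688, 0.044784, 0.062868, 0.044784,
                    0.132976, 0.188179, 0.184475],
        "cusum_sample": [1] * 7,
        "sample_type": [short, short, full, full, full, full, full],
    }
)

# Assumption: these columns identify one output row.
key_cols = ["dim_date_id", "variable", "value"]
# Assumption: these columns go into the nested dicts; extend the list
# (or derive it from df.columns) if more fields are needed.
detail_cols = ["r_d", "closing_type", "rolling", "sample_type"]

rows = []
# sort=False keeps groups in order of first appearance.
for keys, group in df.groupby(key_cols, sort=False):
    nested = (
        group[detail_cols]
        .rename(columns={"sample_type": "sample"})
        .to_dict("records")  # one dict per original row in the group
    )
    rows.append({**dict(zip(key_cols, keys)), "sample_type": nested})

result = pd.DataFrame(rows)
print(result.to_string())
```

Each cell of result["sample_type"] then holds a list with one dict per original row, so a group with "n" rows simply yields a list of length "n".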
【Comments】:
Tags: python pandas dataframe nested pandas-groupby