【Question title】: Aggregate the given data frame based on specific conditions in pandas
【Posted】: 2021-06-01 22:57:59
【Question】:

I have a data frame df as shown below:

ID    Number_of_Cars    Age_in_days   Total_amount   Total_N     Type
1     2                 100           10000          100         A
2     5                 10            1000           2           B
3     1                 1000          1000           200         B
4     1                 20            0              0           C
5     3                 1000          100000         20          A
6     6                 100           10000          20          C
7     4                 200           10000          200         A

From the above df, I want to prepare df1 as shown below.

df1:

ID    Avg_Monthly_Amount      Avg_Monthly_N  Type
1     3000                    30             A
2     3000                    6              B
3     30                      6              B
4     0                       0              C
5     3000                    0.6            A
6     3000                    6              C
7     1500                    30             A

Explanation:

Avg_Monthly_Amount = average amount per month, i.e. Total_amount / Age_in_days * 30
Avg_Monthly_N = average N per month, i.e. Total_N / Age_in_days * 30

To prepare df1, I tried the code below:

df['Avg_Monthly_Amount'] = df['Total_amount'] / df['Age_in_days'] * 30
df['Avg_Monthly_N'] = df['Total_N'] / df['Age_in_days'] * 30
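
For anyone who wants to reproduce this step, here is a minimal self-contained sketch. It assumes the table above is typed in by hand as a dictionary, and that a month is treated as 30 days, as in the two lines above. The name `DAYS_PER_MONTH` is only illustrative and is not part of the original attempt.

```python
import pandas as pd

# The sample data from the question, typed in by hand.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6, 7],
    "Number_of_Cars": [2, 5, 1, 1, 3, 6, 4],
    "Age_in_days": [100, 10, 1000, 20, 1000, 100, 200],
    "Total_amount": [10000, 1000, 1000, 0, 100000, 10000, 10000],
    "Total_N": [100, 2, 200, 0, 20, 20, 200],
    "Type": ["A", "B", "B", "C", "A", "C", "A"],
})

# Assumption: a "month" is 30 days, matching the attempt above.
DAYS_PER_MONTH = 30

# Build df1 as a separate frame holding only the per-customer monthly averages.
df1 = df[["ID", "Type"]].copy()
df1["Avg_Monthly_Amount"] = df["Total_amount"] / df["Age_in_days"] * DAYS_PER_MONTH
df1["Avg_Monthly_N"] = df["Total_N"] / df["Age_in_days"] * DAYS_PER_MONTH

print(df1)
```

The printed values should agree with the df1 table shown earlier, up to floating-point formatting.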

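From df and df1 (or from df alone), I want to prepare the following data frame as df2.

I have not been able to write the correct code to generate df2 below.

Explanation:

Aggregate the above numbers at the Type level.

Example: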

There are 3 customers (ID = 1, 5, 7) with Type = A, hence for Type = A, Number_Of_Type = 3
Avg_Cars for Type = A is (2+3+4)/3 = 3
Avg_age_in_years for Type = A is ((100+1000+200)/3)/365
Avg_amount_monthly for Type = A is the mean of Avg_Monthly_Amount for Type = A in df1
Avg_N_monthly for Type = A is the mean of Avg_Monthly_N for Type = A in df1
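
The arithmetic for Type = A can be checked by hand with plain Python before involving pandas. This is only a sketch: the lists are copied from the df and df1 tables above, the 365-day year comes from the formula above, and the helper `mean` is defined here purely for illustration.

```python
# Rows with Type == "A" from the tables above (IDs 1, 5 and 7).
cars = [2, 3, 4]
age_in_days = [100, 1000, 200]
monthly_amount = [3000, 3000, 1500]  # Avg_Monthly_Amount from df1
monthly_n = [30, 0.6, 30]            # Avg_Monthly_N from df1


def mean(values):
    """Plain arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


number_of_type = len(cars)
avg_cars = mean(cars)
avg_age_in_years = mean(age_in_days) / 365  # assumption: 365 days per year
avg_amount_monthly = mean(monthly_amount)
avg_n_monthly = mean(monthly_n)

print(number_of_type, avg_cars, round(avg_age_in_years, 2),
      avg_amount_monthly, round(avg_n_monthly, 1))
```

After rounding, these values should reproduce the Type = A row of the expected df2.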

Final expected output (df2):

Type  Number_Of_Type  Avg_Cars     Avg_age_in_years   Avg_amount_monthly    Avg_N_monthly
A     3               3            1.19               2500                  20.2
B     2               3            1.38               1515                  6
C     2               3.5          0.16               1500                  3

【Question comments】:

    Tags: python-3.x pandas pandas-groupby


    【Solution 1】:

    Don't prepare a separate data frame named df1 from the original data frame df; add the new columns to df itself.

    Your data frame df:

    ID    Number_of_Cars    Age_in_days   Total_amount   Total_N     Type
    1     2                 100           10000          100         A
    2     5                 10            1000           2           B
    3     1                 1000          1000           200         B
    4     1                 20            0              0           C
    5     3                 1000          100000         20          A
    6     6                 100           10000          20          C
    7     4                 200           10000          200         A
    

    After you create/import df:

    df['Avg_Monthly_Amount'] = df['Total_amount'] / df['Age_in_days'] * 30
    df['Avg_Monthly_N'] = df['Total_N'] / df['Age_in_days'] * 30
    df['Age_in_year'] = df['Age_in_days'] / 365
    

    Then:

    df2 = (
        df.groupby('Type')
          .agg({'Type': 'count', 'Number_of_Cars': 'mean', 'Age_in_year': 'mean',
                'Avg_Monthly_Amount': 'mean', 'Avg_Monthly_N': 'mean'})
          .rename(columns={'Type': 'Number_Of_Type'})
    )
    

    Now, if you print df2 (or simply evaluate df2 in a cell if you are using Jupyter Notebook), you will get the desired output.

    Output:

          Number_Of_Type  Number_of_Cars  Age_in_year  Avg_Monthly_Amount  Avg_Monthly_N
    Type
    A                  3             3.0     1.187215              2500.0           20.2
    B                  2             3.0     1.383562              1515.0            6.0
    C                  2             3.5     0.164384              1500.0            3.0
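
As an end-to-end variant, the sketch below loads the posted table from text and uses pandas named aggregation so that the result columns carry the names from the expected df2. This is one possible way to write it, not the answer's exact code: the `raw` string, the intermediate name `per_customer`, and counting the `ID` column to get `Number_Of_Type` are all choices made here for illustration.

```python
import io

import pandas as pd

# The whitespace-separated table from the question, pasted as text.
raw = """\
ID    Number_of_Cars    Age_in_days   Total_amount   Total_N     Type
1     2                 100           10000          100         A
2     5                 10            1000           2           B
3     1                 1000          1000           200         B
4     1                 20            0              0           C
5     3                 1000          100000         20          A
6     6                 100           10000          20          C
7     4                 200           10000          200         A
"""
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

# Per-customer helper columns (30-day month, 365-day year, as in the post).
per_customer = df.assign(
    Age_in_year=df["Age_in_days"] / 365,
    Avg_Monthly_Amount=df["Total_amount"] / df["Age_in_days"] * 30,
    Avg_Monthly_N=df["Total_N"] / df["Age_in_days"] * 30,
)

# Named aggregation: output_column=(input_column, aggregation_function).
df2 = (
    per_customer.groupby("Type")
    .agg(
        Number_Of_Type=("ID", "count"),
        Avg_Cars=("Number_of_Cars", "mean"),
        Avg_age_in_years=("Age_in_year", "mean"),
        Avg_amount_monthly=("Avg_Monthly_Amount", "mean"),
        Avg_N_monthly=("Avg_Monthly_N", "mean"),
    )
    .reset_index()
)

print(df2.round(2))
```

Counting `ID` rather than `Type` avoids aggregating the grouping key itself, and `reset_index()` turns `Type` back into an ordinary column, which matches the layout of the expected df2.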
    

    【Comments】:

    • Can you share your output?
    • Sure, I have edited my answer, take a look. By the way, if this solution helped you, please feel free to accept the answer.