【Question title】: pandas grouping based on list of unique counts
【Posted】: 2022-08-13 23:38:55
【Question description】:

I have the data shown below (cars_dict), and I would like to use pandas to display the following output:

           MakeWheel  UpdateWheel  MakeGlass  UpdateGlass  MakeChair  UpdateChair  ...
Toyota             1            1          1            1          0            0
Mercedes           2            0          0            0          0            0
Hyndai             0            0          0            0          8            4
Jeep               0            0          0            0          2            2
...

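The grouping is based on whether the keys match, for example UpdateChair or MakeWheel. For Mercedes, both entries share the key MakeWheel, so we merge them and count the items in both lists, which gives 2. Repeated items are counted as well: for Hyndai's MakeChair, right and left appear in every list, but we still count all of them, so we get 8.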
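The counting rule described above can be sketched in plain Python on a small excerpt of the data. This is only an illustration: the names `hyndai_entries` and `count_items` are hypothetical, and the excerpt is shorter than the real Hyndai entry.

```python
from collections import Counter

# Shortened excerpt for one car: a list of single-key dicts (hypothetical name)
hyndai_entries = [
    {"MakeChair": ["right", "left"]},
    {"MakeChair": ["right", "left"]},
    {"UpdateChair": ["right", "left"]},
]


def count_items(entries):
    """Sum the list lengths per key, counting repeated items too."""
    counts = Counter()
    for entry in entries:
        for key, items in entry.items():
            counts[key] += len(items)
    return dict(counts)


print(count_items(hyndai_entries))
# {'MakeChair': 4, 'UpdateChair': 2}
```

Entries that share a key are merged by adding their list lengths, which is why duplicates such as right and left are not collapsed.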

The two main keywords (Make, Update) should be displayed next to each other.

cars_dict

{
    "Toyota": [
        {
            "MakeWheel": [
                "left-wheel"
            ]
        },
        {
            "UpdateWheel": [
                "right-wheel"
            ]
        },
        {
            "MakeGlass": [
                "right-wheel"
            ]
        },
        {
            "UpdateGlass": [
                "right-wheel"
            ]
        }
    ],
    "Mercedes": [
        {
            "MakeWheel": [
                "left-and-right"
            ]
        },
        {
            "MakeWheel": [
                "only-right"
            ]
        }
    ],
    "Hyndai": [
        {
            "MakeChair": [
                "right",
                "left"
            ]
        },
        {
            "MakeChair": [
                "right",
                "left"
            ]
        },
        {
            "MakeChair": [
                "right",
                "left"
            ]
        },
        {
            "MakeChair": [
                "right",
                "left"
            ]
        },
        {
            "UpdateChair": [
                "right",
                "left"
            ]
        },
        {
            "UpdateChair": [
                "right",
                "left"
            ]
        }
    ],
    "Jeep": [
        {
            "MakeChair": [
                "left-and-right",
                "back-only"
            ]
        },
        {
            "UpdateChair": [
                "right-and-left",
                "left"
            ]
        }
    ]
}

For some reason, I am getting the wrong output.

Code:

import pandas as pd

# Display options only need to be set once, outside the loop
pd.set_option('display.max_seq_items', None)
pd.set_option('display.max_colwidth', 500)
pd.set_option('expand_frame_repr', True)
pd.options.display.float_format = '{:,.0f}'.format

# Build one {car: {key: count}} record per entry in cars_dict
r_list = []
for car_k, car_v in cars_dict.items():
    for i in car_v:
        r = {k: len(v) for k, v in i.items()}
        r_list.append({car_k: r})

# Turn each record into a small DataFrame and stack them
pd_list = []
for r in r_list:
    df = pd.DataFrame.from_dict(r)
    pd_list.append(df)
df = pd.concat(pd_list, axis=0)
output = df.transpose().fillna(0)
print(output)

    Tags: python-3.x pandas data-structures


    【Solution 1】:

    Using the initial dictionary you provided, here is one way to do it:

    import pandas as pd

    # Load the data: one row per car, one column per list position
    df = pd.DataFrame.from_dict(cars_dict, orient="index")
    
    # Expand each dict into a Series and stack the results row-wise
    df = pd.concat(
        [df[col].apply(lambda x: pd.Series(x, dtype="object")) for col in df.columns]
    ).dropna(how="all")
    
    # Explode the list values so that each item gets its own row
    for col in df.columns:
        df = df.explode(column=col)
    
    # Count values and cleanup
    df = (
        df.groupby(df.index)
        .count()
        .reindex(
            index=["Toyota", "Mercedes", "Hyndai", "Jeep"],
            columns=[
                "MakeWheel",
                "UpdateWheel",
                "MakeGlass",
                "UpdateGlass",
                "MakeChair",
                "UpdateChair",
            ],
        )
    )
    
    print(df)
    # Output
              MakeWheel  UpdateWheel  MakeGlass  UpdateGlass  MakeChair  UpdateChair
    Toyota            1            1          1            1          0            0
    Mercedes          2            0          0            0          0            0
    Hyndai            0            0          0            0          8            4
    Jeep              0            0          0            0          2            2
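
As a possible alternative, the counts can be accumulated in plain Python first and only then handed to pandas. The sketch below is not part of the answer above: the helper name `count_table` is hypothetical, and the `cars_dict` shown is a shortened stand-in for the one in the question.

```python
from collections import Counter

import pandas as pd

# Shortened stand-in for the cars_dict from the question
cars_dict = {
    "Mercedes": [
        {"MakeWheel": ["left-and-right"]},
        {"MakeWheel": ["only-right"]},
    ],
    "Jeep": [
        {"MakeChair": ["left-and-right", "back-only"]},
        {"UpdateChair": ["right-and-left", "left"]},
    ],
}


def count_table(data):
    """Return a car-by-key table holding the total number of listed items."""
    rows = {}
    for car, entries in data.items():
        counts = Counter()
        for entry in entries:
            for key, items in entry.items():
                counts[key] += len(items)  # entries with the same key are merged
        rows[car] = dict(counts)
    # Keys that a car never uses become NaN, so fill them with 0
    return pd.DataFrame.from_dict(rows, orient="index").fillna(0).astype(int)


table = count_table(cars_dict)
print(table)
```

If a fixed row or column order is needed, the same `reindex` call used in the answer can be applied to `table`.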
    

    【Discussion】:
