Answers: pandas grouping based on list of unique counts

[Question title]: pandas grouping based on list of unique counts
[Posted]: 2022-08-13 23:38:55
[Question description]:

I have the data shown further down, and I want to use pandas to display the following output:

           MakeWheel  UpdateWheel  MakeGlass  UpdateGlass  MakeChair  UpdateChair  ...
Toyota             1            1          1            1          0            0
Mercedes           2            0          0            0          0            0
Hyndai             0            0          0            0          8            4
Jeep               0            0          0            0          2            2
...

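The grouping is based on whether the keys match, e.g. UpdateChair or MakeWheel. For Mercedes, the two entries are grouped because they share the key MakeWheel, so we merge them and count the items in both lists, which gives 2. Identical items are counted as well: for MakeChair under Hyndai, right and left appear in every list, but we count all of them, so we get 8.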
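
The counting rule described above can be sketched in plain Python before involving pandas. This is only an illustration: the helper name `count_items` and the shortened `sample` dict are assumptions made for the example, not part of the original post.

```python
from collections import Counter

# Shortened excerpt of the data from the question (illustrative only).
sample = {
    "Mercedes": [
        {"MakeWheel": ["left-and-right"]},
        {"MakeWheel": ["only-right"]},
    ],
    "Hyndai": [
        {"MakeChair": ["right", "left"]},
        {"MakeChair": ["right", "left"]},
        {"UpdateChair": ["right", "left"]},
    ],
}


def count_items(entries):
    """Sum the list lengths per key; repeated items are counted every time."""
    totals = Counter()
    for entry in entries:
        for key, items in entry.items():
            totals[key] += len(items)
    return dict(totals)


counts = {car: count_items(entries) for car, entries in sample.items()}
print(counts)
# {'Mercedes': {'MakeWheel': 2}, 'Hyndai': {'MakeChair': 4, 'UpdateChair': 2}}
```

With the full data, Hyndai has four MakeChair lists of two items each, which is where the 8 comes from.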

The two main keywords (Make, Update) should be displayed next to each other.
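
One possible way to get the Make and Update columns side by side without hardcoding them is to derive the column order from the key names. The sketch below assumes every key is a prefix (Make or Update) followed by an object name; the helper `paired_columns` is hypothetical and not from the original post.

```python
def paired_columns(columns, prefixes=("Make", "Update")):
    """Order column names so each object's Make/Update pair is adjacent.

    Names that do not start with one of the prefixes are left out.
    """
    objects = []  # object names in order of first appearance, e.g. "Wheel"
    for col in columns:
        for prefix in prefixes:
            if col.startswith(prefix):
                name = col[len(prefix):]
                if name not in objects:
                    objects.append(name)
                break
    present = set(columns)
    return [p + o for o in objects for p in prefixes if p + o in present]


ordered = paired_columns(
    ["MakeWheel", "MakeChair", "UpdateChair", "UpdateWheel", "MakeGlass"]
)
print(ordered)
# ['MakeWheel', 'UpdateWheel', 'MakeChair', 'UpdateChair', 'MakeGlass']
```

The resulting list could then be passed to `DataFrame.reindex(columns=...)` on the final table.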

cars_dict is:

{
    "Toyota": [
        {
            "MakeWheel": [
                "left-wheel"
            ]
        },
        {
            "UpdateWheel": [
                "right-wheel"
            ]
        },
        {
            "MakeGlass": [
                "right-wheel"
            ]
        },
        {
            "UpdateGlass": [
                "right-wheel"
            ]
        }
    ],
    "Mercedes": [
        {
            "MakeWheel": [
                "left-and-right"
            ]
        },
        {
            "MakeWheel": [
                "only-right"
            ]
        }
    ],
    "Hyndai": [
        {
            "MakeChair": [
                "right",
                "left"
            ]
        },
        {
            "MakeChair": [
                "right",
                "left"
            ]
        },
        {
            "MakeChair": [
                "right",
                "left"
            ]
        },
        {
            "MakeChair": [
                "right",
                "left"
            ]
        },
        {
            "UpdateChair": [
                "right",
                "left"
            ]
        },
        {
            "UpdateChair": [
                "right",
                "left"
            ]
        }
    ],
    "Jeep": [
        {
            "MakeChair": [
                "left-and-right",
                "back-only"
            ]
        },
        {
            "UpdateChair": [
                "right-and-left",
                "left"
            ]
        }
    ]
}

For some reason, I am getting the wrong output.

Code:

import pandas as pd

# Display options (set once rather than on every loop iteration)
pd.set_option('display.max_seq_items', None)
pd.set_option('display.max_colwidth', 500)
pd.set_option('expand_frame_repr', True)
pd.options.display.float_format = '{:,.0f}'.format

# Build one {car: {key: count}} dict per entry in each car's list
r_list = []
for car_k, car_v in cars_dict.items():
    for i in car_v:
        r = {k: len(v) for k, v in i.items()}
        r_list.append({car_k: r})

# One small DataFrame per entry, stacked and then transposed
pd_list = []
for r in r_list:
    df = pd.DataFrame.from_dict(r)
    pd_list.append(df)
df = pd.concat(pd_list, axis=0)
output = df.transpose().fillna(0)
print(output)

Tags: python-3.x pandas data-structures

[Solution 1]:

Using the initial dictionary you provided, here is one way to do it:

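import pandas as pd

# Load the dict: one row per car, one column per position in its list of dicts
df = pd.DataFrame.from_dict(cars_dict, orient="index")

# Turn each dict into a Series (its keys become columns) and stack the results
df = pd.concat(
    [df[col].apply(lambda x: pd.Series(x, dtype="object")) for col in df.columns]
).dropna(how="all")

# Explode the list values so that each item gets its own row
for col in df.columns:
    df = df.explode(column=col)

# Count items per car, then fix the row and column order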
df = (
    df.groupby(df.index)
    .count()
    .reindex(
        index=["Toyota", "Mercedes", "Hyndai", "Jeep"],
        columns=[
            "MakeWheel",
            "UpdateWheel",
            "MakeGlass",
            "UpdateGlass",
            "MakeChair",
            "UpdateChair",
        ],
    )
)

print(df)
# Output
          MakeWheel  UpdateWheel  MakeGlass  UpdateGlass  MakeChair  UpdateChair
Toyota            1            1          1            1          0            0
Mercedes          2            0          0            0          0            0
Hyndai            0            0          0            0          8            4
Jeep              0            0          0            0          2            2
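
As a possible alternative to the approach above (a sketch, not part of the original answer), the nested dict can first be flattened into long-format records and then summed with `pivot_table`. The column names `car`, `key` and `n` are arbitrary choices for this example, and `cars_dict` is shortened here to keep the block self-contained.

```python
import pandas as pd

# Shortened excerpt of cars_dict from the question (illustrative only).
cars_dict = {
    "Toyota": [
        {"MakeWheel": ["left-wheel"]},
        {"UpdateWheel": ["right-wheel"]},
    ],
    "Mercedes": [
        {"MakeWheel": ["left-and-right"]},
        {"MakeWheel": ["only-right"]},
    ],
    "Jeep": [
        {"MakeChair": ["left-and-right", "back-only"]},
        {"UpdateChair": ["right-and-left", "left"]},
    ],
}

# One record per (car, key) occurrence, holding the length of its list.
records = [
    {"car": car, "key": key, "n": len(items)}
    for car, entries in cars_dict.items()
    for entry in entries
    for key, items in entry.items()
]

# Sum the lengths per car and key; missing combinations are filled with 0.
table = pd.DataFrame(records).pivot_table(
    index="car", columns="key", values="n", aggfunc="sum", fill_value=0
)
print(table)
```

`pivot_table` sorts the row and column labels, so a final `reindex` (as in the answer above) is still needed if a specific order is wanted, for example `table.reindex(index=list(cars_dict))` to restore the original car order.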

[Discussion]: