【Question title】: Create a new column for a dataframe based on a complicated dictionary
【Posted】: 2021-11-29 10:04:19
【Question】:

I have a complicated dictionary for remapping values. How can I implement this in Python?

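import numpy as np
import pandas as pd

cols_to_check = ["ColA", "ColB", "ColC"]
# Triple quotes are required for a string literal that spans several lines
dic_string = """{(Type 1 | Type 2) : Type dual,
       (Type 3 | Type 4) : Type many,
        ELSE: Not listed
       }"""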
df = pd.DataFrame(
{
        'ID': ['AB01', 'AB02', 'AB03', 'AB04', 'AB05','AB06','AB07','AB08'],
        'ColA': ["Type 1","Undef",np.nan,"Undef",
                 "Type 1", "","", "Undef"],
        'ColB': ["N","Type 2","","",
                 "Y", np.nan,"", "N"],
        'ColC': [np.nan,"Undef","Type 3",np.nan,"Undef",
                 "Undef", "","Type 2"]
})

I could do this if it were a simple dictionary with no ELSE entry. Suppose I can convert "dic_string" into the following:

dic = {"Type 1" : "Type dual","Type 2" : "Type dual",
   "Type 3" : "Type many", "Type 4" : "Type many",
    "ELSE": "Not listed"
   }

How do I get the final result in a new column "Result"? And how can I do this without hard-coding the contents of dic?

【Comments】:

    Tags: python pandas dataframe numpy data-manipulation


    【Solution 1】:

    Use np.select:

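    # Each key is a tuple of source values that share one target label
    dic = {('Type 1', 'Type 2'): 'Type dual',
           ('Type 3', 'Type 4'): 'Type many',}
    default = 'Not listed'  # plays the role of the ELSE entry
    
    # One boolean mask per key: True where any checked column holds one of its values
    condlist = [df[cols_to_check].isin(k).any(axis=1) for k in dic]
    choicelist = list(dic.values())
    
    # The first matching condition wins; rows with no match get the default
    df['Result'] = np.select(condlist, choicelist, default)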
    

    Output:

         ID    ColA    ColB    ColC      Result
    0  AB01  Type 1       N     NaN   Type dual
    1  AB02   Undef  Type 2   Undef   Type dual
    2  AB03     NaN          Type 3   Type many
    3  AB04   Undef             NaN  Not listed
    4  AB05  Type 1       Y   Undef   Type dual
    5  AB06             NaN   Undef  Not listed
    6  AB07                          Not listed
    7  AB08   Undef       N  Type 2   Type dual
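
The dic above is still typed in by hand. One possible way to build it from the asker's dic_string is sketched below. The helper name parse_rules is hypothetical, and the sketch assumes that entries are separated by commas, that alternatives inside the parentheses are separated by "|", and that no value or label itself contains a comma, colon, pipe, or parenthesis.

```python
import numpy as np
import pandas as pd

cols_to_check = ["ColA", "ColB", "ColC"]
dic_string = """{(Type 1 | Type 2) : Type dual,
       (Type 3 | Type 4) : Type many,
        ELSE: Not listed
       }"""

df = pd.DataFrame({
    "ID": ["AB01", "AB02", "AB03", "AB04", "AB05", "AB06", "AB07", "AB08"],
    "ColA": ["Type 1", "Undef", np.nan, "Undef", "Type 1", "", "", "Undef"],
    "ColB": ["N", "Type 2", "", "", "Y", np.nan, "", "N"],
    "ColC": [np.nan, "Undef", "Type 3", np.nan, "Undef", "Undef", "", "Type 2"],
})


def parse_rules(text):
    """Hypothetical helper: split the rule string into (mapping, default)."""
    mapping, default = {}, None
    body = text.strip().strip("{}")  # drop the outer braces
    for entry in body.split(","):
        if not entry.strip():
            continue  # tolerate a trailing comma
        key, _, label = entry.partition(":")
        key, label = key.strip(), label.strip()
        if key == "ELSE":
            default = label
        else:
            # "(Type 1 | Type 2)" -> ("Type 1", "Type 2")
            values = tuple(part.strip() for part in key.strip("()").split("|"))
            mapping[values] = label
    return mapping, default


mapping, default = parse_rules(dic_string)

# Same np.select pattern as in the answer, now driven by the parsed rules
condlist = [df[cols_to_check].isin(list(keys)).any(axis=1) for keys in mapping]
choicelist = list(mapping.values())
df["Result"] = np.select(condlist, choicelist, default=default)
```

With the sample frame, this should give the same Result column as the output shown above. If the real rule strings can contain commas or colons inside labels, a stricter parser would be needed.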
    

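    【Discussion】:

    • Is there a way to use 'dic' instead of hard-coding its contents?
    • Your dictionary is not a valid Python dict.
    • What if I convert dic to dic = {"Type 1" : "Type dual","Type 2" : "Type dual", "Type 3" : "Type many", "Type 4" : "Type many", "ELSE": "Not listed" }?
    • Can you take a look at my updated answer?
    • I've checked it. I'll try to convert dic into the format you used above.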
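
If the rules are already available as the flat dictionary with an "ELSE" key mentioned in the comments, one way to reuse the np.select approach is to group the keys by their target label first. This is only a sketch: the names flat, rules, and grouped are illustrative, and it assumes "ELSE" is reserved for the default and never appears as a real cell value.

```python
import numpy as np
import pandas as pd

cols_to_check = ["ColA", "ColB", "ColC"]
df = pd.DataFrame({
    "ID": ["AB01", "AB02", "AB03", "AB04", "AB05", "AB06", "AB07", "AB08"],
    "ColA": ["Type 1", "Undef", np.nan, "Undef", "Type 1", "", "", "Undef"],
    "ColB": ["N", "Type 2", "", "", "Y", np.nan, "", "N"],
    "ColC": [np.nan, "Undef", "Type 3", np.nan, "Undef", "Undef", "", "Type 2"],
})

flat = {"Type 1": "Type dual", "Type 2": "Type dual",
        "Type 3": "Type many", "Type 4": "Type many",
        "ELSE": "Not listed"}

rules = dict(flat)            # work on a copy so flat stays untouched
default = rules.pop("ELSE")   # assumption: "ELSE" marks the fallback label

# Group the source values by target label, keeping insertion order
grouped = {}
for value, label in rules.items():
    grouped.setdefault(label, []).append(value)
# grouped -> {"Type dual": ["Type 1", "Type 2"], "Type many": ["Type 3", "Type 4"]}

# One mask per label; earlier labels take priority when a row matches several
condlist = [df[cols_to_check].isin(values).any(axis=1) for values in grouped.values()]
choicelist = list(grouped)
df["Result"] = np.select(condlist, choicelist, default=default)
```

Grouping by label keeps the behaviour of the tuple-keyed version: a row is labelled by the first rule that matches in any checked column, and everything else falls back to the "ELSE" label.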