【Question title】: Pandas - Fill a dictionary with dataframes depending on a switch
【Posted】: 2022-01-02 06:35:44
【Question description】:

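Background: I have several dataframes that can each be turned on or off with a switch. I want to fill a dictionary with every dataframe whose switch is on, and then loop over those dataframes.

Problem: I don't know how to build the dictionary dynamically so that it only contains a dataframe when its switch is on.

My attempt: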

import pandas as pd

sw_a = True
sw_b = False
sw_c = True

a = pd.DataFrame({'IDs':[1234,5346,1234,8793,8793],
                   'Cost':[1.1,1.2,1.3,1.4,1.5],
                    'Names':['APPLE','Orange','STRAWBERRY','Grape','Blue']}) if sw_a == True else []
b = pd.DataFrame({'IDs':[1,2],
                   'Cost':[1.1,1.2],
                    'Names':['APPLE1','Blue1']}) if sw_b == True else []
c = pd.DataFrame({'IDs':[12],
                  'Cost':[1.5],
                    'Names':['APPLE2']}) if sw_c == True else []
total = {"first":a,"second":b,"third":c}

for df in total:
    temp_cost = sum(total[df]['Cost'])
    print(f'The number of fruits for {df} is {len(total[df])} and the cost is {temp_cost}')

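The approach above does not work because the dictionary always contains every entry: when a switch is off, the value is an empty list rather than the entry being excluded entirely, so total[df]['Cost'] fails inside the loop.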

【Question discussion】:

    Tags: python pandas dataframe dictionary for-loop


    【Solution 1】:

    My setup is similar to yours, but I don't bother with the switches when assigning each dataframe:

    import pandas as pd
    
    sw_a = True
    sw_b = False
    sw_c = True
    
    a = pd.DataFrame({'IDs':[1234,5346,1234,8793,8793],
                       'Cost':[1.1,1.2,1.3,1.4,1.5],
                        'Names':['APPLE','Orange','STRAWBERRY','Grape','Blue']})
    b = pd.DataFrame({'IDs':[1,2],
                       'Cost':[1.1,1.2],
                        'Names':['APPLE1','Blue1']})
    c = pd.DataFrame({'IDs':[12],
                      'Cost':[1.5],
                        'Names':['APPLE2']})
    
    total = {"first":a,"second":b,"third":c} # don't worry about the switches yet.
    

    Only now do we filter:

    list_switches = [sw_a, sw_b, sw_c] # the switches! finally!
    total_filtered = {tup[1]:total[tup[1]] for tup in zip(list_switches, total) if tup[0]}
    

    Then carry on as you did before.

    for df in total_filtered:
        # index the filtered dict, not the unfiltered one
        temp_cost = sum(total_filtered[df]['Cost'])
        print(f'The number of fruits for {df} is {len(total_filtered[df])} and the cost is {temp_cost}')
    

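    Output:

    The number of fruits for first is 5 and the cost is 6.5
    The number of fruits for third is 1 and the cost is 1.5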

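    Edit: You can get a little fancier with the zip function. For example, if you are building the lists of dataframes, dataframe names, and switches dynamically, and can guarantee that they always have the same length, you could do something like this: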

    # pretend these three lists are coming from somewhere else and can have variable length, rather than being hard-coded.
    list_dfs = [a,b,c]
    list_switches = [sw_a, sw_b, sw_c]
    list_names = ["first", "second", "third"]
    
    # use a zip object over the three lists.
    zipped = zip(list_dfs, list_switches, list_names)
    total = {tup[2] : tup[0] for tup in zipped if tup[1]}
    
    for df in total:
        temp_cost = sum(total[df]['Cost'])
        print(f'The number of fruits for {df} is {len(total[df])} and the cost is {temp_cost}')
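
    The "same length" condition matters because zip stops silently at the shortest input, so a missing switch or name would quietly drop a dataframe. Below is a minimal sketch of one way to guard against that. The helper name build_enabled is hypothetical, and plain strings stand in for the dataframes because the filtering does not depend on the value type.

```python
def build_enabled(names, frames, switches):
    """Return {name: frame} for every position whose switch is truthy.

    Hypothetical helper for illustration; not part of the original answer.
    """
    # zip() stops at the shortest input, so check the lengths up front.
    if not (len(names) == len(frames) == len(switches)):
        raise ValueError(
            f"length mismatch: {len(names)} names, "
            f"{len(frames)} frames, {len(switches)} switches"
        )
    return {
        name: frame
        for name, frame, switch in zip(names, frames, switches)
        if switch
    }


# Stand-in strings are used in place of DataFrames here.
enabled = build_enabled(
    ["first", "second", "third"],
    ["frame_a", "frame_b", "frame_c"],
    [True, False, True],
)
print(enabled)  # {'first': 'frame_a', 'third': 'frame_c'}
```

    On Python 3.10 or newer, passing strict=True to zip raises a ValueError on mismatched lengths, which would make the explicit check unnecessary.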
    

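    【Discussion】:

    • This works great, but I'm not sure how this line works... total = {tup[2] : tup[0] for tup in zipped if tup[1]}
    • @JonathanHay - It is a dict comprehension over a zip object. Are you familiar with those concepts? Thanks for the upvote and the accept, by the way.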
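
    For readers with the same question, the comprehension can be unrolled into an ordinary loop. This is only an illustrative sketch: strings stand in for the dataframes, and the name total_loop is made up for the comparison.

```python
# Stand-in values instead of real DataFrames, so the mechanics are easy to follow.
list_dfs = ["frame_a", "frame_b", "frame_c"]
list_switches = [True, False, True]
list_names = ["first", "second", "third"]

# The comprehension from the answer: each tup is (dataframe, switch, name).
total = {tup[2]: tup[0] for tup in zip(list_dfs, list_switches, list_names) if tup[1]}

# The same logic as an explicit loop, with the tuple unpacked into named parts.
total_loop = {}
for frame, switch, name in zip(list_dfs, list_switches, list_names):
    if switch:  # keep the entry only when its switch is on
        total_loop[name] = frame

print(total_loop)  # {'first': 'frame_a', 'third': 'frame_c'}
```

    zip yields one tuple per position across the three lists, and the trailing if clause drops the tuples whose switch is False.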
    【Solution 2】:

    Consider something like this.

    import pandas as pd

    sw_a = True
    sw_b = False
    sw_c = True
    
    a = pd.DataFrame({'IDs':[1234,5346,1234,8793,8793],
                       'Cost':[1.1,1.2,1.3,1.4,1.5],
                        'Names':['APPLE','Orange','STRAWBERRY','Grape','Blue']})
    b = pd.DataFrame({'IDs':[1,2],
                       'Cost':[1.1,1.2],
                        'Names':['APPLE1','Blue1']})
    c = pd.DataFrame({'IDs':[12],
                      'Cost':[1.5],
                        'Names':['APPLE2']})
    
    total = {}
    if sw_a:
        total['sw_a'] = a
    if sw_b:
        total['sw_b'] = b
    if sw_c:
        total['sw_c'] = c
    print(total)
    
    for df in total:
        temp_cost = sum(total[df]['Cost'])
        print(f'The number of fruits for {df} is {len(total[df])} and the cost is {temp_cost}')
    
    The number of fruits for sw_a is 5 and the cost is 6.5
    The number of fruits for sw_c is 1 and the cost is 1.5
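
    If the number of dataframes grows, the chain of if statements can be replaced by a single mapping that keeps each switch next to its dataframe. The sketch below assumes that layout; the name candidates is hypothetical, and Series.sum() is used in place of the built-in sum.

```python
import pandas as pd

sw_a, sw_b, sw_c = True, False, True

a = pd.DataFrame({'IDs': [1234, 5346, 1234, 8793, 8793],
                  'Cost': [1.1, 1.2, 1.3, 1.4, 1.5],
                  'Names': ['APPLE', 'Orange', 'STRAWBERRY', 'Grape', 'Blue']})
b = pd.DataFrame({'IDs': [1, 2],
                  'Cost': [1.1, 1.2],
                  'Names': ['APPLE1', 'Blue1']})
c = pd.DataFrame({'IDs': [12],
                  'Cost': [1.5],
                  'Names': ['APPLE2']})

# Assumed layout: each key maps to a (switch, dataframe) pair.
candidates = {'sw_a': (sw_a, a), 'sw_b': (sw_b, b), 'sw_c': (sw_c, c)}

# Keep only the dataframes whose switch is on.
total = {name: df for name, (switch, df) in candidates.items() if switch}

for name, df in total.items():
    print(f"The number of fruits for {name} is {len(df)} and the cost is {df['Cost'].sum()}")
```

    Adding another dataframe then means adding one entry to candidates rather than another if block.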
    

    【Discussion】:
