【Question title】: How to create multiple dataframes from an existing dataframe based on a condition in Python
【Posted】: 2023-03-21 13:50:03
【Question】:

I have a dataframe like the one shown below. I want to split it into multiple dataframes, one per value of the ID column.

df = pd.DataFrame(results)
print(df)

The output is:

       ID  NAME    COLOR
    0  01   ABC      RED                               
    1  01   ABC      ORANGE                  
    2  01   ABC      WHITE   
    3  02   DEF      RED
    4  02   DEF      PURPLE
    5  02   DEF      GREEN
    6  02   DEF      ORANGE
    7  02   DEF      BLACK
    8  03   GHI      RED
    9  03   GHI      BLACK
   10  03   GHI      GREEN
   11  03   GHI      ORANGE
   12  04   JKL      RED

The resulting dataframes should look like the ones below. I can't work out how to write this in Python, please help.

       ID  NAME    COLOR
    0  01   ABC      RED
    1  01   ABC      ORANGE
    2  01   ABC      WHITE

       ID  NAME    COLOR
    0  02   DEF      RED
    1  02   DEF      PURPLE
    2  02   DEF      GREEN
    3  02   DEF      ORANGE
    4  02   DEF      BLACK

       ID  NAME    COLOR
    0  03   GHI      RED
    1  03   GHI      BLACK
    2  03   GHI      GREEN
    3  03   GHI      ORANGE

       ID  NAME    COLOR
    0  04   JKL      RED

【Question discussion】:

    Tags: python pandas


    【Solution 1】:

    You can do it like this:

    data_dict = {'df' + str(i): grp for i, grp in df.groupby('ID')}
    

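    This gives the following dictionary (ID holds integers here, so the keys come out as df1, df2, and so on):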

    {'df1':    ID NAME   COLOR
     0   1  ABC     RED
     1   1  ABC  ORANGE
     2   1  ABC   WHITE, 'df2':    ID NAME   COLOR
     3   2  DEF     RED
     4   2  DEF  PURPLE
     5   2  DEF   GREEN
     6   2  DEF  ORANGE
     7   2  DEF   BLACK, 'df3':     ID NAME   COLOR
     8    3  GHI     RED
     9    3  GHI   BLACK
     10   3  GHI   GREEN
     11   3  GHI  ORANGE, 'df4':     ID NAME COLOR
     12   4  JKL   RED}
    

    Now you can access the group for each ID simply by looking up its key:

    print(data_dict['df2'])
    
       ID NAME   COLOR
    3   2  DEF     RED
    4   2  DEF  PURPLE
    5   2  DEF   GREEN
    6   2  DEF  ORANGE
    7   2  DEF   BLACK
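
The output above keeps the original row labels (3 to 7 for the second group), while the question asks for each dataframe to start at 0, and the question's IDs look like strings with a leading zero. A minimal sketch under those assumptions follows; the sample data is rebuilt by hand from the question, and storing ID as strings is an assumption rather than something the post confirms.

```python
import pandas as pd

# Sample data mirroring the question; IDs are kept as strings (assumption)
# so that the leading zero survives.
df = pd.DataFrame({
    "ID": ["01"] * 3 + ["02"] * 5 + ["03"] * 4 + ["04"],
    "NAME": ["ABC"] * 3 + ["DEF"] * 5 + ["GHI"] * 4 + ["JKL"],
    "COLOR": ["RED", "ORANGE", "WHITE",
              "RED", "PURPLE", "GREEN", "ORANGE", "BLACK",
              "RED", "BLACK", "GREEN", "ORANGE",
              "RED"],
})

# One dataframe per ID; reset_index(drop=True) gives each one a fresh 0-based index.
data_dict = {
    "df" + str(key): group.reset_index(drop=True)
    for key, group in df.groupby("ID")
}

print(list(data_dict))
print(data_dict["df02"])
```

With string IDs the keys become df01, df02, and so on, because the key is built from the group label itself.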
    

    【Discussion】:

      【Solution 2】:

      You have to filter on the NAME column:

      df_DEF = df[df.NAME == "DEF"]
      df_GHI = df[df.NAME == "GHI"]
      

      Sorry for the hard-coded solution. Here is another one:

      import numpy as np 
      import pandas as pd 
      
      
      d = {'NAME': ["ABC", "ABC","ABC","GHI","GHI"], 'VALUE': [3, 4,5,6,7]}
      df = pd.DataFrame(data=d)
      
      # Get all unique names
      cat = np.unique(df.NAME)
      
      # create empty list of dataframes 
      listOfDf = []
      
      # for each unique name, create df_i with df filter by name, and append the list 
      for i in cat:
          df_i = df[df.NAME == i].reset_index(drop = True)
          listOfDf.append(df_i)
      
      # now you have a list of dataframes and can work with each element of the list
      # as a dataframe
      
      print(listOfDf)
      
      [  NAME  VALUE
      0  ABC      3
      1  ABC      4
      2  ABC      5,   NAME  VALUE
      0  GHI      6
      1  GHI      7]
      
      
      for df_i in listOfDf:
          print(df_i)
          print("------")
      
        NAME  VALUE
      0  ABC      3
      1  ABC      4
      2  ABC      5
      ------
        NAME  VALUE
      0  GHI      6
      1  GHI      7
      ------
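
A list loses track of which name each dataframe belongs to. As a rough sketch of the same filtering idea, the frames can be stored in a dictionary keyed by NAME instead; `frames_by_name` is a hypothetical name introduced here, not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({
    "NAME": ["ABC", "ABC", "ABC", "GHI", "GHI"],
    "VALUE": [3, 4, 5, 6, 7],
})

# frames_by_name is a hypothetical name; the keys are the NAME values themselves.
frames_by_name = {
    name: df[df["NAME"] == name].reset_index(drop=True)
    for name in df["NAME"].unique()  # unique() keeps the order of first appearance
}

print(frames_by_name["GHI"])
```

Looking up `frames_by_name["GHI"]` then replaces the hard-coded `df_GHI` variable without needing one variable per name.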
      

      【Discussion】:

        【Solution 3】:

        You can try this:

        import pandas as pd
        data= {'ID':[1,1,1,2,2,2,3,3,3,4], 'NAME':['ABC','ABC','ABC','DEF','DEF','DEF','GHI','GHI','GHI','JKL']}  
        df = pd.DataFrame(data=data)
        

        Solution 1:

            myList = []
            for key, df_id in df.groupby('ID'):
                print(df_id)
                myList.append(df_id)

            Result:
               ID NAME
            0   1  ABC
            1   1  ABC
            2   1  ABC
               ID NAME
            3   2  DEF
            4   2  DEF
            5   2  DEF
               ID NAME
            6   3  GHI
            7   3  GHI
            8   3  GHI
               ID NAME
            9   4  JKL
        

        You can then access the individual dataframes by position, for example myList[2]:

           ID   NAME
        6   3   GHI
        7   3   GHI
        8   3   GHI
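
Positional access such as myList[2] depends on the order in which groupby yields the groups. By default pandas sorts the group keys; passing sort=False keeps the order of first appearance instead. A small sketch, using made-up out-of-order IDs to make the difference visible:

```python
import pandas as pd

# IDs are deliberately out of order (illustrative data, not from the question).
df = pd.DataFrame({
    "ID": [3, 3, 1, 2, 2],
    "NAME": ["GHI", "GHI", "ABC", "DEF", "DEF"],
})

# Default behaviour: groups come back sorted by key.
sorted_keys = [key for key, _ in df.groupby("ID")]

# sort=False: groups come back in order of first appearance.
appearance_keys = [key for key, _ in df.groupby("ID", sort=False)]

print(sorted_keys)
print(appearance_keys)
```

If the position in the list matters, it is probably safer to keep the keys alongside the frames (or use a dictionary) than to rely on the ordering.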
        

        Solution 2:

        {k: v for k, v in df.groupby('ID')}
        
            Result:
            {1:    ID NAME
             0   1  ABC
             1   1  ABC
             2   1  ABC, 2:    ID NAME
             3   2  DEF
             4   2  DEF
             5   2  DEF, 3:    ID NAME
             6   3  GHI
             7   3  GHI
             8   3  GHI, 4:    ID NAME
             9   4  JKL}
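
If only one of the groups is needed, building the whole dictionary may be unnecessary. As a hedged sketch, `get_group` on the groupby object returns the rows for a single key; `df_3` is a hypothetical variable name used for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 2, 3, 3, 3, 4],
    "NAME": ["ABC"] * 3 + ["DEF"] * 3 + ["GHI"] * 3 + ["JKL"],
})

grouped = df.groupby("ID")

# Fetch only the rows whose ID is 3; the original row labels are kept.
df_3 = grouped.get_group(3)

print(df_3)
```

Chaining `.reset_index(drop=True)` onto the result would renumber the rows from 0, as in the question's expected output.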
        

        【Discussion】:
