【Question Title】:Pandas: Split df into multiple dfs based on multiple columns
【Posted】:2021-10-30 01:24:09
【Question】:

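I have a df that I want to split into multiple dfs based on the values in the "Name" and "Plan" columns. For the df below, I expect to end up with 6 dfs, where rows 1 and 6 land in the same df.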

df:

City    State       Name     Plan   Price
 A        CA     Star Inn     CTS    50
 B        CA      1 Inn       KVG    100
 C        IN     GS Hotel     KHA    25
 D        FL     HJ Resort    2QN    45
 E        AL     PQ Inn       POI    55
 A        CA     Star Inn     CTS    80
 A        CA     Star Inn     MNB    65

Desired output:

df1:

City    State       Name     Plan   Price
 A        CA     Star Inn     CTS    50
 A        CA     Star Inn     CTS    80

df2:

City    State       Name     Plan   Price
 B        CA      1 Inn       KVG    100

And so on, up to df6...

【Comments】:

    Tags: python pandas dataframe


    【Solution 1】:

    This example splits the DataFrame by the Name and Plan columns and prints each resulting piece:

    dataframes = []
    for _, d in df.groupby(["Name", "Plan"]):
        dataframes.append(d)
    
    # print it:
    for d in dataframes:
        print(d)
        print("-" * 80)
    

    Prints:

      City State   Name Plan  Price
    1    B    CA  1_Inn  KVG    100
    --------------------------------------------------------------------------------
      City State      Name Plan  Price
    2    C    IN  GS_Hotel  KHA     25
    --------------------------------------------------------------------------------
      City State       Name Plan  Price
    3    D    FL  HJ_Resort  2QN     45
    --------------------------------------------------------------------------------
      City State    Name Plan  Price
    4    E    AL  PQ_Inn  POI     55
    --------------------------------------------------------------------------------
      City State      Name Plan  Price
    0    A    CA  Star_Inn  CTS     50
    5    A    CA  Star_Inn  CTS     80
    --------------------------------------------------------------------------------
      City State      Name Plan  Price
    6    A    CA  Star_Inn  MNB     65
    --------------------------------------------------------------------------------
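For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the question's table is rebuilt by hand as a DataFrame (the construction below is an illustration, not part of the original answer) and collects the groups with a list comprehension instead of an explicit loop:

```python
import pandas as pd

# Sample data copied from the question (names contain spaces, as posted there).
df = pd.DataFrame(
    {
        "City": ["A", "B", "C", "D", "E", "A", "A"],
        "State": ["CA", "CA", "IN", "FL", "AL", "CA", "CA"],
        "Name": ["Star Inn", "1 Inn", "GS Hotel", "HJ Resort",
                 "PQ Inn", "Star Inn", "Star Inn"],
        "Plan": ["CTS", "KVG", "KHA", "2QN", "POI", "CTS", "MNB"],
        "Price": [50, 100, 25, 45, 55, 80, 65],
    }
)

# One DataFrame per distinct (Name, Plan) pair; groupby sorts the keys by default.
dataframes = [group for _, group in df.groupby(["Name", "Plan"])]

print(len(dataframes))
```

Each piece keeps its original row labels; call `reset_index(drop=True)` on a piece if a fresh 0-based index is preferred.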
    

    【Comments】:

      【Solution 2】:

      Calling groupby in pandas gives you a GroupBy object (here, a DataFrameGroupBy):

      grouped = df.groupby(["Name","Plan"])
      

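      When you iterate over it, each item is a tuple: the first element is the group key (here, that group's values for "Name" and "Plan", such as ("Star Inn", "CTS")) and the second element is the split df for that group: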

      grouped = df.groupby(["Name","Plan"])
      for _, split_df in grouped:
          print(split_df)
          print("-----")
      

      This gives you:

        City State   Name Plan  Price
      1    B    CA  1 Inn  KVG    100
      -----
        City State      Name Plan  Price
      2    C    IN  GS Hotel  KHA     25
      -----
        City State       Name Plan  Price
      3    D    FL  HJ Resort  2QN     45
      -----
        City State    Name Plan  Price
      4    E    AL  PQ Inn  POI     55
      -----
        City State      Name Plan  Price
      0    A    CA  Star Inn  CTS     50
      5    A    CA  Star Inn  CTS     80
      -----
        City State      Name Plan  Price
      6    A    CA  Star Inn  MNB     65
      -----
      

      【Comments】:

      • Is there a way to store these dfs separately?
      • Yes. You can append them (split_df) to a list inside the for loop, or assign them to separate variables as needed.
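As one possible way to store the pieces separately, the sketch below keeps them in a dictionary keyed by the (Name, Plan) pair, so each df can be looked up by name rather than by position. The variable name `split_dfs` and the hand-built sample DataFrame are illustrative assumptions, not part of the original answer:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "City": ["A", "B", "C", "D", "E", "A", "A"],
        "State": ["CA", "CA", "IN", "FL", "AL", "CA", "CA"],
        "Name": ["Star Inn", "1 Inn", "GS Hotel", "HJ Resort",
                 "PQ Inn", "Star Inn", "Star Inn"],
        "Plan": ["CTS", "KVG", "KHA", "2QN", "POI", "CTS", "MNB"],
        "Price": [50, 100, 25, 45, 55, 80, 65],
    }
)

# Map each (Name, Plan) key to its own DataFrame.
# reset_index(drop=True) gives every piece a fresh 0-based index (optional).
split_dfs = {
    key: group.reset_index(drop=True)
    for key, group in df.groupby(["Name", "Plan"])
}

# Look up one piece by its key.
star_cts = split_dfs[("Star Inn", "CTS")]
print(star_cts)
```

A dictionary tends to scale better than separate variables such as df1 ... df6, since the number of groups depends on the data.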